The Craft of Writing a Strong Hypothesis

Writing a hypothesis is one of the essential elements of a scientific research paper. It needs to be to the point and clearly communicate what your research is trying to accomplish. A vague, drawn-out, or convoluted hypothesis can confuse your readers or, worse, the editor and peer reviewers.
A captivating hypothesis is not too intricate. This blog will take you through the process so that, by the end of it, you have a better idea of how to convey your research paper's intent in just one sentence.
What is a Hypothesis?
A hypothesis is the first step in your scientific endeavor: a strong, concise statement that forms the basis of your research. It is not the same as a thesis statement, which is a brief summary of your research paper.
The sole purpose of a hypothesis is to predict your paper's findings, data, and conclusion. It comes from a place of curiosity and intuition. When you write a hypothesis, you're essentially making an educated guess based on prior knowledge and evidence, which is then supported or refuted through the scientific method.
The reason for undertaking research is to observe a specific phenomenon. A hypothesis, therefore, lays out what the said phenomenon is. And it does so through two variables, an independent and dependent variable.
The independent variable is the cause behind the observation, while the dependent variable is the effect of the cause. A good example of this is “mixing red and blue forms purple.” In this hypothesis, mixing red and blue is the independent variable as you're combining the two colors at your own will. The formation of purple is the dependent variable as, in this case, it is conditional to the independent variable.
Different Types of Hypotheses

Some would stand by the notion that there are only two types of hypotheses: a Null hypothesis and an Alternative hypothesis. While that has some truth to it, these terms come up so often that it is worth distinguishing the most common forms so you are not left without context.
Apart from Null and Alternative, there are Complex, Simple, Directional, Non-Directional, Statistical, Empirical, and Associative and Causal hypotheses. They don't necessarily have to be exclusive, as one hypothesis can tick many boxes, but knowing the distinctions between them will make it easier for you to construct your own.
1. Null hypothesis
A null hypothesis proposes no relationship between two variables. Denoted by H0, it is a negative statement like “Attending physiotherapy sessions does not affect athletes' on-field performance.” Here, the author claims physiotherapy sessions have no effect on on-field performance. Any difference that does show up is attributed to chance.
2. Alternative hypothesis
Considered to be the opposite of a null hypothesis, an alternative hypothesis is denoted as H1 or Ha. It explicitly states that the independent variable affects the dependent variable. A good alternative hypothesis example is “Attending physiotherapy sessions improves athletes' on-field performance.” or “Water boils at 100 °C at sea level.” The alternative hypothesis further branches into directional and non-directional.
- Directional hypothesis: A hypothesis that states whether the result will be positive or negative is called a directional hypothesis. It accompanies H1 with either the ‘<' or ‘>' sign.
- Non-directional hypothesis: A non-directional hypothesis only claims an effect on the dependent variable. It does not clarify whether the result would be positive or negative. The sign for a non-directional hypothesis is ‘≠.'
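In symbols, the null, directional, and non-directional cases can be written as follows. This is only a sketch: it assumes the hypothesis compares two group means, written here as \mu_1 and \mu_2 (for example, athletes who attend physiotherapy sessions and athletes who do not). The choice of means as the quantity being compared is an illustrative assumption, not something the definitions above require.

```latex
\begin{align*}
H_0 &: \mu_1 = \mu_2 && \text{(null: no difference)} \\
H_1 &: \mu_1 > \mu_2 \ \text{or}\ \mu_1 < \mu_2 && \text{(directional)} \\
H_1 &: \mu_1 \neq \mu_2 && \text{(non-directional)}
\end{align*}
```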
3. Simple hypothesis
A simple hypothesis is a statement that reflects the relation between exactly two variables: one independent and one dependent. Consider the example, “Smoking is a prominent cause of lung cancer.” The dependent variable, lung cancer, depends on the independent variable, smoking.
4. Complex hypothesis
In contrast to a simple hypothesis, a complex hypothesis describes the relationship between multiple independent and dependent variables. For instance, “Individuals who eat more fruits tend to have higher immunity, lower cholesterol, and higher metabolism.” The independent variable is eating more fruits, while the dependent variables are higher immunity, lower cholesterol, and higher metabolism.
5. Associative and causal hypothesis
Associative and causal hypotheses don't specify how many variables there will be. They define the relationship between the variables. In an associative hypothesis, changing any one variable, dependent or independent, affects others. In a causal hypothesis, the independent variable directly affects the dependent.
6. Empirical hypothesis
Also referred to as the working hypothesis, an empirical hypothesis claims a theory's validation via experiments and observation. This way, the statement appears justifiable and different from a wild guess.
Say the hypothesis is “Women who take iron tablets face a lower risk of anemia than those who take vitamin B12.” This is an example of an empirical hypothesis where the researcher makes the statement after assessing a group of women who take iron tablets and charting the findings.
7. Statistical hypothesis
The point of a statistical hypothesis is to test an already existing hypothesis by studying a population sample. A hypothesis like “44% of the Indian population belongs to the age group of 22-27” leverages evidence to prove or disprove a particular statement.
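To make the idea of testing a claimed proportion against a sample concrete, here is a minimal sketch of an exact two-sided binomial test using only the Python standard library. The sample figures (120 of 200 respondents) are invented for illustration, and the function names are hypothetical helpers rather than part of any particular statistics package.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_p_value(k: int, n: int, p0: float) -> float:
    """Exact two-sided p-value: the total probability, under H0, of every
    outcome that is no more likely than the observed count k."""
    observed = binomial_pmf(k, n, p0)
    tolerance = 1 + 1e-9  # guards against floating-point near-ties
    total = sum(
        prob
        for prob in (binomial_pmf(i, n, p0) for i in range(n + 1))
        if prob <= observed * tolerance
    )
    return min(1.0, total)


# Hypothetical sample: 120 of 200 surveyed people fall in the age group.
# H0: the true proportion is 0.44.
p_value = two_sided_binomial_p_value(k=120, n=200, p0=0.44)
print(f"p-value = {p_value:.6f}")
```

A small p-value here would suggest the sample is hard to reconcile with the claimed 44%, while a sample proportion close to 44% would give a p-value near 1.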
Characteristics of a Good Hypothesis
Writing a hypothesis is essential as it can make or break your research for you. That includes your chances of getting published in a journal. So when you're designing one, keep an eye out for these pointers:
- A research hypothesis has to be simple yet clear enough to look justifiable.
- It has to be testable: your research would be rendered pointless if the hypothesis is too far removed from reality or cannot be tested with available technology.
- It has to be precise about the results: what you are trying to do and achieve through it should come out in your hypothesis.
- A research hypothesis should be self-explanatory, leaving no doubt in the reader's mind.
- If you are developing a relational hypothesis, you need to include the variables and establish an appropriate relationship among them.
- A hypothesis must keep and reflect the scope for further investigations and experiments.
Separating a Hypothesis from a Prediction
Outside of academia, hypothesis and prediction are often used interchangeably. In research writing, this is not only confusing but also incorrect. And although a hypothesis and prediction are guesses at their core, there are many differences between them.
A hypothesis is an educated guess or even a testable prediction validated through research. It aims to analyze the gathered evidence and facts to define a relationship between variables and put forth a logical explanation behind the nature of events.
Predictions are assumptions or expected outcomes made without any backing evidence. They are more fictionally inclined regardless of where they originate from.
For this reason, a hypothesis holds much more weight than a prediction. It sticks to the scientific method rather than pure guesswork. "Planets revolve around the Sun." is an example of a hypothesis as it is based on previous knowledge and observed trends. Additionally, we can test it through the scientific method.
In contrast, "COVID-19 will be eradicated by 2030." is a prediction. Even though it draws on past trends, we can't prove or disprove it today. The only way to validate it is to wait and see whether COVID-19 cases end by 2030.
Finally, How to Write a Hypothesis

1. Be clear about your research question
A hypothesis should instantly address the research question or the problem statement. To do so, you need to ask a question. Understand the constraints of your undertaken research topic and then formulate a simple and topic-centric problem. Only after that can you develop a hypothesis and further test for evidence.
2. Carry out a recce
Once you have your research's foundation laid out, it would be best to conduct preliminary research. Go through previous theories, academic papers, data, and experiments before you start curating your research hypothesis. It will give you an idea of your hypothesis's viability or originality.
Making use of references from relevant research papers helps draft a good research hypothesis. SciSpace Discover offers a repository of over 270 million research papers to browse through and gain a deeper understanding of related studies on a particular topic. Additionally, you can use SciSpace Copilot, your AI research assistant, to read any lengthy research paper and get a summarized view of it. A hypothesis can be formed after evaluating many such summarized research papers. Copilot also explains theories and equations, restates a paper in simpler terms, and lets you highlight any text or clip math equations and tables to get a clearer understanding of what is being said. This can improve the hypothesis by helping you identify potential research gaps.
3. Create a 3-dimensional hypothesis
Variables are an essential part of any reasonable hypothesis. So, identify your independent and dependent variable(s) and form a correlation between them. The ideal way to do this is to write the hypothetical assumption in the ‘if-then' form. If you use this form, make sure that you state the predefined relationship between the variables.
Alternatively, you can present your hypothesis as a comparison between two variables. Here, you must specify the difference you expect to observe in the results.
4. Write the first draft
Now that everything is in place, it's time to write your hypothesis. For starters, create the first draft. In this version, write what you expect to find from your research.
Clearly separate your independent and dependent variables and the link between them. Don't fixate on syntax at this stage. The goal is to ensure your hypothesis addresses the issue.
5. Proof your hypothesis
After preparing the first draft of your hypothesis, you need to inspect it thoroughly. It should tick all the boxes, like being concise, straightforward, relevant, and accurate. Your final hypothesis has to be well-structured as well.
Research projects are an exciting and crucial part of being a scholar. And once you have your research question, you need a great hypothesis to begin conducting research. Thus, knowing how to write a hypothesis is very important.
Now that you have a firmer grasp on what a good hypothesis constitutes, the different kinds there are, and what process to follow, you will find it much easier to write your hypothesis, which ultimately helps your research.
Now it's easier than ever to streamline your research workflow with SciSpace Discover. Its integrated end-to-end research platform allows scholars to easily discover, write, and publish their research and fosters collaboration.
It includes everything you need, including a repository of over 270 million research papers across disciplines, SEO-optimized summaries and public profiles to show your expertise and experience.
If you found these tips on writing a research hypothesis useful, head over to our blog on Statistical Hypothesis Testing to learn about the top researchers, papers, and institutions in this domain.
Frequently Asked Questions (FAQs)
1. What is the definition of a hypothesis?
According to the Oxford dictionary, a hypothesis is defined as “An idea or explanation of something that is based on a few known facts, but that has not yet been proved to be true or correct”.
2. What is an example of hypothesis?
A hypothesis is a statement that proposes a relationship between two or more variables. An example: "If we increase the number of new users who join our platform by 25%, then we will see an increase in revenue."
3. What is an example of null hypothesis?
A null hypothesis, written as H0, states that there is no relationship between two variables, that is, no effect. For example, if you're studying whether a particular type of exercise increases strength, your null hypothesis will be "there is no difference in strength between people who exercise and people who don't."
4. What are the types of research?
• Fundamental research
• Applied research
• Qualitative research
• Quantitative research
• Mixed research
• Exploratory research
• Longitudinal research
• Cross-sectional research
• Field research
• Laboratory research
• Fixed research
• Flexible research
• Action research
• Policy research
• Classification research
• Comparative research
• Causal research
• Inductive research
• Deductive research
5. How to write a hypothesis?
• Your hypothesis should be able to predict the relationship and outcome.
• Avoid wordiness by keeping it simple and brief.
• Your hypothesis should contain observable and testable outcomes.
• Your hypothesis should be relevant to the research question.
6. What are the 2 types of hypothesis?
• Null hypotheses are used to test the claim that "there is no difference between two groups of data".
• Alternative hypotheses test the claim that "there is a difference between two data groups".
7. Difference between research question and research hypothesis?
A research question is a broad, open-ended question you will try to answer through your research. A hypothesis is a statement, based on prior research or theory, that you expect your study to support. Example - Research question: What are the factors that influence the adoption of the new technology? Research hypothesis: Age, education, and income level are positively related to the adoption of the new technology.
8. What is plural for hypothesis?
The plural of hypothesis is hypotheses. Here's an example of how it would be used in a statement, "Numerous well-considered hypotheses are presented in this part, and they are supported by tables and figures that are well-illustrated."
9. What is the red queen hypothesis?
The red queen hypothesis in evolutionary biology states that species must constantly evolve to avoid extinction because if they don't, they will be outcompeted by other species that are evolving. Leigh Van Valen first proposed it in 1973; since then, it has been tested and substantiated many times.
10. Who is known as the father of null hypothesis?
The father of the null hypothesis is Sir Ronald Fisher. His 1925 book, Statistical Methods for Research Workers, popularized significance testing, and he introduced the term "null hypothesis" itself in The Design of Experiments (1935).
11. When to reject null hypothesis?
To reject the null hypothesis, you need to find a statistically significant difference between your two populations. You can determine that by running statistical tests such as an independent-samples t-test or a dependent-samples (paired) t-test. By convention, you reject the null hypothesis if the p-value is less than your chosen significance level, commonly 0.05.
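The decision rule can be sketched in a few lines of Python. To keep the example dependency-free, it swaps the t-test for a permutation test, a different but closely related way of getting a p-value for a difference in means. The strength scores, the function name, and the ALPHA constant are all invented for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical strength scores for two groups of eight people.
exercisers = [82, 90, 88, 95, 91, 86, 93, 89]
non_exercisers = [75, 80, 78, 84, 79, 77, 83, 81]

ALPHA = 0.05  # conventional threshold; choose it before collecting data
p_value = permutation_p_value(exercisers, non_exercisers)
reject_null = p_value < ALPHA
print(f"p-value = {p_value:.4f}; reject H0: {reject_null}")
```

If a statistics library is available, its t-test routine would play the same role as `permutation_p_value` here; the comparison against the significance level stays the same.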
How to Write a Great Hypothesis
Hypothesis Format, Examples, and Tips
Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."
Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk, "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.
A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study.
For example, a study designed to look at the relationship between sleep deprivation and test performance might have a hypothesis that states: "This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."
This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.
The Hypothesis in the Scientific Method
In the scientific method, whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:
- Forming a question
- Performing background research
- Creating a hypothesis
- Designing an experiment
- Collecting data
- Analyzing the results
- Drawing conclusions
- Communicating the results
The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. It is only at this point that researchers begin to develop a testable hypothesis. Unless you are creating an exploratory study, your hypothesis should always explain what you expect to happen.
In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.
Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore a number of factors to determine which ones might contribute to the ultimate outcome.
In many cases, researchers may find that the results of an experiment do not support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.
Researchers often draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low stress levels."
In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of folk wisdom that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."
Elements of a Good Hypothesis
So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:
- Is your hypothesis based on your research on a topic?
- Can your hypothesis be tested?
- Does your hypothesis include independent and dependent variables?
Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the journal articles you read. Many authors will suggest questions that still need to be explored.
To form a hypothesis, you should take these steps:
- Collect as many observations about a topic or problem as you can.
- Evaluate these observations and look for possible causes of the problem.
- Create a list of possible explanations that you might want to explore.
- After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.
Falsifiability of a Hypothesis
In the scientific method, falsifiability is an important part of any valid hypothesis. In order to test a claim scientifically, it must be possible that the claim could be proven false.
Students sometimes confuse the idea of falsifiability with the idea that something is false, which is not the case. What falsifiability means is that if something were false, it would be possible to demonstrate that it is false.
One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.
Operational Definitions
A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. However, the researcher must also define how the variable will be manipulated and measured in the study.
For example, a researcher might operationally define the variable "test anxiety" as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.
These precise descriptions are important because many things can be measured in a number of different ways. One of the basic principles of any type of scientific research is that the results must be replicable. By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.
Some variables are more difficult than others to define. How would you operationally define a variable such as aggression? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.
In order to measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming other people. In this situation, the researcher might utilize a simulated task to measure aggressiveness.
Hypothesis Checklist
- Does your hypothesis focus on something that you can actually test?
- Does your hypothesis include both an independent and dependent variable?
- Can you manipulate the variables?
- Can your hypothesis be tested without violating ethical standards?
Types of Hypotheses
The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:
- Simple hypothesis: This type of hypothesis suggests that there is a relationship between one independent variable and one dependent variable.
- Complex hypothesis: This type of hypothesis suggests a relationship between three or more variables, such as two independent variables and a dependent variable.
- Null hypothesis: This hypothesis suggests no relationship exists between two or more variables.
- Alternative hypothesis: This hypothesis states the opposite of the null hypothesis.
- Statistical hypothesis: This hypothesis uses statistical analysis to evaluate a representative sample of the population and then generalizes the findings to the larger group.
- Logical hypothesis: This hypothesis assumes a relationship between variables without collecting data or evidence.
A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the dependent variable if you change the independent variable.
The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."
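As a small illustration of this format, the sketch below fills the two slots of the "If ..., then ..." template from separate fields, which forces you to name the change to the independent variable and the expected change in the dependent variable explicitly. The `Hypothesis` class and its field names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    independent_change: str  # what is changed or compared
    dependent_outcome: str   # what is expected to be observed as a result

    def if_then(self) -> str:
        """Render the hypothesis in the basic 'If ..., then ...' format."""
        return f"If {self.independent_change}, then {self.dependent_outcome}."


breakfast = Hypothesis(
    independent_change="students eat breakfast before a math exam",
    dependent_outcome="they will score higher than students who do not eat breakfast",
)
print(breakfast.if_then())
```

Keeping the two parts separate makes it easy to spot a draft hypothesis that is missing one of them.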
A few examples of simple hypotheses:
- "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
- "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."
- "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."
Examples of a complex hypothesis include:
- "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
- "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."
Examples of a null hypothesis include:
- "Children who receive a new reading intervention will have scores no different from those of students who do not receive the intervention."
- "There will be no difference in scores on a memory recall task between children and adults."
Examples of an alternative hypothesis:
- "Children who receive a new reading intervention will perform better than students who did not receive the intervention."
- "Adults will perform better on a memory task than children."
Collecting Data on Your Hypothesis
Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.
Descriptive Research Methods
Descriptive research methods such as case studies, naturalistic observations, and surveys are often used when it would be impossible or difficult to conduct an experiment. These methods are best used to describe different aspects of a behavior or psychological phenomenon.
Once a researcher has collected data using descriptive methods, a correlational study can then be used to look at how the variables are related. This type of research method might be used to investigate a hypothesis that is difficult to test experimentally.
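One common way to quantify how two variables are related in a correlational study is the Pearson correlation coefficient. The sketch below computes it from scratch with the standard library; the sleep and test-score figures are invented, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical observations: hours of sleep and next-day test scores.
hours_of_sleep = [4, 5, 6, 7, 8]
test_scores = [55, 60, 70, 72, 80]

r = pearson_r(hours_of_sleep, test_scores)
print(f"r = {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship, but on its own it says nothing about which variable, if either, causes the other.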
Experimental Research Methods
Experimental methods are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).
Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship—whether changes in one variable actually cause another to change.
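In practice, experiments usually support causal claims by randomly assigning participants to conditions, so that the groups differ only in the independent variable. Random assignment is not spelled out above, so treat the following as an assumed, minimal sketch of that step; the participant IDs and the `randomly_assign` helper are hypothetical.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant IDs P01..P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=42)
print(groups)
```

The treatment group then receives the manipulated level of the independent variable, and the dependent variable is measured in both groups.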
A Word From Verywell
The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.
Frequently Asked Questions
Some examples of how to write a hypothesis include:
- "Staying up late will lead to worse test performance the next day."
- "People who consume one apple each day will visit the doctor fewer times each year."
- "Breaking study sessions up into three 20-minute sessions will lead to better test results than a single 60-minute study session."
The four parts of a hypothesis are:
- The research question
- The independent variable (IV)
- The dependent variable (DV)
- The proposed relationship between the IV and DV
What is and How to Write a Good Hypothesis in Research?
One of the most important aspects of conducting research is constructing a strong hypothesis. But what makes a hypothesis in research effective? In this article, we’ll look at the difference between a hypothesis and a research question, as well as the elements of a good hypothesis in research. We’ll also include some examples of effective hypotheses, and what pitfalls to avoid.
What is a Hypothesis in Research?
Simply put, a hypothesis is a research question that also includes the predicted or expected result of the research. Without a hypothesis, there can be no basis for a scientific or research experiment. As such, it is critical that you carefully construct your hypothesis by being deliberate and thorough, even before you set pen to paper. Unless your hypothesis is clearly and carefully constructed, any flaw can have an adverse, and even grave, effect on the quality of your experiment and its subsequent results.
Research Question vs Hypothesis
It’s easy to confuse research questions with hypotheses, and vice versa. While they’re both critical to the Scientific Method, they have very specific differences. Primarily, a research question, just like a hypothesis, is focused and concise. But a hypothesis includes a prediction based on the proposed research, and is designed to forecast the relationship of and between two (or more) variables. Research questions are open-ended, and invite debate and discussion, while hypotheses are closed, e.g. “The relationship between A and B will be C.”
A hypothesis is generally used if your research topic is fairly well established, and you are relatively certain about the relationship between the variables that will be presented in your research. Since a hypothesis is ideally suited for experimental studies, it will, by its very existence, affect the design of your experiment. The research question is typically used for new topics that have not yet been researched extensively. Here, the relationship between different variables is less known. There is no prediction made, but there may be variables explored. The research question can be causal in nature, simply trying to understand if a relationship even exists, or it can be descriptive or comparative.
How to Write Hypothesis in Research
Writing an effective hypothesis starts before you even begin to type. Like any task, preparation is key, so you start first by conducting research yourself, and reading all you can about the topic that you plan to research. From there, you’ll gain the knowledge you need to understand where your focus within the topic will lie.
Remember that a hypothesis is a prediction of the relationship that exists between two or more variables. Your job is to write a hypothesis, and design the research, to “prove” whether or not your prediction is correct. A common pitfall is to use judgments that are subjective and inappropriate for the construction of a hypothesis. It’s important to keep the focus and language of your hypothesis objective.
An effective hypothesis in research is clearly and concisely written, with any terms clarified and defined. Specific language must also be used to avoid generalities or assumptions.
Use the following points as a checklist to evaluate the effectiveness of your research hypothesis:
- Predicts the relationship and outcome
- Simple and concise – avoid wordiness
- Clear with no ambiguity or assumptions about the readers’ knowledge
- Observable and testable results
- Relevant and specific to the research question or problem
Research Hypothesis Example
Perhaps the best way to evaluate whether or not your hypothesis is effective is to compare it to those of your colleagues in the field. There is no need to reinvent the wheel when it comes to writing a powerful research hypothesis. As you’re reading and preparing your hypothesis, you’ll also read other hypotheses. These can help guide you on what works, and what doesn’t, when it comes to writing a strong research hypothesis.
Here are a few generic examples to get you started.
Eating an apple each day, after the age of 60, will result in a reduction of frequency of physician visits.
Budget airlines are more likely to receive more customer complaints. A budget airline is defined as an airline that offers lower fares and fewer amenities than a traditional full-service airline. (Note that the term “budget airline” is included in the hypothesis.)
Workplaces that offer flexible working hours report higher levels of employee job satisfaction than workplaces with fixed hours.
Each of the above examples is specific, observable and measurable, and the statement of prediction can be verified or shown to be false by utilizing standard experimental practices. It should be noted, however, that often your hypothesis will change as your research progresses.

- Research article
- Open Access
- Published: 25 June 2018
Identification of research hypotheses and new knowledge from scientific literature
- Matthew Shardlow 1 ,
- Riza Batista-Navarro 1 ,
- Paul Thompson 1 ,
- Raheel Nawaz 1 ,
- John McNaught 1 &
- Sophia Ananiadou ORCID: orcid.org/0000-0002-4097-9191 1
BMC Medical Informatics and Decision Making, volume 18, Article number: 46 (2018)
Text mining (TM) methods have been used extensively to extract relations and events from the literature. In addition, TM techniques have been used to extract various types or dimensions of interpretative information, known as Meta-Knowledge (MK), from the context of relations and events, e.g. negation, speculation, certainty and knowledge type. However, most existing methods have focussed on the extraction of individual dimensions of MK, without investigating how they can be combined to obtain even richer contextual information. In this paper, we describe a novel, supervised method to extract new MK dimensions that encode Research Hypotheses (an author’s intended knowledge gain) and New Knowledge (an author’s findings). The method incorporates various features, including a combination of simple MK dimensions.
We identify previously explored dimensions and then use a random forest to combine these with linguistic features into a classification model. To facilitate evaluation of the model, we have enriched two existing corpora annotated with relations and events, i.e., a subset of the GENIA-MK corpus and the EU-ADR corpus, by adding attributes to encode whether each relation or event corresponds to Research Hypothesis or New Knowledge. In the GENIA-MK corpus, these new attributes complement simpler MK dimensions that had previously been annotated.
We show that our approach is able to assign different types of MK dimensions to relations and events with a high degree of accuracy. Firstly, our method is able to improve upon the previously reported state of the art performance for an existing dimension, i.e., Knowledge Type. Secondly, we also demonstrate high F1-score in predicting the new dimensions of Research Hypothesis (GENIA: 0.914, EU-ADR 0.802) and New Knowledge (GENIA: 0.829, EU-ADR 0.836).
We have presented a novel approach for predicting New Knowledge and Research Hypothesis, which combines simple MK dimensions to achieve high F1-scores. The extraction of such information is valuable for a number of practical TM applications.
The goal of information extraction (IE) is to automatically distil and structure associations from unstructured text, with the aim of making it easier to locate information of interest in huge volumes of text. Within biomedical research articles, the textual context of a particular piece of knowledge often provides clues as to its current status along the ‘research journey’ timeline. Sentences (1)–(3) below exemplify a number of different points along the research timeline regarding the establishment of an association between Interleukin-17 (IL-17) and psoriasis. The association is firstly introduced in (1) as a hypothesis to be investigated. In (2), which is taken from the same paper [ 1 ], the putative association is backed up by initial experimental evidence. Sentence (3) comes from a paper published 10 years later [ 2 ], by which time the association is presented as widely accepted knowledge, presumably on the basis of many further positive experimental results.
(1) ‘To investigate the role of Interleukin-17 (IL-17) in the pathogenesis of psoriasis...’ (2) ‘These findings indicate that up-regulated expression of IL-17 might be involved in the pathogenesis of psoriasis.’ (3) ‘IL-17 is a critical factor in the pathogenesis of psoriasis and other inflammatory diseases.’
There is a strong need to identify different types of emerging knowledge, such as those shown in sentences (1–2), in a number of different scenarios. It has been shown elsewhere that incorporating this type of information improves the automated curation of biomedical networks and models [ 3 ].
In processing sentences (1)–(3) above, a typical IE system would firstly detect that Interleukin-17 and IL-17 are phrases that describe the same gene concept and that psoriasis represents a disease concept. Subsequently, the system would recognise that a specific association exists between these concepts. These associations may be binary relations between concepts, which encode that a specific type of association exists, or they may be events, which encode complex n-ary relations between a trigger word and multiple concepts or other events. Figure 1 shows the specific characteristics of both a relation and an event using the visualisation of the brat rapid annotation tool [ 4 ]. The output of the IE system would allow the location of all sentences within a large document collection, regardless of their varied phrasing, that explicitly mention the same association, or those mentioning other related types of associations, e.g., to find different genes that have an association with psoriasis. The structured associations that are extracted may subsequently be used as input to further stages of reasoning or data mining. Many IE systems would consider that sentences (1)–(3) each convey exactly the same information, since most such systems only take into account the key information and not the wider context. Recently, however, there has been a trend towards detecting various aspects of contextual/interpretative information (such as negation or speculation) automatically [ 5 – 8 ].

Fig. 1: An example of two sentences, one containing events and the other containing one relation. The first sentence shows two events. The first event in the sentence concerns the term ‘activation’, which is a type of positive regulation. The theme of this event is ‘NF-kappaB’, indicating that this protein is being activated. The next event in the sentence is centered around ‘dependent’, which is a type of positive regulation. This event has the cause ‘oxidative stress’ and its theme is the first event in the sentence. The example of a relation between two entities is, in contrast to the event, much simpler. The relation indicates that NPTN is related to Schizophrenia in a relation that can be categorised as ‘Target-Disorder’.
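The contrast between a relation and an event can be made concrete with a small data-structure sketch. This is only an illustration of the two shapes described in the caption, not the representation used by any particular IE system. The class names, the `depth` helper and the entity type labels ("Target", "Disorder", "Protein", "Process") are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Entity:
    """A concept mention such as a gene, protein or disease."""
    text: str
    type: str  # type labels used below are illustrative assumptions


@dataclass(frozen=True)
class Relation:
    """A binary, typed association between exactly two entities."""
    type: str
    arg1: Entity
    arg2: Entity


@dataclass(frozen=True)
class Event:
    """An n-ary structure anchored on a trigger word.

    Participants may be entities or other events, which is what
    allows events to nest inside one another.
    """
    trigger: str
    type: str
    theme: Union[Entity, "Event"]
    cause: Optional[Union[Entity, "Event"]] = None


# Relation from the caption: NPTN is related to Schizophrenia.
relation = Relation(
    type="Target-Disorder",
    arg1=Entity("NPTN", "Target"),
    arg2=Entity("Schizophrenia", "Disorder"),
)

# Events from the caption: 'activation' of NF-kappaB, and 'dependent',
# whose cause is oxidative stress and whose theme is the first event.
activation = Event(
    trigger="activation",
    type="Positive_regulation",
    theme=Entity("NF-kappaB", "Protein"),
)
dependent = Event(
    trigger="dependent",
    type="Positive_regulation",
    theme=activation,
    cause=Entity("oxidative stress", "Process"),
)


def depth(event: Event) -> int:
    """Return the nesting depth of an event (1 for a flat event)."""
    inner = [p for p in (event.theme, event.cause) if isinstance(p, Event)]
    return 1 + max((depth(p) for p in inner), default=0)
```

A relation always has depth one by construction, whereas `depth(dependent)` reflects that one event takes another as its theme.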
In this work, we focus on the automatic assignment of two interpretative dimensions to relations and events extracted by text mining tools. Specifically, we aim to determine whether or not each relation and event corresponds to a Research Hypothesis, as in sentence (1), or to New Knowledge, as in sentence (2). To the best of our knowledge, this work represents the first effort to apply a supervised approach to detect this type of information at such a fine-grained level.
We envisage that the recognition of these two interpretative dimensions is valuable in tasks where the discovery of emerging knowledge is important. To demonstrate the utility and portability of our method, we show that it can be used to enrich instances of both events and relations.
Related work
The task of automatically classifying knowledge contained within scientific literature according to its intended interpretation has long been recognised as an important step towards helping researchers to make sense of the information reported, and to allow important details to be located in an efficient manner. Previous work, focussing either on general scientific text or biomedical text, has aimed to assign interpretative information to continuous textual units, varying in granularity from segments of sentences to complete paragraphs, but most frequently concerning complete sentences. Specific aspects of interpretation addressed have included negation [ 5 ], speculation [ 6 – 8 ], general information content/rhetorical intent, e.g., background, methods, results, insights, etc. [ 9 – 12 ] and the distinction between novel information and background knowledge [ 13 , 14 ].
Despite the demonstrated utility of approaches such as the above, performing such classifications at the level of continuous text spans is not straightforward. For example, a single sentence or clause can introduce multiple types of information (e.g., several interactions or associations), each of which may have a different interpretation, in terms of speculation, negation, research novelty, etc. As can be seen from Fig. 1, events and relations can structure and categorise the potentially complex information that is described in a continuous text span. Following on from the successful development of IE systems that are able to extract both gene-disease relations [ 15 – 17 ] and biomolecular events [ 18 , 19 ], there has been a growing interest in the task of assigning interpretative information to relations and events. However, given that a single sentence may contain multiple events or relations, the challenge is to determine whether and how the interpretation of each of these structures is affected by the presence of particular words or phrases in the sentence that denote negation or speculation, etc.
IE systems are typically developed by applying supervised or semi-supervised methods to annotated corpora marked up with relations and events. There have been several efforts to manually enrich corpora with interpretative information, such that it is possible to train models to determine automatically how particular types of contextual information in a sentence affect the interpretation of different events and relations. Most work on enriching relations and events has been focussed on one or two specific aspects of interpretation (e.g., negation [ 20 , 21 ] and/or speculation [ 22 , 23 ]). Subsequent work has shown that these types of information can be detected automatically [ 24 , 25 ].
In contrast, work on Meta-Knowledge (MK) captures a wider range of contextual information, integrating and building upon various aspects of the above-mentioned schemes to create a number of separate ‘dimensions’ of information, which are aimed at capturing subtle differences in the interpretation of relations and events. Domain-specific versions of the MK scheme have been created to enrich complex event structures in two different domain corpora, i.e., the ACE-MK corpus [ 26 ], which enriches the general domain news-related events of the ACE2005 corpus [ 27 ], and the GENIA-MK corpus [ 28 ], which adds MK to the biomolecular interactions captured as events in the GENIA event corpus [ 22 ]. Recent work has focussed on the detection of uncertainty around events in the GENIA-MK Corpus. Uncertainty was detected using a hybrid approach of rules and machine learning. The authors were able to show that incorporating uncertainty into a pathway modelling task led to an improvement in curator performance [ 3 ].
The GENIA-MK annotation scheme defines five distinct core dimensions of MK for events, each of which has a number of possible values, as shown in Fig. 2:
- Knowledge Type, which categorises the knowledge that the author wishes to express into one of: Observation, Investigation, Analysis, Method, Fact or Other.
- Knowledge Source, which encodes whether the author presents the knowledge as part of their own work (Current), or whether it is referring to previous work (Other).
- Polarity, which is set to Positive if the event took place, and to Negative if it is negated, i.e., it did not take place.
- Manner, which denotes the event’s intensity, i.e., High, Low or Neutral.
- Certainty Level or Uncertainty, which indicates how certain an event is. It may be certain (L3), probable (L2) or possible (L1).

Fig. 2: The GENIA-MK annotation scheme. There are five Meta-Knowledge dimensions introduced by Thompson et al., as well as two further hyperdimensions.
These five dimensions are considered to be independent of one another, in that the value of one dimension does not affect the value of any other dimension. There may, however, be emergent correlations between the dimensions (i.e., an event with the MK value ‘Knowledge Source=Other’ is more frequently negated), which occur due to the characteristics of the events. Previous work using the GENIA-MK corpus has demonstrated the feasibility of automatically recognising one or more of the MK dimensions [ 29 – 31 ]. In addition to the five core dimensions, Thompson et al. [ 28 ] introduced the notion of hyperdimensions (i.e., New Knowledge and Hypothesis), which represent higher level dimensions of information whose values are determined according to specific combinations of values that are assigned to different core MK dimensions. These hyperdimensions are also represented in Fig. 2. We build upon these approaches in our own work to develop novel techniques for the recognition of New Knowledge and Hypothesis, which take into account several of the core MK dimensions described above, as well as other features pertaining to the structure of the event and sentence.
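As a rough illustration of the five core dimensions and their independence, the scheme can be written down as a set of enumerations. The dimension names and values come from the description above; the Python class names, the `MetaKnowledge` container and the `all_combinations` helper are assumptions made for this sketch and are not part of the GENIA-MK tooling.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class KnowledgeType(Enum):
    OBSERVATION = "Observation"
    INVESTIGATION = "Investigation"
    ANALYSIS = "Analysis"
    METHOD = "Method"
    FACT = "Fact"
    OTHER = "Other"


class KnowledgeSource(Enum):
    CURRENT = "Current"  # the authors' own work
    OTHER = "Other"      # previous work


class Polarity(Enum):
    POSITIVE = "Positive"
    NEGATIVE = "Negative"


class Manner(Enum):
    HIGH = "High"
    LOW = "Low"
    NEUTRAL = "Neutral"


class CertaintyLevel(Enum):
    L3 = "certain"
    L2 = "probable"
    L1 = "possible"


@dataclass(frozen=True)
class MetaKnowledge:
    """One value per core dimension, attached to a single event."""
    knowledge_type: KnowledgeType
    knowledge_source: KnowledgeSource
    polarity: Polarity
    manner: Manner
    certainty: CertaintyLevel


def all_combinations():
    """Enumerate every value combination the scheme permits.

    Because the dimensions are independent, no combination is ruled
    out by the scheme itself, so this is a plain Cartesian product.
    """
    dimensions = (KnowledgeType, KnowledgeSource, Polarity, Manner, CertaintyLevel)
    return [MetaKnowledge(*values) for values in product(*dimensions)]
```

Under this reading, independence means that all 6 × 2 × 2 × 3 × 3 combinations are permitted in principle, even though some occur far more often than others in practice.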
Our work took as its starting point the MK hyperdimensions defined by Thompson et al. [ 28 ], since we are also interested in identifying relations and events that describe hypotheses or new knowledge. However, we found a number of issues with the original work on these hyperdimensions. Firstly, Thompson et al. [ 28 ] did not provide clear definitions of ‘Hypothesis’ and ‘New Knowledge’. In response, we have formulated concise definitions for each of them, as shown below. Secondly, by performing an analysis of events that takes into account these definitions, we found that it was not possible to reliably and consistently identify events that describe new knowledge or hypotheses based only on the values of the core MK dimensions. As such, we decided to carry out a new annotation effort to mark up both ‘Research Hypothesis’ and ‘New Knowledge’ as independent MK dimensions (i.e., their values do not necessarily have any dependence on the values of other core MK dimensions), and to explore supervised, rather than rule-based methods, to facilitate their automated recognition.
Annotation guidelines
The starting point for our novel annotation effort was our tightened definitions of Research Hypothesis and New Knowledge; our initial definitions were refined throughout the process of annotation. As the definitions and guidelines evolved, we asked the annotators to revisit previously annotated documents in each new round. Our final definitions are presented below:
Research Hypothesis: A relation or event is considered as a Research Hypothesis if it encompasses a statement of the authors’ anticipated knowledge gain. This is shown in examples (1) and (2) in Table 1.
New Knowledge: A relation or event is considered as New Knowledge if it corresponds to a novel research outcome resulting from the work the author is describing, as per examples (3) and (4) in Table 1.
Table 1: Examples of sentences containing research hypotheses and new knowledge
Whereas the value assigned to each of the core MK dimensions of Thompson et al. is completely independent of the values assigned to the other core dimensions, our newly introduced dimensions do not maintain this independence. Rather, Research Hypothesis and New Knowledge possess the property of mutual exclusivity, as an event or relation cannot be simultaneously both a Research Hypothesis and New Knowledge. We chose to enrich two different corpora with attributes encoding Research Hypothesis and New Knowledge, i.e., a subset of the biomolecular interactions annotated as events in the GENIA-MK corpus [ 28 ], and the biomarker-relevant relations involving genes, diseases and treatments in the EU-ADR corpus [ 23 ]. Leveraging the previously-added core MK annotations in the GENIA-MK corpus, we explored how these can contribute to the accurate recognition of New Knowledge and Research Hypothesis. Specifically, we have introduced new approaches for predicting the values of the core Knowledge Type and Knowledge Source dimensions, demonstrating an improvement over the former state of the art for Knowledge Type. We subsequently use supervised methods to automatically detect New Knowledge and Research Hypothesis, incorporating the values of Knowledge Type, Knowledge Source and Uncertainty as features into the trained models.
The GENIA-MK corpus consists of one thousand MEDLINE abstracts on the subject of transcription factors in human blood cells, which have been annotated with a range of entities and events that provide detailed, structured information about various types of biomolecular interactions that are described in text. In the GENIA-MK corpus, values for all five core MK dimensions are already manually annotated for all of the 36,000 events. The MK annotation effort also involved the identification of ‘clue words’, i.e., words or phrases that provide evidence for the assignment of values for particular MK dimensions. For example, the word ‘suggest’ would be annotated as a clue both for Uncertainty and Knowledge Type, as it indicates that the information encoded in the event is stated based on a speculative analysis of results.
The EU-ADR corpus consists of three sets of 100 MEDLINE abstracts, each obtained using different PubMed queries aimed at retrieving abstracts that are likely to contain three specific types of relations (i.e., gene-disease, gene-drug and drug-disease), the former two of which can be important in discovering how different types of genetic information influence disease susceptibility and treatment response. The original annotation task involved identifying three types of entities, i.e., targets (proteins, genes and variants), diseases and drugs, together with relationships between these entity types, where these are present. In contrast to the richness of the event representations in the GENIA-MK corpus, each relation annotation in the EU-ADR corpus consists only of links between entities of two specific types. Relations were annotated in 159 of the 300 abstracts selected for inclusion in the corpus.
Annotation of new knowledge and research hypothesis
As an initial step of our work, subsets of GENIA-MK and EU-ADR were manually enriched with additional annotations, which identify those events or relations corresponding to Research Hypotheses or New Knowledge. Since high quality annotations are key to ensuring that accurate supervised models can be trained, we engaged with a number of experts and carried out an exploratory annotation exercise prior to the final annotation effort, in order to ensure the highest possible inter-annotator agreement (IAA).
Initially, we worked with two domain experts, a text mining researcher and a medical professional. They added the novel MK annotations to events that had been automatically detected in sentences from full-text papers. We found, however, that there were some issues with this annotation set-up. Firstly, we found that events denoting Research Hypotheses and New Knowledge were very sparse in full papers. Secondly, we found that isolated sentences often provided insufficient context for annotators to determine accurately whether or not the event described new knowledge or a hypothesis. Finally, we found that errors in the automatically detected events were distracting the annotators from the task at hand. Based on these findings, we decided not to pursue this approach, and instead focussed our annotation efforts on annotating Research Hypotheses and New Knowledge in abstracts containing gold-standard, expert-annotated events and relations, whose quality had previously been verified. Since abstracts also generally contain denser and more consolidated statements of New Knowledge and Research Hypotheses than full papers [ 32 ], we also expected that this approach would produce more useful training data.
We then employed two PhD students (both working in disciplines related to biological sciences) to carry out the next round of annotation work. We held regular meetings to discuss new annotations and provided feedback as necessary. A subset of the abstracts was doubly annotated by both annotators, allowing us to evaluate the annotation quality by calculating IAA using Cohen’s Kappa [ 33 ].
Table 2, which shows IAA at three different points during the annotation process, illustrates a steady increase in IAA as time progressed and as more discussions were held, demonstrating a convergence towards a common understanding of the guidelines by the two annotators. We obtained a final agreement above 0.8 on most dimensions, indicating a strong level of agreement [ 34 ]. Annotation of Research Hypothesis in the EU-ADR corpus achieved slightly lower agreement of 0.761, indicating moderate agreement between the annotators [ 34 ]. At the end of the annotation process, the annotators were asked to revisit their earlier annotations to make revisions based on their enhanced understanding of the guidelines. Remaining discrepancies were resolved by the lead author after consultation with both annotators.
Each annotator marked up 112 abstracts from the EU-ADR corpus (70 of which were doubly annotated), and 100 abstracts from the GENIA-MK corpus (50 of which were doubly annotated). This resulted in a total of 150 GENIA-MK abstracts and 159 EU-ADR abstracts annotated with New Knowledge and Research Hypothesis. Statistics on the final corpus are shown in Table 3.
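For readers unfamiliar with the agreement measure, Cohen's Kappa compares the observed agreement between two annotators with the agreement expected by chance given each annotator's label frequencies. The following is a minimal sketch of the standard formula, not the authors' evaluation script; the function name and the toy labels are assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example with hypothetical labels for four events.
annotator_1 = ["New Knowledge", "New Knowledge", "Other", "Other"]
annotator_2 = ["New Knowledge", "Other", "Other", "Other"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In the toy example the annotators agree on three of four items (observed agreement 0.75) while chance agreement is 0.5, which gives a Kappa of 0.5.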
Baseline method for new knowledge and research hypothesis
Thompson et al. [ 28 ] suggest a method for detecting new knowledge and hypothesis based on automatic inferences from core MK values. Their inferences state that an event will be an instance of new knowledge if the Knowledge Source dimension is equal to ‘Current’, the Uncertainty dimension is equal to ‘L3’ (equivalent to ‘Certain’ in our work, see below) and the Knowledge Type dimension is equal to either ‘Observation’ or ‘Analysis’. Similarly, according to their inferences, an event will be an instance of Hypothesis if the Knowledge Type dimension is equal to ‘Analysis’ and Uncertainty is equal to either ‘L2’ or ‘L1’ (which are both equivalent to ‘Uncertain’ in our work, see below).
We use these automated inferences as a baseline for our techniques. To best reflect the work of Thompson et al. [ 28 ], we use their manually annotated values of Knowledge Type, Uncertainty and Knowledge Source for the GENIA-MK corpus. This allows us to compare our own work with previous efforts, as well as providing a lower bound for the performance of a rule based system, which we contrast with our supervised learning system, as introduced in the next section.
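The inference rules described above are simple enough to write out directly. The sketch below encodes them as two predicates over the core MK values, together with the mapping from three certainty levels to the binary Certain/Uncertain distinction. The function names and the use of plain strings for dimension values are assumptions made for illustration.

```python
CERTAINTY_LEVELS = ("L1", "L2", "L3")


def to_binary_uncertainty(level):
    """Collapse the three certainty levels into Certain / Uncertain."""
    if level not in CERTAINTY_LEVELS:
        raise ValueError(f"unknown certainty level: {level!r}")
    return "Certain" if level == "L3" else "Uncertain"


def is_new_knowledge(knowledge_source, knowledge_type, level):
    """Rule: Source is Current, certainty is L3, Type is Observation or Analysis."""
    return (
        knowledge_source == "Current"
        and level == "L3"
        and knowledge_type in {"Observation", "Analysis"}
    )


def is_hypothesis(knowledge_type, level):
    """Rule: Type is Analysis and certainty is L2 or L1."""
    return knowledge_type == "Analysis" and level in {"L1", "L2"}
```

Note that the two rules can never fire for the same event, since one requires L3 and the other requires L1 or L2.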
A supervised method for extracting new knowledge and research hypothesis
We took a supervised approach to annotating events with instances of our target dimensions of New Knowledge and Research Hypothesis. According to the previously mentioned intrinsic links to the core MK dimensions of Knowledge Source, Knowledge Type and Uncertainty, we incorporated the values of these dimensions as features that are used by our classifiers.
Uncertainty
For the Uncertainty dimension, we used an existing system [ 3 ]. Adopting their treatment of Uncertainty, we differ from Thompson et al. [ 28 ] as we use only two levels (certain and uncertain), as opposed to their three levels (L3 = certain, L2 = probable and L1 = possible). Since our development of the original MK scheme, we have experimented with and discussed different levels of granularity for this dimension with domain experts, and have concluded that the differences between the two different levels of uncertainty in our original scheme (i.e., L1 and L2) are often too subtle to be of benefit in practical scenarios. Therefore, it was decided to focus instead on the binary distinction between certainty and uncertainty.
Knowledge source
The Knowledge Source dimension distinguishes events that encode information originating from an author’s own work (Knowledge Source = Current), from those describing work from an alternative source (Knowledge Source = Other). Such information is relevant to the identification of New Knowledge, as a relation or event that corresponds to information reported in background literature definitely cannot be classed as New Knowledge. Attribution by citation is a well-established practice in the scientific literature. Citations can be expressed heterogeneously between documents, but are typically expressed homogeneously within a single document, or a collection of similarly-sourced documents. We used regular expressions to identify citations following the work of Miwa et al. [ 35 ], in conjunction with a set of clue expressions that aim to detect background knowledge in cases where no citation is given. These include statements such as ‘we previously showed…’ or ‘as seen in our former work’. Whereas Miwa et al. use a supervised learning method to detect Knowledge Source, we found that supervised learning approaches overfitted to the overwhelming majority class (Knowledge Source = Current) in the GENIA-MK dataset. This meant that we suffered poor performance on unseen data, such as the EU-ADR corpus. To alleviate this, we simply used the regular expression feature as described above as an indicator of Knowledge Source being ‘Other’. A list of our regular expressions and clue expressions is made available as part of the Additional files.
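To give a flavour of this rule-based indicator, the sketch below flags a sentence as Knowledge Source = Other when it contains either a citation-like pattern or a background clue expression. The patterns here are hypothetical stand-ins written for this example; the expressions actually used are those listed in the Additional files, which are not reproduced here.

```python
import re

# Hypothetical citation patterns (assumptions, not the authors' list).
CITATION_PATTERNS = [
    # Numeric citations such as [12], [3, 4] or [5–8].
    re.compile(r"\[\s*\d+(?:\s*[,\-\u2013]\s*\d+)*\s*\]"),
    # Author-year citations such as (Smith et al., 2010).
    re.compile(r"\(\s*[A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\s*\)"),
]

# Hypothetical clue expressions modelled on the two examples in the text.
CLUE_PATTERNS = [
    re.compile(r"\bwe (?:have )?previously (?:showed|shown|reported|demonstrated)\b", re.IGNORECASE),
    re.compile(r"\bin our (?:former|previous|earlier) (?:work|study)\b", re.IGNORECASE),
]


def knowledge_source(sentence):
    """Return 'Other' if any citation or clue pattern matches, else 'Current'."""
    for pattern in CITATION_PATTERNS + CLUE_PATTERNS:
        if pattern.search(sentence):
            return "Other"
    return "Current"
```

Defaulting to ‘Current’ mirrors the observation that Current is the overwhelming majority class, so only positive evidence of background knowledge changes the value.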
Knowledge type
For Knowledge Type, we used an implementation of the random forest algorithm [ 36 ] from the WEKA library [ 37 ]. We used the standard parameters of the random forest in the WEKA implementation. We used ten-fold cross validation for all experiments, and results are reported as the macro-average across the ten folds. We treat the identification of Knowledge Type as a multi-class classification problem and we took a supervised approach to categorising relations and events in the two corpora according to the values of the Knowledge Type dimension. To facilitate this, we used the following seven types of features to generate information about each event from GENIA-MK and relation from EU-ADR:
- Sentence features describing the sentence containing the relation or event.
- Structural features, inspired by the structural differences of events.
- Participant features, representing the participants in the relation or event.
- Lexical features, capturing the presence of clue words.
- Constituency features, corresponding to relationships between a clue and the relation or event, based on the output of a parser.
- Dependency features, which capture relationships between a clue and the relation or event based on the dependency parse tree.
- Parse tree features, which pertain to the structure of the dependency parse tree.
These features are further described in Table 4. To generate these features, we made use of the GENIA Tagger [ 38 ] to obtain part-of-speech (POS) tags, and the Enju parser [ 39 ] to compute syntactic parse trees.
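The evaluation protocol mentioned above, ten-fold cross-validation with scores averaged across folds, can be sketched in a few lines. This is a simplified, standard-library illustration rather than the WEKA procedure used in the experiments: it assigns items to folds round-robin and does not shuffle or stratify, which is an assumption of this sketch. The function names are also hypothetical.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k-fold cross-validation.

    Items are dealt to folds round-robin, so fold sizes differ by at
    most one and every item is held out exactly once.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_items) if i not in held_out]
        yield train, test


def macro_average(fold_scores):
    """Unweighted mean of the per-fold scores."""
    return sum(fold_scores) / len(fold_scores)
```

A classifier would be trained on each `train` split and scored on the matching `test` split, with the ten scores passed to `macro_average`.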
Research hypotheses and new knowledge
We followed a similar approach to predicting Research Hypothesis and New Knowledge values to that described above for the recognition of Knowledge Type. We used the same features and also a random forest classifier. We incorporated additional features encoding the Knowledge Source, Knowledge Type and Uncertainty of each relation and event.
Clue lists, developed by the authors, were used for the detection of Knowledge Type, Knowledge Source and Uncertainty. For the detection of New Knowledge and Hypothesis, a combination of clues for Knowledge Type, Knowledge Source and Uncertainty was used. The exact clue lists are available in the Additional files.
In this section, we present our experiments to detect the core Knowledge Type dimension, in which we determine the most appropriate feature subset to use, and also compare our approach to previous work. We then extend this approach to recognise New Knowledge and Research Hypothesis, and to evaluate our results in terms of precision, recall and F1-score.
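As a reminder of how these three measures relate, the sketch below computes them for one target class from gold and predicted labels, using the standard definitions: precision is the fraction of predicted positives that are correct, recall is the fraction of gold positives that are found, and F1-score is their harmonic mean. The function name and toy labels are assumptions for illustration.

```python
def precision_recall_f1(gold, predicted, positive):
    """Return (precision, recall, F1) for the class `positive`."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    # Define each measure as 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example with hypothetical labels for four events.
gold = ["Hypothesis", "Hypothesis", "Hypothesis", "Other"]
predicted = ["Hypothesis", "Hypothesis", "Other", "Other"]
precision, recall, f1 = precision_recall_f1(gold, predicted, "Hypothesis")
```

Here two of the three gold hypotheses are found and no false positives are produced, so precision is 1.0, recall is 2/3 and F1-score is 0.8.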
Our experiments to predict the correct values for the Knowledge Type dimension were carried out only using the events in the GENIA-MK corpus, given that Knowledge Type is only annotated in this corpus and not in EU-ADR. We performed an analysis of each feature subset to assess its impact on classifier performance, as shown in Table 5 . It was established that removing each of the participant, dependency and parse tree features individually leads to a small increase in F1-score. However, in subsequent experiments, we found that removing all three features does not lead to an additional increase in performance. We therefore used all feature subsets except for the participant features in subsequent experiments, as this gave us the best overall score. By observing the isolated performance of each feature subset, we also determined that the lexical and structural features are both significant individual contributors to the final classification score. In Table 6 , we compare the performance of our classifier in predicting each Knowledge Type value with the results obtained by the state-of-the-art method developed by Miwa et al. [ 31 ]. The results reveal that our approach achieves an increase in F1-score over Miwa et al. [ 31 ] by a minimum of 0.063 for the Other value, and a maximum of 0.113 for Method. We also see corresponding performance boosts in terms of precision and recall. Although we observe a small drop in recall for Fact and Method, this is offset by an increase in precision of 0.210 and 0.299, respectively.
To further investigate our improvement over Miwa et al., we swapped our classifier for an SVM, but used all the same features. The results of this are shown in Table 6 . This experiment allowed us to compare the performance of our features with the same classification algorithm (SVM), as used by Miwa et al. We note that using the SVM with our features leads to a similar, but slightly worse performance in terms of F1 score than Miwa et al. on all categories except for Analysis. However we do note an increase in Precision for certain categories (Method, Investigation, Analysis) and Recall for others (Observation, Analysis). As our features are tuned for performance with a Random Forest, this experiment demonstrates that different types of classifiers may require different feature sets to achieve optimal performance.
To further understand the impact of our feature categories, we analysed the correlation of each feature with each Knowledge Type value. This allowed us to determine the most informative features for each Knowledge Type value, as displayed in Table 7. In addition to this, we calculated the average rank of each feature across all Knowledge Type values. This measure shows us the most globally useful features. The top features according to average rank are displayed in Table 8.
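The average-rank measure can be sketched as follows. This is a hedged illustration rather than the procedure used in the paper: it assumes that features are ranked within each Knowledge Type value by the absolute value of their correlation (rank 1 being the strongest), and the feature names and correlation values are hypothetical.

```python
def average_ranks(correlations):
    """Map {class_label: {feature: correlation}} to {feature: mean rank}."""
    ranks = {}
    for per_feature in correlations.values():
        # Assumption: rank by absolute correlation, strongest first.
        ordered = sorted(per_feature, key=lambda f: abs(per_feature[f]), reverse=True)
        for rank, feature in enumerate(ordered, start=1):
            ranks.setdefault(feature, []).append(rank)
    return {feature: sum(r) / len(r) for feature, r in ranks.items()}


# Hypothetical correlations, for illustration only.
toy = {
    "Fact": {"clue_word": 0.30, "trigger_pos": 0.10, "n_participants": -0.20},
    "Method": {"clue_word": 0.20, "trigger_pos": 0.25, "n_participants": 0.05},
}

avg = average_ranks(toy)
best_first = sorted(avg, key=avg.get)
print(avg)
print(best_first)
```

A lower average rank marks a feature that is informative across many classes, even if it is never the single best feature for any one of them.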
For the identification of New Knowledge and Research Hypothesis, we firstly performed 10-fold cross validation on each corpus (GENIA-MK and EU-ADR) and for each dimension of interest, yielding the results in Table 9 . In our presentation of results, we term the negative class for New Knowledge as “Other Knowledge”, as it covers a number of categories that we wish to exclude (e.g., background knowledge, irrelevant knowledge, supporting knowledge, etc.). We were able to classify Knowledge Type for relations in the EU-ADR corpus by setting the event and participant features to sensible static values — e.g., the number of participants in a relation is always 2.
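For readers unfamiliar with 10-fold cross validation, the splitting step can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' setup: the helper `k_fold_indices` is hypothetical, the split is a plain shuffled one (no stratification by class), and the item count of 622 is simply the number of EU-ADR relations reported in this paper.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) once for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(622, k=10))
for train, test in splits:
    # A classifier would be trained on `train` and scored on `test` here.
    print(len(train), len(test))
```

Each instance is used for testing exactly once and for training in the other nine folds, so the reported scores are computed over every annotated instance.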
In Table 5 , we observed the effects of each feature subset on the overall classification score for Knowledge Type. We found that the structural, lexical and sentence features had particularly strong contributions. The structural features encoded information about the structure of the event and were particularly useful for identifying events that participate in other events. The lexical features depended on the identification of clue words that appeared in the context of relations and events, which provided important evidence to determine the most appropriate MK values to assign. However, the usefulness of this feature is directly tied to the comprehensiveness of the list of clues associated with each MK value.
In addition to the feature analysis in Table 5, we also provided additional analysis of each specific feature in Tables 7 and 8. In line with the results from Table 5, these tables demonstrate that the structural features were particularly informative for most classes, as well as the lexical, dependency and constituency features. It is interesting to note from Table 7 that no individual feature is particularly strongly correlated with each class label. This supports our ensemble approach and indicates that multiple feature sources are needed to attain a high classification accuracy. In addition, we can see that the correlations drop fairly quickly for all classes, indicating that not all features are used for every class. Finally, we can see that different features occur in each column (with some repetition), indicating that certain features were more useful for specific classes.
For the classification of New Knowledge and Hypothesis, we incorporated features denoting the existing meta-knowledge values of the event for Knowledge Source, Knowledge Type and Uncertainty. Knowledge Source indicates whether an event is current to the research in question, or whether it describes background work. This may be especially helpful for the detection of new knowledge, since it is clear that any background work cannot be classified as new knowledge. Knowledge Type classifies events as falling into one of six categories, i.e., Fact, Method, Analysis, Investigation, Observation or Other. The Investigation category may have contributed to the classification of Hypothetical events, whereas Observation and Analysis may have helped to contribute to the detection of New Knowledge events. The Fact, Method and Other categories could have helped the system to determine that events did not convey either hyperdimension. Finally, Uncertainty describes whether an author presented their results with confidence in their accuracy, or with some hedging (e.g., use of the words may, possibly, perhaps , etc.). This dimension could have helped to contribute to the classification of hypotheses (where an author states that an event may occur) and new knowledge, where we expect an author to be certain about their results.
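A hedging-clue check of the kind described for the Uncertainty dimension can be sketched as follows. This is only an illustration: the clue set contains just the three example words quoted above (the full lists are in the Additional files), the tokeniser and the helper `clue_in_window` are assumptions made for this sketch, and the example sentence is one quoted later in the error analysis.

```python
import re

# Only the three hedging words quoted in the text; not the authors' full list.
UNCERTAINTY_CLUES = {"may", "possibly", "perhaps"}


def tokenize(sentence):
    """Lower-case the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9-]+", sentence.lower())


def clue_in_window(tokens, trigger_index, clues, window):
    """True if any clue occurs within `window` tokens of the event trigger."""
    start = max(0, trigger_index - window)
    end = min(len(tokens), trigger_index + window + 1)
    return any(token in clues for token in tokens[start:end])


sentence = (
    "Down-regulation of MCP-1 expression by aspirin may result in "
    "decreased recruitment of monocytes"
)
tokens = tokenize(sentence)
trigger = tokens.index("result")

uncertain = clue_in_window(tokens, trigger, UNCERTAINTY_CLUES, window=3)
print(uncertain)
```

The resulting Boolean could be passed to a classifier as one feature alongside the Knowledge Source and Knowledge Type values. The `window` parameter controls how much context around the trigger is searched.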
We compared our results to those of Miwa et al. (2012) in Table 6 , where we showed a consistent improvement of precision, recall and F1-score across all categories. Their system used support vector machines (SVMs) for classification, with a set of features similar to our lexical and structural features. However, our work used an enhanced set of features as well as a random forest classifier, which is typically robust in high dimensional classification problems [ 36 ]. These two factors contributed to our system’s improved performance. Our system yielded an average increase in precision of 0.156, but only yielded an average increase in recall of 0.04. This implies that the use of a random forest and additional features mainly helped to ensure that the system returned results which are consistently correct. For both the ‘Fact’ and ‘Method’ Knowledge Type values, our system yielded a slight dip in recall compared to previous work. However, this was coupled with an increase in precision of 0.210 and 0.298, respectively.
To understand the relative contributions made by our switches in both feature set and type of classifier, compared to previous work, we analysed the performance of our system when using an SVM with our features instead of a Random Forest. We attained a similar performance to Miwa et al. using our feature set and SVM, although some values were lower than those reported by Miwa et al. This implies that our decision to use a different type of classifier to Miwa et al. (i.e., Random Forest instead of SVM) was the main reason behind our improved performance. Different feature sets are better suited to different types of classifiers, and our feature set was carefully selected (as documented in Table 5 ) to be performant with a Random Forest. Miwa et al.’s features were equally selected to perform well with an SVM. We have shown similar results in prior work for a task on detecting metaknowledge for negated bio-events [ 29 ], where we showed that tree-based methods, including the Random Forest, outperformed other techniques such as the SVM for detecting the negation dimension of metaknowledge.
We illustrated our results for the identification of the novel dimensions New Knowledge and Research Hypothesis in Table 9 . These showed strong performance across both corpora and association types (events and relations). The results for the GENIA-MK corpus (events) outperformed those for the EU-ADR corpus (relations). This was most likely due to the difference in size between the corpora. There are over ten times more annotated events in the subset of GENIA-MK that we annotated than relations in the subset of EU-ADR (6899 events vs. 622 relations). The fact that we annotated all of the 159 abstracts available in the EU-ADR corpus and only 150 abstracts from GENIA-MK indicates that event structures are more densely packed in GENIA-MK than relations in EU-ADR.
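The density comparison can be checked with a short calculation. The counts are those reported above; the variable names are chosen for this sketch only.

```python
# Counts reported in the text.
events, genia_abstracts = 6899, 150
relations, euadr_abstracts = 622, 159

events_per_abstract = events / genia_abstracts
relations_per_abstract = relations / euadr_abstracts
size_ratio = events / relations

print(round(events_per_abstract, 1))     # events per GENIA-MK abstract
print(round(relations_per_abstract, 1))  # relations per EU-ADR abstract
print(round(size_ratio, 1))              # events per relation overall
```

This gives roughly 46 events per GENIA-MK abstract against roughly 4 relations per EU-ADR abstract, and about 11 times as many events as relations overall.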
In particular, the EU-ADR corpus yielded a poor recall value for Research Hypotheses. There were only 38 examples of relations annotated as Research Hypothesis in the EU-ADR corpus. Our annotators reported that several relations occurring in hypothetical contexts appeared to have been missed by the original annotators of the EU-ADR corpus, which may be the cause of this sparsity. However, adding additional relations to the corpus was beyond the scope of the current work. The precision for the prediction of Research Hypothesis in the EU-ADR corpus was 1.00, indicating that of those relations automatically classified as Research Hypothesis, all were indeed Research Hypotheses (i.e., there were no false positives). It is usually the case in minority class situations that a classifier will tend towards classifying instances as the majority class (i.e., favouring false negatives over false positives), so this result is expected. We chose not to perform subsampling of the majority class, as the density of Research Hypotheses or New Knowledge in our training data is reflective of the density we would expect in other biomedical abstracts.
Our corpus has focussed on identifying Research Hypotheses and New Knowledge in biomedical abstracts. However, it has been shown elsewhere that full texts contain more information than abstracts alone [ 40 ]. Whilst our future goal is to additionally facilitate the recognition of New Knowledge and Research Hypothesis in full papers, our decision to focus initially on abstracts was motivated by the findings of our earlier rounds of annotation. These initial annotation efforts revealed that the density of the types of MK that form the focus of the current paper are very low in full papers and are consequently difficult for annotators to reliably identify. Therefore we chose to use abstracts, where the density was higher, since the availability of as many examples as possible of relevant MK was important for the development of our methods. We noted that abstracts fairly consistently mention the main Research Hypotheses and New Knowledge outcomes from a paper. However, further information may be available in the full paper that has not been mentioned in the abstract. To access this information we will need to further adapt our techniques and develop annotated corpora of full papers — this is left for future work.
Error analysis
Finally, we present an analysis of some common errors that our system makes and strategies for overcoming these in future work. In the following sentence, the event centred on “regulation” was marked as Non-Hypothetical by the annotators, but our system recognised it as a Hypothetical event.
To continue our investigation of the cellular events that occur following human CMV (HCMV) infection, we focused on the regulation of cellular activation following viral binding to human monocytes.
It is likely that this event was marked as a hypothesis by the system because of the words ‘investigation’ and ‘focused’ that occur before it. However in this case, the main hypothesis that the annotators have marked is on the event centred on ‘occur’ preceding the event centred around ‘focused’. To overcome this in future work, we could implement a classification strategy that takes into account MK information that has already been assigned to other events that occur in the context of the focussed event. A conditional random field or deep learning model could be used for sequence labelling to accomplish this.
The second error, which concerns the event centred on “effects” in the following sentence, was marked as Hypothetical by our annotators, but was classified as Non-Hypothetical by our system.
MATERIAL AND METHODS: In the present study, we analyzed the effects of CyA, aspirin, and indomethacin …
This event is clearly stating the subject of the authors’ investigation, and so should be marked as hypothesis. It is likely that our system was confused by the preceding section heading, which led it to believe that this was part of the background or methods, and not a statement of the authors’ intended research goals. To overcome this, we could identify these section headings automatically and either exclude them from the text to be analysed, or use them as extra features in our classification scheme.
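One simple way to identify such section headings is sketched below. It rests on an assumption that is not stated in the paper: that headings in structured abstracts are written as upper-case words followed by a colon. The pattern `HEADING` and the helper `split_heading` are hypothetical names used for illustration.

```python
import re

# Assumption: a heading is a run of upper-case words ending in a colon.
HEADING = re.compile(r"^\s*([A-Z][A-Z ]+[A-Z]):\s*")


def split_heading(sentence):
    """Return (heading or None, remaining sentence text)."""
    match = HEADING.match(sentence)
    if match:
        return match.group(1), sentence[match.end():]
    return None, sentence


heading, body = split_heading(
    "MATERIAL AND METHODS: In the present study, we analyzed the effects "
    "of CyA, aspirin, and indomethacin"
)
print(heading)
print(body)
```

The extracted heading could then either be dropped before analysis or passed to the classifier as an extra feature, which are the two options described above.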
In our third example error, the event in the sentence below is centred on the phrase “result in decreased”. The event was marked as new knowledge by the annotators, but the system was not able to recognise it as such.
Down-regulation of MCP-1 expression by aspirin may result in decreased recruitment of monocytes into the arterial intima beneath stressed EC.
We believe that the cause of this classification error is the unusual event trigger: the majority of events have only a single verb as their trigger. To help the system to better determine cases in which such events denote new knowledge, it would be necessary to further increase our corpus size, such that the training set includes a wider variety of trigger types. A further factor affecting the inability of the system to determine the new knowledge classification may have been the lack of an appropriate new knowledge clue. In this case, the annotators most likely determined this as an example of new knowledge due to information from the wider context of the discourse. We could improve our classifier by looking for clues in a wider window, or by looking for discourse clues that might indicate that the author is drawing their conclusions.
The final example below concerns an event (centred on the verb “enhanced”), which was marked as ‘other knowledge’ by the annotators, but which the system determined to be an example of new knowledge.
Taken together, these data indicate that the unexpected expression of megakaryocytic genes is a specific property of immortalized cells that cannot be explained only by enhanced expression of Spi-1 and/or Fli-1 genes
In this example, the event is somewhat problematic as regards the assignment of MK. Although it is clear both that the sentence is a concluding statement, and that there is some new knowledge contained within it, the annotators chose not to mark the event with the trigger “enhanced” as new knowledge, indicating that they did not consider it to convey the main aspect of new knowledge in this sentence. Interestingly, however, both annotators agreed with the system that the event centred on the first instance of “expression” should be marked as an instance of new knowledge. The presence of the clue ‘indicate’ may be affecting the system’s classification decision in both cases. A human annotator can distinguish that indicate is most relevant to ‘expression’, rather than ‘enhanced’, whereas our system was unable to make this distinction.
Conclusions
We have presented a novel application of text mining techniques for the discovery of Research Hypotheses and New Knowledge at the level of events and relations. This constitutes the first study into the application of supervised methods to assign these interpretative aspects at such a fine-grained level. We firstly showed that by applying a Random Forest classifier using a new feature set, we were able to achieve a better performance than previous efforts in detecting Knowledge Type. We subsequently showed that the core MK dimensions of Knowledge Type, Knowledge Source and Uncertainty could feed into the training of classifiers that can predict whether events and relations represent Research Hypotheses and New Knowledge, with a high degree of accuracy. Our techniques can be incorporated into a system that allows researchers to quickly filter information contained within the abstracts of research articles, as shown in previous literature [ 3 ]. Our methods generally favour precision on the positive class (i.e., Research Hypothesis or New Knowledge). Specifically, we attain a precision of between 0.863 and 1.00 on all of the corpus experiments. This demonstrates that our approach is successful in avoiding the identification of false positives, thus allowing researchers to be confident that instances of Research Hypothesis or New Knowledge identified by our method will usually be correct.
Footnote 1 (precision): the proportion of results returned by the system which are correct.
Footnote 2 (recall): the proportion of correct results returned by the system as a fraction of all the correct results that should have been found.
Footnote 3 (F1-score): the balanced harmonic mean between precision and recall, providing a single overall measure of performance.
Abbreviations
ADR: Adverse Drug Reaction
F1: F1 Score (the harmonic mean between Precision and Recall)
IE: Information Extraction
IAA: Inter-Annotator Agreement
MK: Meta-Knowledge
SVM: Support Vector Machine
TM: Text Mining
Jiawen L, Dongsheng L, Zhijian T. The expression of interleukin-17, interferon-gamma, and macrophage inflammatory protein-3 alpha mRNA in patients with psoriasis vulgaris. J Huazhong University Sci Technol [Med Sci]. 2004; 24(3):294–6. https://doi.org/10.1007/BF02832018 .
Scharffetter-Kochanek K, Singh K, Tasdogan A, Wlaschek M, Gatzka M, Hainzl A, Peters T. Reduction of CD18 promotes expansion of inflammatory gd T cells collaborating with CD4 T cells in chronic murine psoriasiform dermatitis. J Immunol. 2013; 191:5477–88. https://doi.org/10.4049/jimmunol.1300976 .
Zerva C, Batista-Navarro R, Day P, Ananiadou S. Using uncertainty to link and rank evidence from biomedical literature for model curation. Bioinformatics. btx466. https://doi.org/10.1093/bioinformatics/btx466 .
Stenetorp P, Pyysalo S, Topić G, Ohta T, Ananiadou S, Tsujii J. BRAT: a web-based tool for NLP-assisted text annotation. In: Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics: 2012. p. 102–107.
Agarwal S, Yu H, Kohane I. BioNØT: A searchable database of biomedical negated sentences. BMC Bioinformatics. 2011; 12(1):420. https://doi.org/10.1186/1471-2105-12-420 .
Medlock B, Briscoe T. Weakly supervised learning for hedge classification in scientific literature. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Prague, Czech Republic: Association for Computational Linguistics: 2007. p. 992–9. http://www.aclweb.org/anthology/P07-1125 .
Vincze V, Szarvas G, Farkas R, Móra G, Csirik J. The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes. BMC Bioinformatics. 2008; 9(11):1–9.
Malhotra A, Younesi E, Gurulingappa H, Hofmann-Apitius M. ‘HypothesisFinder:’ a strategy for the detection of speculative statements in scientific text. PLOS Comput Biol. 2013; 9(7):1–10. https://doi.org/10.1371/journal.pcbi.1003117 .
Ruch P, Boyer C, Chichester C, Tbahriti I, Geissbühler A, Fabry P, Gobeill J, Pillet V, Rebholz-Schuhmann D, Lovis C, et al. Using argumentation to extract key sentences from biomedical abstracts. Int J Med Inform. 2007; 76(2):195–200.
Teufel S, Carletta J, Moens M. An annotation scheme for discourse-level argumentation in research articles. In: Proceedings of the Ninth Conference on European Chapter of the Association for Computational Linguistics. EACL ’99. Stroudsburg: Association for Computational Linguistics: 1999. p. 110–7. https://doi.org/10.3115/977035.977051 .
Mizuta Y, Collier N. Zone identification in biology articles as a basis for information extraction. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications. JNLPBA ’04. Stroudsburg: Association for Computational Linguistics: 2004. p. 29–35. http://dl.acm.org/citation.cfm?id=1567594.1567600 .
Burns G, Dasigi P, de Waard A, Hovy EH. Automated detection of discourse segment and experimental types from the text of cancer pathway results sections. Database. 2016; 2016:122. https://doi.org/10.1093/database/baw122 .
Liakata M, Saha S, Dobnik S, Batchelor C, Rebholz-Schuhmann D. Automatic recognition of conceptualization zones in scientific articles and two life science applications. Bioinformatics. 2012; 28(7):991. https://doi.org/10.1093/bioinformatics/bts071 .
Simsek D, Buckingham Shum S, Sandor A, De Liddo A, Ferguson R. Xip dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse. In: 1st International Workshop on Discourse-Centric Learning Analytics. Leuven: 2013.
Bundschus M, Dejori M, Stetter M, Tresp V, Kriegel HP. Extraction of semantic biomedical relations from text using conditional random fields. BMC Bioinformatics. 2008; 9(1):207.
Bravo A, Piñero J, Queralt-Rosinach N, Rautschka M, Furlong LI. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinformatics. 2015; 16(1):55.
Verspoor KM, Heo EG, Kang KY, Song M. Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts. BMC Med Inf Decis Mak. 2016; 16(1):68.
Nedellec C. Learning language in logic-genic interaction extraction challenge. In: Proceedings of the ICML-2005 Workshop on Learning Language in Logic (LLL05): 2005. p. 31–7.
Kim JD, Pyysalo S, Ohta T, Bossy R, Nguyen N, Tsujii J. Overview of BioNLP shared task 2011. In: Proceedings of the BioNLP Shared Task 2011 Workshop. Portland: Association for Computational Linguistics: 2011. p. 1–6.
Pyysalo S, Ginter F, Heimonen J, Björne J, Boberg J, Järvinen J, Salakoski T. BioInfer: a corpus for information extraction in the biomedical domain. BMC Bioinformatics. 2007; 8(1):50.
Sanchez-Graillet O, Poesio M. Negation of protein—protein interactions: analysis and extraction. Bioinformatics. 2007; 23(13):424. https://doi.org/10.1093/bioinformatics/btm184 .
Kim JD, Ohta T, Tsujii J. Corpus annotation for mining biomedical events from literature. BMC Bioinformatics. 2008; 9(1):1–25.
Van Mulligen EM, Fourrier-Reglat A, Gurwitz D, Molokhia M, Nieto A, Trifiro G, Kors JA, Furlong LI. The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships. J Biomed Inform. 2012; 45(5):879–84.
Björne J, Ginter F, Salakoski T. University of Turku in the BioNLP’11 shared task. BMC Bioinformatics. 2012; 13(11):4.
Kilicoglu H, Bergler S. Biological event composition. BMC Bioinformatics. 2012; 13(11):7.
Thompson P, Nawaz R, McNaught J, Ananiadou S. Enriching news events with meta-knowledge information. Lang Resour Eval. 2016:1–30. https://doi.org/10.1007/s10579-016-9344-9 .
Walker C, Strassel S, Medero J, Maeda K. ACE 2005 multilingual training corpus. Philadelphia: Linguistic Data Consortium; 2006.
Thompson P, Nawaz R, McNaught J, Ananiadou S. Enriching a biomedical event corpus with meta-knowledge annotation. BMC Bioinformatics. 2011; 12(1):1–18.
Nawaz R, Thompson P, Ananiadou S. Negated BioEvents: Analysis and identification. BMC Bioinformatics. 2013; 14(1):14. https://doi.org/10.1186/1471-2105-14-14 .
Nawaz R, Thompson P, Ananiadou S. Something old, something new: identifying knowledge source in bio-events. Int J Comput Linguist Appl. 2013; 4(1):129–44.
Miwa M, Thompson P, McNaught J, Kell DB, Ananiadou S. Extracting semantically enriched events from biomedical literature. BMC Bioinformatics. 2012; 13:108. https://doi.org/10.1186/1471-2105-13-108 .
Nawaz R, Thompson P, Ananiadou S. Meta-knowledge annotation at the event level: Comparison between abstracts and full papers. In: Proceedings of the Third Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM 2012): 2012. p. 24–31.
Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960; 20(1):37–46. https://doi.org/10.1177/001316446002000104 .
McHugh ML. Interrater reliability: the kappa statistic. Biochemia medica. 2012; 22(3):276–82.
Miwa M, Sætre R, Kim JD, Tsujii J. Event extraction with complex event classification using rich features. J Bioinforma Comput Biol. 2010; 8(01):131–46.
Breiman L. Random forests. Machine Learning. 2001; 45(1):5–32.
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: An update. SIGKDD Explor Newsl. 2009; 11(1):10–18. https://doi.org/10.1145/1656274.1656278 .
Tsuruoka Y, Tateishi Y, Kim JD, Ohta T, McNaught J, Ananiadou S, Tsujii J. Developing a robust part-of-speech tagger for biomedical text. Berlin, Heidelberg: Springer; 2005, pp. 382–92. Advances in Informatics: 10th Panhellenic Conference on Informatics, PCI 2005, Volas, Greece, November 11-13, 2005.
Miyao Y, Tsujii J. Feature forest models for probabilistic HPSG parsing. Comput Linguist. 2008; 34(1):35–80. https://doi.org/10.1162/coli.2008.34.1.35 .
Schuemie MJ, Weeber M, Schijvenaars BJA, van Mulligen EM, van der Eijk CC, Jelier R, Mons B, Kors JA. Distribution of information in biomedical abstracts and full-text publications. Bioinformatics. 2004; 20(16):2597–604. https://doi.org/10.1093/bioinformatics/bth291 .
Acknowledgements
The authors wish to thank the annotators involved in creating the dataset for this paper, without whom this research would not have been possible. Our thanks also go to the reviewers for their considered feedback on our research.
The authors of this work were funded by the European Commission (an Open Mining Infrastructure for Text and Data. OpenMinTeD. Grant: 654021), the Medical Research Council (Manchester Molecular Pathology Innovation Centre. MMPathIC Grant: MR/N00583X/1) and the Biotechnology and Biological Sciences Research Council (Enriching Metabolic PATHwaY models with evidence from the literature. EMPATHY. Grant: BB/M006891/1). The funders played no part in either the design of the study or the collection, analysis, and interpretation of data, or in writing the manuscript.
Availability of data and materials
The datasets generated and analysed during the current study are available as Additional files to this paper.
Author information
Authors and affiliations.
National Centre for Text Mining, University of Manchester, Manchester, UK
Matthew Shardlow, Riza Batista-Navarro, Paul Thompson, Raheel Nawaz, John McNaught & Sophia Ananiadou
Contributions
MS ran the principal experiments, performed the analysis of the results and participated in authoring the paper. RB helped with the design of the experiments and authoring the paper. PT contributed work on the preparation of the EU-ADR corpus as well as participating in the authorship of the paper. RN contributed to the experimental design, guidelines for the annotators and participated in the authorship of the paper. JM and SA jointly supervised the research and participated in authoring the paper. All authors read and approved the final version of this manuscript prior to publication.
Corresponding author
Correspondence to Sophia Ananiadou .
Ethics declarations
Ethics approval and consent to participate.
No ethics approval was required for any element of this study.
Consent for publication
Not Applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional files
Additional file 1
The annotation guidelines that were given to annotators for reference. (PDF 830 kb)
Additional file 2
A table providing an in depth description of each feature. (PDF 32 kb)
Additional file 3
Read me documentation explaining the structure of the clue files. (TXT 4 kb)
Additional file 4
The clues used to detect the Analysis component of the Knowledge Type meta-knowledge dimension. (FILE 3 kb)
Additional file 5
The clues used to detect the Fact component of the Knowledge Type meta-knowledge dimension. (FILE 4 kb)
Additional file 6
The clues used to detect the Investigation component of the Knowledge Type meta-knowledge dimension. (FILE 2 kb)
Additional file 7
The clues used to detect the Method component of the Knowledge Type meta-knowledge dimension. (FILE 4 kb)
Additional file 8
The clues used to detect the Observation component of the Knowledge Type meta-knowledge dimension. (FILE 4 kb)
Additional file 9
The clues used to detect the Other component of the Knowledge Source meta-knowledge dimension. (FILE 1 kb)
Additional file 10
The clues used to detect the Uncertain component of the Certainty Level meta-knowledge dimension. (FILE 4 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article.
Shardlow, M., Batista-Navarro, R., Thompson, P. et al. Identification of research hypotheses and new knowledge from scientific literature. BMC Med Inform Decis Mak 18 , 46 (2018). https://doi.org/10.1186/s12911-018-0639-1
Received : 10 August 2017
Accepted : 11 June 2018
Published : 25 June 2018
DOI : https://doi.org/10.1186/s12911-018-0639-1
- Text mining
- Meta-knowledge
- New knowledge
BMC Medical Informatics and Decision Making
ISSN: 1472-6947

What Is A Research (Scientific) Hypothesis? A plain-language explainer + examples
By: Derek Jansen (MBA) | Reviewed By: Dr Eunice Rautenbach | June 2020
If you’re new to the world of research, or it’s your first time writing a dissertation or thesis, you’re probably noticing that the words “research hypothesis” and “scientific hypothesis” are used quite a bit, and you’re wondering what they mean in a research context.
“Hypothesis” is one of those words that people use loosely, thinking they understand what it means. However, it has a very specific meaning within academic research. So, it’s important to understand the exact meaning before you start hypothesizing.
Research Hypothesis 101
- What is a hypothesis?
- What is a research hypothesis (scientific hypothesis)?
- Requirements for a research hypothesis
- Definition of a research hypothesis
- The null hypothesis
What is a hypothesis?
Let’s start with the general definition of a hypothesis (not a research hypothesis or scientific hypothesis), according to the Cambridge Dictionary:
Hypothesis: an idea or explanation for something that is based on known facts but has not yet been proved.
In other words, it’s a statement that provides an explanation for why or how something works, based on facts (or some reasonable assumptions), but that has not yet been specifically tested. For example, a hypothesis might look something like this:
Hypothesis: sleep impacts academic performance.
This statement predicts that academic performance will be influenced by the amount and/or quality of sleep a student engages in – sounds reasonable, right? It’s based on reasonable assumptions, underpinned by what we currently know about sleep and health (from the existing literature). So, loosely speaking, we could call it a hypothesis, at least by the dictionary definition.
But that’s not good enough…
Unfortunately, that’s not quite sophisticated enough to describe a research hypothesis (also sometimes called a scientific hypothesis), and it wouldn’t be acceptable in a dissertation, thesis or research paper. In the world of academic research, a statement needs a few more criteria to constitute a true research hypothesis.
What is a research hypothesis?
A research hypothesis (also called a scientific hypothesis) is a statement about the expected outcome of a study (for example, a dissertation or thesis). To constitute a quality hypothesis, the statement needs to have three attributes – specificity, clarity and testability.
Let’s take a look at these more closely.
Hypothesis Essential #1: Specificity & Clarity
A good research hypothesis needs to be extremely clear and articulate about both what’s being assessed (who or what variables are involved) and the expected outcome (for example, a difference between groups, a relationship between variables, etc.).
Let’s stick with our sleepy students example and look at how this statement could be more specific and clear.
Hypothesis: Students who sleep at least 8 hours per night will, on average, achieve higher grades in standardised tests than students who sleep less than 8 hours a night.
As you can see, the statement is very specific as it identifies the variables involved (sleep hours and test grades), the parties involved (two groups of students), as well as the predicted relationship type (a positive relationship). There’s no ambiguity or uncertainty about who or what is involved in the statement, and the expected outcome is clear.
Contrast that with the original hypothesis we looked at – “Sleep impacts academic performance” – and you can see the difference. “Sleep” and “academic performance” are both comparatively vague, and there’s no indication of what the expected relationship direction is (more sleep or less sleep). As you can see, specificity and clarity are key.

Hypothesis Essential #2: Testability (Provability)
A statement must be testable to qualify as a research hypothesis. In other words, there needs to be a way to prove (or disprove) the statement. If it’s not testable, it’s not a hypothesis – simple as that.
For example, consider the hypothesis we mentioned earlier:
Hypothesis: Students who sleep at least 8 hours per night will, on average, achieve higher grades in standardised tests than students who sleep less than 8 hours a night.
We could test this statement by undertaking a quantitative study involving two groups of students, one that gets 8 or more hours of sleep per night for a fixed period, and one that gets less. We could then compare the standardised test results for both groups to see if there’s a statistically significant difference.
Again, if you compare this to the original hypothesis we looked at – “Sleep impacts academic performance” – you can see that it would be quite difficult to test that statement, primarily because it isn’t specific enough. How much sleep? By whom? What type of academic performance?
So, remember the mantra – if you can’t test it, it’s not a hypothesis 🙂
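As a rough illustration of the two-group comparison described above, here is a small Python sketch that runs a permutation test on the difference in mean grades. The scores are invented purely for illustration, and the permutation test is only one of several ways to check for a statistically significant difference; nothing above prescribes a particular test.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (mean of A minus mean of B) and an
    approximate p-value: the share of random relabelings whose absolute
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # The +1 terms keep the estimate from ever being exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical standardised test scores (percent); not real data.
slept_8_plus = [78, 85, 90, 72, 88, 81, 94, 79]
slept_less = [70, 65, 80, 74, 68, 77, 72, 66]

observed, p_value = permutation_test(slept_8_plus, slept_less)
print(f"Observed difference in mean grade: {observed:.3f}")
print(f"Approximate two-sided p-value: {p_value:.4f}")
```

A small p-value here would mean that a difference this large rarely shows up when the group labels are shuffled at random, which is what "statistically significant" is getting at.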

Defining A Research Hypothesis
You’re still with us? Great! Let’s recap and pin down a clear definition of a hypothesis.
A research hypothesis (or scientific hypothesis) is a statement about an expected relationship between variables, or explanation of an occurrence, that is clear, specific and testable.
So, when you write up hypotheses for your dissertation or thesis, make sure that they meet all these criteria. If you do, you’ll not only have rock-solid hypotheses but you’ll also ensure a clear focus for your entire research project.
What about the null hypothesis?
You may have also heard the terms null hypothesis, alternative hypothesis, or H-zero thrown around. At a simple level, the null hypothesis is the counter-proposal to the original hypothesis.
For example, if the hypothesis predicts that there is a relationship between two variables (for example, sleep and academic performance), the null hypothesis would predict that there is no relationship between those variables.
At a more technical level, the null hypothesis proposes that there is no real effect or relationship in the population being studied, and that any differences seen in a given set of observations are due to chance alone.
And there you have it – hypotheses in a nutshell.


How to Write a Strong Hypothesis | Guide & Examples
Published on 6 May 2022 by Shona McCombes.
A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection.
Table of contents
- What is a hypothesis
- Developing a hypothesis (with example)
- Hypothesis examples
- Frequently asked questions about writing hypotheses
What is a hypothesis?
A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.
A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).
Variables in hypotheses
Hypotheses propose a relationship between two or more variables. An independent variable is something the researcher changes or controls. A dependent variable is something the researcher observes and measures.
For example, in a hypothesis that exposure to the sun increases happiness, the independent variable is exposure to the sun (the assumed cause) and the dependent variable is the level of happiness (the assumed effect).
Developing a hypothesis (with example)
Step 1: Ask a question
Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.
Step 2: Do some preliminary research
Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.
At this stage, you might construct a conceptual framework to identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalise more complex constructs.
Step 3: Formulate your hypothesis
Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.
Step 4: Refine your hypothesis
You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:
- The relevant variables
- The specific group being studied
- The predicted outcome of the experiment or analysis

Step 5: Phrase your hypothesis in three ways
To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable.
In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.
If you are comparing two groups, the hypothesis can state what difference you expect to find between them.
Step 6: Write a null hypothesis
If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.
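For a comparison between two groups, this pair of hypotheses can be written out symbolically. The notation below is only a sketch: it assumes the outcome is summarised by a population mean in each group, and the symbols μ1 and μ2 are labels introduced here for illustration.

```latex
\begin{align*}
H_0 &: \mu_1 = \mu_2    && \text{(no difference between the two groups)} \\
H_1 &: \mu_1 \neq \mu_2 && \text{(a difference, in either direction)}
\end{align*}
```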
Frequently asked questions about writing hypotheses
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘x affects y because …’).
A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.
McCombes, S. (2022, May 06). How to Write a Strong Hypothesis | Guide & Examples. Scribbr. Retrieved 25 September 2023, from https://www.scribbr.co.uk/research-methods/hypothesis-writing/

- v.53(4); 2010 Aug

Research questions, hypotheses and objectives
Patricia Farrugia, Bradley A. Petrisor, Forough Farrokhyar, Mohit Bhandari
* Michael G. DeGroote School of Medicine, the † Division of Orthopaedic Surgery and the ‡ Departments of Surgery and § Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ont.
There is an increasing familiarity with the principles of evidence-based medicine in the surgical community. As surgeons become more aware of the hierarchy of evidence, grades of recommendations and the principles of critical appraisal, they develop an increasing familiarity with research design. Surgeons and clinicians are looking more and more to the literature and clinical trials to guide their practice; as such, it is becoming a responsibility of the clinical research community to attempt to answer questions that are not only well thought out but also clinically relevant. The development of the research question, including a supportive hypothesis and objectives, is a necessary key step in producing clinically relevant results to be used in evidence-based practice. A well-defined and specific research question is more likely to help guide us in making decisions about study design and population and subsequently what data will be collected and analyzed. 1
Objectives of this article
In this article, we discuss important considerations in the development of a research question and hypothesis and in defining objectives for research. By the end of this article, the reader will be able to appreciate the significance of constructing a good research question and developing hypotheses and research objectives for the successful design of a research study. The following article is divided into 3 sections: research question, research hypothesis and research objectives.
Research question
Interest in a particular topic usually begins the research process, but it is the familiarity with the subject that helps define an appropriate research question for a study. 1 Questions then arise out of a perceived knowledge deficit within a subject area or field of study. 2 Indeed, Haynes suggests that it is important to know “where the boundary between current knowledge and ignorance lies.” 1 The challenge in developing an appropriate research question is in determining which clinical uncertainties could or should be studied and also rationalizing the need for their investigation.
Increasing one’s knowledge about the subject of interest can be accomplished in many ways. Appropriate methods include systematically searching the literature, in-depth interviews and focus groups with patients (and proxies) and interviews with experts in the field. In addition, awareness of current trends and technological advances can assist with the development of research questions. 2 It is imperative to understand what has been studied about a topic to date in order to further the knowledge that has been previously gathered on it. Indeed, some granting institutions (e.g., Canadian Institutes of Health Research) encourage applicants to conduct a systematic review of the available evidence if a recent review does not already exist and preferably a pilot or feasibility study before applying for a grant for a full trial.
In-depth knowledge about a subject may generate a number of questions. It then becomes necessary to ask whether these questions can be answered through one study or if more than one study is needed. 1 Additional research questions can be developed, but several basic principles should be taken into consideration. 1 All questions, primary and secondary, should be developed at the beginning and planning stages of a study. Any additional questions should never compromise the primary question because it is the primary research question that forms the basis of the hypothesis and study objectives. It must be kept in mind that within the scope of one study, the presence of a number of research questions will affect and potentially increase the complexity of both the study design and subsequent statistical analyses, not to mention the actual feasibility of answering every question. 1 A sensible strategy is to establish a single primary research question around which to focus the study plan. 3 In a study, the primary research question should be clearly stated at the end of the introduction of the grant proposal, and it usually specifies the population to be studied, the intervention to be implemented and other circumstantial factors. 4
Hulley and colleagues 2 have suggested the use of the FINER criteria in the development of a good research question ( Box 1 ). The FINER criteria highlight useful points that may increase the chances of developing a successful research project. A good research question should specify the population of interest, be of interest to the scientific community and potentially to the public, have clinical relevance and further current knowledge in the field (and of course be compliant with the standards of ethical boards and national research standards).
Box 1. FINER criteria for a good research question: feasible, interesting, novel, ethical, relevant.
Adapted with permission from Wolters Kluwer Health. 2
Whereas the FINER criteria outline the important aspects of the question in general, a useful format to use in the development of a specific research question is the PICO format: consider the population (P) of interest, the intervention (I) being studied, the comparison (C) group (that is, what the intervention is being compared with) and the outcome of interest (O). 3, 5, 6 Often timing (T) is added to PICO (Box 2), that is, “Over what time frame will the study take place?” 1 The PICOT approach helps generate a question that aids in constructing the framework of the study and subsequently in protocol development by alluding to the inclusion and exclusion criteria and identifying the groups of patients to be included. Knowing the specific population of interest, intervention (and comparator) and outcome of interest may also help the researcher identify an appropriate outcome measurement tool. 7 The more defined the population of interest, and thus the more stringent the inclusion and exclusion criteria, the greater the effect on the interpretation and subsequent applicability and generalizability of the research findings. 1, 2 A restricted study population (and exclusion criteria) may limit bias and increase the internal validity of the study; however, this approach will limit external validity of the study and, thus, the generalizability of the findings to the practical clinical setting. Conversely, a broadly defined study population and inclusion criteria may be representative of everyday clinical practice but may increase bias and reduce the internal validity of the study.
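Box 2. PICOT criteria: population (patients), intervention (for intervention studies only), comparison group, outcome of interest, time. 1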
A poorly devised research question may affect the choice of study design, potentially lead to futile situations and, thus, hamper the chance of determining anything of clinical significance, which will then affect the potential for publication. Without devoting appropriate resources to developing the research question, the quality of the study and subsequent results may be compromised. During the initial stages of any research study, it is therefore imperative to formulate a research question that is both clinically relevant and answerable.
Research hypothesis
The primary research question should be driven by the hypothesis rather than the data. 1 , 2 That is, the research question and hypothesis should be developed before the start of the study. This sounds intuitive; however, if we take, for example, a database of information, it is potentially possible to perform multiple statistical comparisons of groups within the database to find a statistically significant association. This could then lead one to work backward from the data and develop the “question.” This is counterintuitive to the process because the question is asked specifically to then find the answer, thus collecting data along the way (i.e., in a prospective manner). Multiple statistical testing of associations from data previously collected could potentially lead to spuriously positive findings of association through chance alone. 2 Therefore, a good hypothesis must be based on a good research question at the start of a trial and, indeed, drive data collection for the study.
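To make the risk of multiple testing concrete, the chance of at least one spuriously significant association can be computed directly. The Python sketch below makes a simplifying assumption that the article does not: the comparisons are independent and each is tested at the same significance level. The function name and the 0.05 default are illustrative choices.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false-positive result among n_tests
    independent comparisons, each tested at significance level alpha,
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** n_tests


# How quickly the risk grows as more comparisons are run on one database.
for n_tests in (1, 5, 10, 20, 50):
    rate = familywise_error_rate(n_tests)
    print(f"{n_tests:>2} comparisons -> chance of a spurious finding: {rate:.1%}")
```

Under these assumptions, 20 unplanned comparisons already carry a better-than-even chance of producing at least one "significant" association by chance alone.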
The research or clinical hypothesis is developed from the research question and then the main elements of the study — sampling strategy, intervention (if applicable), comparison and outcome variables — are summarized in a form that establishes the basis for testing, statistical and ultimately clinical significance. 3 For example, in a research study comparing computer-assisted acetabular component insertion versus freehand acetabular component placement in patients in need of total hip arthroplasty, the experimental group would be computer-assisted insertion and the control/conventional group would be free-hand placement. The investigative team would first state a research hypothesis. This could be expressed as a single outcome (e.g., computer-assisted acetabular component placement leads to improved functional outcome) or potentially as a complex/composite outcome; that is, more than one outcome (e.g., computer-assisted acetabular component placement leads to both improved radiographic cup placement and improved functional outcome).
However, when formally testing statistical significance, the hypothesis should be stated as a “null” hypothesis. 2 The purpose of hypothesis testing is to make an inference about the population of interest on the basis of a random sample taken from that population. The null hypothesis for the preceding research hypothesis then would be that there is no difference in mean functional outcome between the computer-assisted insertion and free-hand placement techniques. After forming the null hypothesis, the researchers would form an alternate hypothesis stating the nature of the difference, if it should appear. The alternate hypothesis would be that there is a difference in mean functional outcome between these techniques. At the end of the study, the null hypothesis is then tested statistically. If the findings of the study are not statistically significant (i.e., there is no difference in functional outcome between the groups in a statistical sense), we cannot reject the null hypothesis, whereas if the findings were significant, we can reject the null hypothesis and accept the alternate hypothesis (i.e., there is a difference in mean functional outcome between the study groups), errors in testing notwithstanding. In other words, hypothesis testing confirms or refutes the statement that the observed findings did not occur by chance alone but rather occurred because there was a true difference in outcomes between these surgical procedures. The concept of statistical hypothesis testing is complex, and the details are beyond the scope of this article.
Another important concept inherent in hypothesis testing is whether the hypotheses will be 1-sided or 2-sided. A 2-sided hypothesis states that there is a difference between the experimental group and the control group, but it does not specify in advance the expected direction of the difference. For example, we asked whether there is an improvement in outcomes with computer-assisted surgery or whether the outcomes are worse with computer-assisted surgery. We presented a 2-sided test in the above example because we did not specify the direction of the difference. A 1-sided hypothesis states a specific direction (e.g., there is an improvement in outcomes with computer-assisted surgery). A 2-sided hypothesis should be used unless there is a good justification for using a 1-sided hypothesis. As Bland and Altman 8 stated, “One-sided hypothesis testing should never be used as a device to make a conventionally nonsignificant difference significant.”
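The practical difference between the two kinds of test can be seen by computing both p-values for the same test statistic. The Python sketch below assumes a z statistic that follows a standard normal distribution under the null hypothesis; the value 1.80 is hypothetical and chosen only to show how the two conclusions can diverge.

```python
from statistics import NormalDist


def z_test_p_values(z):
    """Return (one_sided, two_sided) p-values for a z statistic.

    one_sided: probability of a value at least this large in the
               predicted (upper) direction under the null hypothesis.
    two_sided: probability of a value at least this far from zero
               in either direction under the null hypothesis.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    one_sided = 1 - standard_normal.cdf(z)
    two_sided = 2 * (1 - standard_normal.cdf(abs(z)))
    return one_sided, two_sided


# Hypothetical test statistic, for illustration only.
one_sided, two_sided = z_test_p_values(1.80)
print(f"1-sided p-value: {one_sided:.4f}")
print(f"2-sided p-value: {two_sided:.4f}")
```

With this statistic, the 1-sided p-value falls below the conventional 0.05 threshold while the 2-sided p-value does not, which is exactly the situation the quotation above warns against exploiting.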
The research hypothesis should be stated at the beginning of the study to guide the objectives for research. Whereas the investigators may state the hypothesis as being 1-sided (there is an improvement with treatment), the study and investigators must adhere to the concept of clinical equipoise. According to this principle, a clinical (or surgical) trial is ethical only if the expert community is uncertain about the relative therapeutic merits of the experimental and control groups being evaluated. 9 It means there must exist an honest and professional disagreement among expert clinicians about the preferred treatment. 9
Designing a research hypothesis is supported by a good research question and will influence the type of research design for the study. Acting on the principles of appropriate hypothesis development, the study can then confidently proceed to the development of the research objective.
Research objective
The primary objective should be coupled with the hypothesis of the study. Study objectives define the specific aims of the study and should be clearly stated in the introduction of the research protocol. 7 From our previous example and using the investigative hypothesis that there is a difference in functional outcomes between computer-assisted acetabular component placement and free-hand placement, the primary objective can be stated as follows: this study will compare the functional outcomes of computer-assisted acetabular component insertion versus free-hand placement in patients undergoing total hip arthroplasty. Note that the study objective is an active statement about how the study is going to answer the specific research question. Objectives can (and often do) state exactly which outcome measures are going to be used within their statements. They are important because they not only help guide the development of the protocol and design of study but also play a role in sample size calculations and determining the power of the study. 7 These concepts will be discussed in other articles in this series.
From the surgeon’s point of view, it is important for the study objectives to be focused on outcomes that are important to patients and clinically relevant. For example, the most methodologically sound randomized controlled trial comparing 2 techniques of distal radial fixation would have little or no clinical impact if the primary objective was to determine the effect of treatment A as compared to treatment B on intraoperative fluoroscopy time. However, if the objective was to determine the effect of treatment A as compared to treatment B on patient functional outcome at 1 year, this would have a much more significant impact on clinical decision-making. In addition, more meaningful surgeon–patient discussions could ensue, incorporating patient values and preferences with the results from this study. 6, 7 It is the precise objective and what the investigator is trying to measure that is of clinical relevance in the practical setting.
The following is an example from the literature about the relation between the research question, hypothesis and study objectives:
Study: Warden SJ, Metcalf BR, Kiss ZS, et al. Low-intensity pulsed ultrasound for chronic patellar tendinopathy: a randomized, double-blind, placebo-controlled trial. Rheumatology 2008;47:467–71.
Research question: How does low-intensity pulsed ultrasound (LIPUS) compare with a placebo device in managing the symptoms of skeletally mature patients with patellar tendinopathy?
Research hypothesis: Pain levels are reduced in patients who receive daily active-LIPUS (treatment) for 12 weeks compared with individuals who receive inactive-LIPUS (placebo).
Objective: To investigate the clinical efficacy of LIPUS in the management of patellar tendinopathy symptoms.
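The example above can also be laid out in PICOT form. The Python sketch below is one hypothetical way to record the five elements and check that none is missing; the class name, field names and question template are assumptions made for illustration, and the values are taken from the research question and hypothesis quoted above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PicotQuestion:
    """The five PICOT elements of a research question."""
    population: str
    intervention: str
    comparison: str
    outcome: str
    time: str

    def missing_elements(self):
        """Names of any PICOT elements left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def as_question(self):
        """Assemble the elements into a single question (illustrative template)."""
        return (
            f"In {self.population}, how does {self.intervention} compare with "
            f"{self.comparison} in terms of {self.outcome} over {self.time}?"
        )


# Elements drawn from the LIPUS example above.
lipus = PicotQuestion(
    population="skeletally mature patients with patellar tendinopathy",
    intervention="daily active-LIPUS",
    comparison="inactive-LIPUS (placebo)",
    outcome="pain levels",
    time="12 weeks",
)

print(lipus.missing_elements())
print(lipus.as_question())
```

Writing the elements out separately makes it easy to spot a question that lacks, say, a comparison group or a time frame before the protocol is drafted.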
The development of the research question is the most important aspect of a research project. A research project can fail if the objectives and hypothesis are poorly focused and underdeveloped. Useful tips for surgical researchers are provided in Box 3 . Designing and developing an appropriate and relevant research question, hypothesis and objectives can be a difficult task. The critical appraisal of the research question used in a study is vital to the application of the findings to clinical practice. Focusing resources, time and dedication to these 3 very important tasks will help to guide a successful research project, influence interpretation of the results and affect future publication efforts.
Box 3. Tips for developing research questions, hypotheses and objectives for research studies
- Perform a systematic literature review (if one has not been done) to increase knowledge and familiarity with the topic and to assist with research development.
- Learn about current trends and technological advances on the topic.
- Seek careful input from experts, mentors, colleagues and collaborators to refine your research question and guide the research study.
- Use the FINER criteria in the development of the research question.
- Ensure that the research question follows PICOT format.
- Develop a research hypothesis from the research question.
- Develop clear and well-defined primary and secondary (if needed) objectives.
- Ensure that the research question and objectives are answerable, feasible and clinically relevant.
FINER = feasible, interesting, novel, ethical, relevant; PICOT = population (patients), intervention (for intervention studies only), comparison group, outcome of interest, time.
Competing interests: No funding was received in preparation of this paper. Dr. Bhandari was funded, in part, by a Canada Research Chair, McMaster University.
Writing a Research Paper Introduction | Step-by-Step Guide
Published on September 24, 2022 by Jack Caulfield. Revised on March 27, 2023.

The introduction to a research paper is where you set up your topic and approach for the reader. It has several key goals:
- Present your topic and get the reader interested
- Provide background or summarize existing research
- Position your own approach
- Detail your specific research problem and problem statement
- Give an overview of the paper’s structure
The introduction looks slightly different depending on whether your paper presents the results of original empirical research or constructs an argument by engaging with a variety of sources.
Table of contents
- Step 1: Introduce your topic
- Step 2: Describe the background
- Step 3: Establish your research problem
- Step 4: Specify your objective(s)
- Step 5: Map out your paper
- Research paper introduction examples
- Frequently asked questions about the research paper introduction
Step 1: Introduce your topic
The first job of the introduction is to tell the reader what your topic is and why it’s interesting or important. This is generally accomplished with a strong opening hook.
The hook is a striking opening sentence that clearly conveys the relevance of your topic. Think of an interesting fact or statistic, a strong statement, a question, or a brief anecdote that will get the reader wondering about your topic.
For example, the following could be an effective hook for an argumentative paper about the environmental impact of cattle farming:
A more empirical paper investigating the relationship of Instagram use with body image issues in adolescent girls might use the following hook:
Don’t feel that your hook necessarily has to be deeply impressive or creative. Clarity and relevance are still more important than catchiness. The key thing is to guide the reader into your topic and situate your ideas.

Step 2: Describe the background
This part of the introduction differs depending on what approach your paper is taking.
In a more argumentative paper, you’ll explore some general background here. In a more empirical paper, this is the place to review previous research and establish how yours fits in.
Argumentative paper: Background information
After you’ve caught your reader’s attention, specify a bit more, providing context and narrowing down your topic.
Provide only the most relevant background information. The introduction isn’t the place to get too in-depth; if more background is essential to your paper, it can appear in the body.
Empirical paper: Describing previous research
For a paper describing original research, you’ll instead provide an overview of the most relevant research that has already been conducted. This is a sort of miniature literature review: a sketch of the current state of research into your topic, boiled down to a few sentences.
This should be informed by genuine engagement with the literature. Your search can be less extensive than in a full literature review, but a clear sense of the relevant research is crucial to inform your own work.
Begin by establishing the kinds of research that have been done, and end with limitations or gaps in the research that you intend to respond to.
The next step is to clarify how your own research fits in and what problem it addresses.
Argumentative paper: Emphasize importance
In an argumentative research paper, you can simply state the problem you intend to discuss, and what is original or important about your argument.
Empirical paper: Relate to the literature
In an empirical research paper, try to lead into the problem on the basis of your discussion of the literature. Think in terms of these questions:
- What research gap is your work intended to fill?
- What limitations in previous work does it address?
- What contribution to knowledge does it make?
You can make the connection between your problem and the existing research using phrases like the following.
Now you’ll get into the specifics of what you intend to find out or express in your research paper.
The way you frame your research objectives varies. An argumentative paper presents a thesis statement, while an empirical paper generally poses a research question (sometimes with a hypothesis as to the answer).
Argumentative paper: Thesis statement
The thesis statement expresses the position that the rest of the paper will present evidence and arguments for. It can be presented in one or two sentences, and should state your position clearly and directly, without providing specific arguments for it at this point.
Empirical paper: Research question and hypothesis
The research question is the question you want to answer in an empirical research paper.
Present your research question clearly and directly, with a minimum of discussion at this point. The rest of the paper will be taken up with discussing and investigating this question; here you just need to express it.
A research question can be framed either directly or indirectly.
- This study set out to answer the following question: What effects does daily use of Instagram have on the prevalence of body image issues among adolescent girls?
- We investigated the effects of daily Instagram use on the prevalence of body image issues among adolescent girls.
If your research involved testing hypotheses, these should be stated along with your research question. They are usually presented in the past tense, since the hypothesis will already have been tested by the time you are writing up your paper.
For example, the following hypothesis might respond to the research question above:
The final part of the introduction is often dedicated to a brief overview of the rest of the paper.
In a paper structured using the standard scientific “introduction, methods, results, discussion” format, this isn’t always necessary. But if your paper is structured in a less predictable way, it’s important to describe the shape of it for the reader.
If included, the overview should be concise, direct, and written in the present tense.
- This paper will first discuss several examples of survey-based research into adolescent social media use, then will go on to …
- This paper first discusses several examples of survey-based research into adolescent social media use, then goes on to …
Full examples of research paper introductions are shown in the tabs below: one for an argumentative paper, the other for an empirical paper.
- Argumentative paper
- Empirical paper
Are cows responsible for climate change? A recent study (RIVM, 2019) shows that cattle farmers account for two thirds of agricultural nitrogen emissions in the Netherlands. These emissions result from nitrogen in manure, which can degrade into ammonia and enter the atmosphere. The study’s calculations show that agriculture is the main source of nitrogen pollution, accounting for 46% of the country’s total emissions. By comparison, road traffic and households are responsible for 6.1% each, the industrial sector for 1%. While efforts are being made to mitigate these emissions, policymakers are reluctant to reckon with the scale of the problem. The approach presented here is a radical one, but commensurate with the issue. This paper argues that the Dutch government must stimulate and subsidize livestock farmers, especially cattle farmers, to transition to sustainable vegetable farming. It first establishes the inadequacy of current mitigation measures, then discusses the various advantages of the results proposed, and finally addresses potential objections to the plan on economic grounds.
The rise of social media has been accompanied by a sharp increase in the prevalence of body image issues among women and girls. This correlation has received significant academic attention: Various empirical studies have been conducted into Facebook usage among adolescent girls (Tiggemann & Slater, 2013; Meier & Gray, 2014). These studies have consistently found that the visual and interactive aspects of the platform have the greatest influence on body image issues. Despite this, highly visual social media (HVSM) such as Instagram have yet to be robustly researched. This paper sets out to address this research gap. We investigated the effects of daily Instagram use on the prevalence of body image issues among adolescent girls. It was hypothesized that daily Instagram use would be associated with an increase in body image concerns and a decrease in self-esteem ratings.
The introduction of a research paper includes several key elements:
- A hook to catch the reader’s interest
- Relevant background on the topic
- Details of your research problem and your problem statement
- A thesis statement or research question
- Sometimes an overview of the paper
Don’t feel that you have to write the introduction first. The introduction is often one of the last parts of the research paper you’ll write, along with the conclusion.
This is because it can be easier to introduce your paper once you’ve already written the body; you may not have the clearest idea of your arguments until you’ve written them, and things can change during the writing process.
The way you present your research problem in your introduction varies depending on the nature of your research paper. A research paper that presents a sustained argument will usually encapsulate this argument in a thesis statement.
A research paper designed to present the results of empirical research tends to present a research question that it seeks to answer. It may also include a hypothesis: a prediction that will be confirmed or disproved by your research.
Cite this Scribbr article
Caulfield, J. (2023, March 27). Writing a Research Paper Introduction | Step-by-Step Guide. Scribbr. Retrieved September 26, 2023, from https://www.scribbr.com/research-paper/research-paper-introduction/
Jack Caulfield
Research Hypothesis: Definition, Types, & Examples
Saul Mcleod, PhD
Educator, Researcher
BSc (Hons) Psychology, MRes, PhD, University of Manchester
Saul Mcleod, Ph.D., is a qualified psychology teacher with over 18 years of experience working in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.
Olivia Guy-Evans, MSc
Associate Editor for Simply Psychology
BSc (Hons) Psychology, MSc Psychology of Education
Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.
A hypothesis (plural hypotheses) is a precise, testable statement of what the researcher(s) predict will be the outcome of the study. It is stated at the start of the study.
This usually involves proposing a possible relationship between two variables: the independent variable (what the researcher changes) and the dependent variable (what the researcher measures).
In research, there is a convention that the hypothesis is written in two forms: the null hypothesis and the alternative hypothesis (called the experimental hypothesis when the method of investigation is an experiment).
A fundamental requirement of a hypothesis is that it can be tested against reality and can then be supported or rejected.
To test a hypothesis, the researcher first assumes that there is no difference between the populations from which the samples are taken. This is known as the null hypothesis. The research hypothesis is often called the alternative hypothesis.
Types of research hypotheses
Alternative Hypothesis
The alternative hypothesis states that there is a relationship between the two variables being studied (one variable has an effect on the other).
An experimental hypothesis predicts what change(s) will take place in the dependent variable when the independent variable is manipulated.
It states that the results are not due to chance and that they are significant in terms of supporting the theory being investigated.
Null Hypothesis
The null hypothesis states that there is no relationship between the two variables being studied (one variable does not affect the other). There will be no changes in the dependent variable due to the manipulation of the independent variable.
It states results are due to chance and are not significant in terms of supporting the idea being investigated.
Non-directional Hypothesis
A non-directional (two-tailed) hypothesis predicts that the independent variable will have an effect on the dependent variable, but the direction of the effect is not specified. It just states that there will be a difference.
E.g., there will be a difference in how many numbers are correctly recalled by children and adults.
Directional Hypothesis
A directional (one-tailed) hypothesis predicts the nature of the effect of the independent variable on the dependent variable. It predicts in which direction the change will take place. (i.e. greater, smaller, less, more)
E.g., adults will correctly recall more words than children.
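The difference between a one-tailed and a two-tailed prediction shows up directly in how a p-value is counted. The sketch below is only an illustration: the recall scores are invented, the function name is hypothetical, and an exact permutation test is used as one simple way to test a difference in means (the source does not prescribe a particular test).

```python
from itertools import combinations
from statistics import mean


def permutation_p_values(group_a, group_b):
    """Exact permutation test on the difference in means (a - b).

    Returns (one_tailed_p, two_tailed_p). The one-tailed value tests the
    directional prediction mean(a) > mean(b); the two-tailed value tests
    the non-directional prediction that the means simply differ.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = mean(group_a) - mean(group_b)

    greater = 0  # relabelings with a difference at least as large as observed
    extreme = 0  # relabelings at least as far from zero, in either direction
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        count += 1
        if diff >= observed - 1e-12:            # small tolerance for float noise
            greater += 1
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return greater / count, extreme / count


# Hypothetical recall scores (words recalled), invented for illustration.
adults = [9, 8, 10, 7, 9]
children = [6, 7, 5, 8, 6]

one_tailed, two_tailed = permutation_p_values(adults, children)
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

With equal group sizes the two-tailed p-value is twice the one-tailed value, which is why a directional hypothesis should only be chosen when the literature justifies it in advance.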

Falsifiability
The Falsification Principle, proposed by Karl Popper, is a way of demarcating science from non-science. It suggests that for a theory to be considered scientific it must be able to be tested and conceivably proven false.
However many confirming instances there are for a theory, it only takes one counter observation to falsify it. For example, the hypothesis that “all swans are white,” can be falsified by observing a black swan.
For Popper, science should attempt to disprove a theory, rather than attempt to continually support theoretical hypotheses.
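The swan example can be written as a tiny search for a counterexample. This is a minimal sketch with invented observations and hypothetical names; it only illustrates the logic that one contradicting observation is enough to falsify a universal claim, while any number of confirming ones cannot prove it.

```python
def first_counterexample(observations, claim):
    """Return the first observation that contradicts the claim, or None."""
    for observation in observations:
        if not claim(observation):
            return observation
    return None


def is_white(swan):
    # The universal claim under test: "all swans are white".
    return swan["color"] == "white"


# Invented observations for illustration.
swans = [
    {"id": 1, "color": "white"},
    {"id": 2, "color": "white"},
    {"id": 3, "color": "white"},
    {"id": 4, "color": "black"},
    {"id": 5, "color": "white"},
]

counterexample = first_counterexample(swans, is_white)
if counterexample is None:
    print("Claim not falsified by these observations (which is not proof).")
else:
    print(f"Claim falsified by swan {counterexample['id']}.")
```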
Can a hypothesis be proven?
Upon analysis of the results, an alternative hypothesis can be rejected or supported, but it can never be proven to be correct. We must avoid any reference to results proving a theory as this implies 100% certainty, and there is always a chance that evidence may exist which could refute a theory.
How to write a hypothesis
- 1. To write the alternative and null hypotheses for an investigation, you need to identify the key variables in the study. The independent variable is manipulated by the researcher, and the dependent variable is the outcome that is measured.
- 2. Operationalize the variables being investigated. Operationalization refers to the process of making the variables physically measurable or testable, e.g., if you are about to study aggression, you might count the number of punches given by participants.
- 3. Decide on a direction for your prediction. If there is evidence in the literature to support a specific effect of the independent variable on the dependent variable, write a directional (one-tailed) hypothesis. If there are limited or ambiguous findings in the literature regarding the effect of the independent variable on the dependent variable, write a non-directional (two-tailed) hypothesis.
- 4. Write your hypothesis. A good hypothesis is short (i.e., concise) and uses clear and simple language.
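The four steps above can be sketched as a small data structure that turns the chosen variables and direction into null and alternative statements. Everything here is a hypothetical illustration: the class name, the sentence templates, and the film-viewing conditions are assumptions, with the punch-counting measure borrowed from the aggression example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class HypothesisSpec:
    """Hypothetical container for the parts of a hypothesis (illustration only)."""
    independent_variable: str        # step 1: what the researcher manipulates
    dependent_variable: str          # steps 1-2: the operationalized, measurable outcome
    groups: Tuple[str, str]          # the two conditions being compared
    direction: Optional[str] = None  # step 3: "more", "fewer", or None (non-directional)

    def alternative(self) -> str:
        first, second = self.groups
        if self.direction is None:
            # Non-directional (two-tailed): predicts a difference only.
            return (f"There will be a difference in {self.dependent_variable} "
                    f"between {first} and {second}.")
        # Directional (one-tailed): predicts which way the difference goes.
        return (f"There will be {self.direction} {self.dependent_variable} "
                f"among {first} than among {second}.")

    def null(self) -> str:
        first, second = self.groups
        return (f"There will be no difference in {self.dependent_variable} "
                f"between {first} and {second}.")


# Assumed study design, invented for illustration.
directional = HypothesisSpec(
    independent_variable="type of film watched",
    dependent_variable="punches thrown during play",
    groups=("participants who watched a violent film",
            "participants who watched a neutral film"),
    direction="more",
)
non_directional = HypothesisSpec(
    independent_variable=directional.independent_variable,
    dependent_variable=directional.dependent_variable,
    groups=directional.groups,
)

print(directional.alternative())
print(non_directional.alternative())
print(directional.null())
```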
What are examples of a hypothesis?
Let’s consider a hypothesis that many teachers might subscribe to: that students work better on Monday morning than they do on a Friday afternoon (IV=Day, DV=Standard of work).
Now, if we decide to study this by giving the same group of students a lesson on a Monday morning and on a Friday afternoon and then measuring their immediate recall on the material covered in each session we would end up with the following:
- The alternative hypothesis states that students will recall significantly more information on a Monday morning than on a Friday afternoon.
- The null hypothesis states that there will be no significant difference in the amount recalled on a Monday morning compared to a Friday afternoon. Any difference will be due to chance or confounding factors.
The null hypothesis is, therefore, the opposite of the alternative hypothesis in that it states that there will be no change in behavior.
At this point, you might be asking why we seem so interested in the null hypothesis. Surely the alternative (or experimental) hypothesis is more important?
Well, yes it is. However, we can never 100% prove the alternative hypothesis. What we do instead is see if we can disprove, or reject, the null hypothesis.
If we reject the null hypothesis, this doesn’t really mean that our alternative hypothesis is correct – but it does provide support for the alternative / experimental hypothesis.
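To make the Monday/Friday example concrete, here is a minimal sketch of how the null hypothesis might be tested when the same students are measured twice. The scores are invented, the 0.05 threshold is an assumed convention, and the exact sign-flip test is just one simple choice of paired test, not a method taken from the source.

```python
from itertools import product


def paired_sign_flip_p(monday, friday):
    """One-tailed exact sign-flip test for paired scores.

    Under the null hypothesis the sign of each within-student difference
    is arbitrary, so every pattern of sign flips is equally likely.
    Returns the share of patterns whose summed difference is at least as
    large as the observed one (prediction: Monday > Friday).
    """
    diffs = [m - f for m, f in zip(monday, friday)]
    observed = sum(diffs)
    at_least = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            at_least += 1
    return at_least / total


# Hypothetical recall scores for the same eight students (invented data).
monday = [14, 12, 15, 11, 13, 16, 12, 14]
friday = [12, 11, 13, 11, 10, 14, 12, 13]

p_value = paired_sign_flip_p(monday, friday)
alpha = 0.05  # conventional significance threshold, assumed here
decision = "reject" if p_value < alpha else "fail to reject"
print(f"p = {p_value:.4f}: {decision} the null hypothesis")
```

Rejecting the null here supports the alternative hypothesis for this sample; it does not prove it.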
How to Develop a Good Research Hypothesis

The story of a research study begins by asking a question. Researchers all around the globe are asking curious questions and formulating research hypotheses. However, whether the research study provides an effective conclusion depends on how well one develops a good research hypothesis. Research hypothesis examples could help researchers get an idea as to how to write a good research hypothesis.
This blog will help you understand what a research hypothesis is, its characteristics, and how to formulate one.
What is a Hypothesis?
A hypothesis is an assumption or an idea proposed for the sake of argument so that it can be tested. It is a precise, testable statement of what the researchers predict will be the outcome of the study. A hypothesis usually involves proposing a relationship between two variables: the independent variable (what the researchers change) and the dependent variable (what the researchers measure).
What is a Research Hypothesis?
A research hypothesis is a statement that introduces a research question and proposes an expected result. It is an integral part of the scientific method that forms the basis of scientific experiments. Therefore, you need to be careful and thorough when building your research hypothesis. A minor flaw in the construction of your hypothesis could have an adverse effect on your experiment. In research, there is a convention that the hypothesis is written in two forms: the null hypothesis and the alternative hypothesis (called the experimental hypothesis when the method of investigation is an experiment).

Essential Characteristics of a Good Research Hypothesis
Because a hypothesis is specific, it makes a testable prediction about what you expect to happen in a study. You may consider drawing your hypothesis from previously published research based on the theory.
A good research hypothesis involves more effort than just a guess. In particular, your hypothesis may begin with a question that could be further explored through background research.
To help you formulate a promising research hypothesis, you should ask yourself the following questions:
- Is the language clear and focused?
- What is the relationship between your hypothesis and your research topic?
- Is your hypothesis testable? If yes, then how?
- What are the possible explanations that you might want to explore?
- Does your hypothesis include both an independent and dependent variable?
- Can you manipulate your variables without hampering the ethical standards?
- Does your research predict the relationship and outcome?
- Is your research simple and concise (avoids wordiness)?
- Is it clear with no ambiguity or assumptions about the readers’ knowledge?
- Does your research have observable and testable results?
- Is it relevant and specific to the research question or problem?

The questions listed above can be used as a checklist to make sure your hypothesis is based on a solid foundation. Furthermore, it can help you identify weaknesses in your hypothesis and revise it if necessary.
Source: Educational Hub
How to Formulate an Effective Research Hypothesis
A testable hypothesis is not a simple statement. It is rather an intricate statement that needs to offer a clear introduction to a scientific experiment, its intentions, and the possible outcomes. However, there are some important things to consider when building a compelling hypothesis.
1. State the problem that you are trying to solve.
Make sure that the hypothesis clearly defines the topic and the focus of the experiment.
2. Try to write the hypothesis as an if-then statement.
Follow this template: If a specific action is taken, then a certain outcome is expected.
3. Define the variables
Independent variables are the ones that are manipulated, controlled, or changed. Independent variables are isolated from other factors of the study.
Dependent variables, as the name suggests, are dependent on other factors of the study. They are influenced by the change in the independent variable.
4. Scrutinize the hypothesis
Identify the type of your hypothesis. The types of research hypothesis are stated below:
1. Simple Hypothesis
It predicts the relationship between a single dependent variable and a single independent variable.
2. Complex Hypothesis
It predicts the relationship between two or more independent and dependent variables.
3. Directional Hypothesis
It specifies the expected direction to be followed to determine the relationship between variables and is derived from theory. Furthermore, it implies the researcher’s intellectual commitment to a particular outcome.
4. Non-directional Hypothesis
It does not predict the exact direction or nature of the relationship between the two variables. The non-directional hypothesis is used when there is no theory involved or when findings contradict previous research.
5. Associative and Causal Hypothesis
The associative hypothesis defines interdependency between variables. A change in one variable results in the change of the other variable. On the other hand, the causal hypothesis proposes an effect on the dependent due to manipulation of the independent variable.
6. Null Hypothesis
The null hypothesis states that there is no relationship between the two variables. There will be no changes in the dependent variable due to the manipulation of the independent variable. Furthermore, it states that results are due to chance and are not significant in terms of supporting the idea being investigated.
7. Alternative Hypothesis
It states that there is a relationship between the two variables of the study and that the results are significant to the research topic. An experimental hypothesis predicts what changes will take place in the dependent variable when the independent variable is manipulated. Also, it states that the results are not due to chance and that they are significant in terms of supporting the theory being investigated.
Research Hypothesis Examples of Independent and Dependent Variables:
Research Hypothesis Example 1: A greater number of coal plants in a region (independent variable) increases water pollution (dependent variable). If you change the independent variable (building more coal factories), it will change the dependent variable (amount of water pollution).
Research Hypothesis Example 2: What is the effect of diet or regular soda (independent variable) on blood sugar levels (dependent variable)? If you change the independent variable (the type of soda you consume), it will change the dependent variable (blood sugar levels).
You should not ignore the importance of the above steps. The validity of your experiment and its results relies on a robust, testable hypothesis. Developing a strong, testable hypothesis has a few advantages: it compels us to think intensely and specifically about the outcomes of a study. Consequently, it enables us to understand the implication of the question and the different variables involved in the study. Furthermore, it helps us to make precise predictions based on prior research. Hence, forming a hypothesis would be of great value to the research. Here are some good examples of testable hypotheses.
More importantly, you need to build a robust testable research hypothesis for your scientific experiments. A testable hypothesis is a hypothesis that can be proved or disproved as a result of experimentation.
Importance of a Testable Hypothesis
To devise and perform an experiment using scientific method, you need to make sure that your hypothesis is testable. To be considered testable, some essential criteria must be met:
- There must be a possibility to prove that the hypothesis is true.
- There must be a possibility to prove that the hypothesis is false.
- The results of the hypothesis must be reproducible.
Without these criteria, the hypothesis and the results will be vague. As a result, the experiment will not prove or disprove anything significant.
Frequently Asked Questions
The steps to write a research hypothesis are:
1. Stating the problem: Ensure that the hypothesis defines the research problem.
2. Writing a hypothesis as an ‘if-then’ statement: Include the action and the expected outcome of your study by following an ‘if-then’ structure.
3. Defining the variables: Define the variables as dependent or independent based on their dependency on other factors.
4. Scrutinizing the hypothesis: Identify the type of your hypothesis.
Hypothesis testing is a statistical tool which is used to make inferences about a population data to draw conclusions for a particular hypothesis.
Hypothesis in statistics is a formal statement about the nature of a population within a structured framework of a statistical model. It is used to test an existing hypothesis by studying a population.
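As a rough illustration of what "making inferences about a population" can look like in practice, the sketch below runs a one-sample z-test using only the Python standard library. The sample values, the population mean of 50, and the population standard deviation of 6 are all invented assumptions; a z-test is only appropriate when the population standard deviation is treated as known.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_sd):
    """Two-tailed one-sample z-test.

    Null hypothesis: the sample comes from a population whose mean is
    population_mean. Returns (z, p_two_tailed).
    """
    n = len(sample)
    standard_error = population_sd / sqrt(n)
    z = (mean(sample) - population_mean) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Invented test scores; population parameters are assumptions for illustration.
sample = [52, 55, 49, 58, 54, 51, 57, 53, 56]
z, p = one_sample_z_test(sample, population_mean=50, population_sd=6)

alpha = 0.05  # conventional threshold, assumed here
verdict = "reject" if p < alpha else "fail to reject"
print(f"z = {z:.3f}, p = {p:.4f}: {verdict} the null hypothesis")
```

In this made-up case the p-value lands just above 0.05, so the null hypothesis is not rejected even though the sample mean is higher than 50, a reminder that a borderline result is not support for the alternative.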
Research hypothesis is a statement that introduces a research question and proposes an expected result. It forms the basis of scientific experiments.
The different types of hypothesis in research are:
- Null hypothesis: states that there is no relationship between the two variables.
- Alternate hypothesis: predicts a relationship between the two variables of the study.
- Directional hypothesis: specifies the expected direction of the relationship between variables.
- Non-directional hypothesis: does not predict the exact direction or nature of the relationship between the two variables.
- Simple hypothesis: predicts the relationship between a single dependent variable and a single independent variable.
- Complex hypothesis: predicts the relationship between two or more independent and dependent variables.
- Associative and causal hypothesis: an associative hypothesis defines interdependency between variables, where a change in one variable results in a change in the other; a causal hypothesis proposes an effect on the dependent variable due to manipulation of the independent variable.
- Empirical hypothesis: can be tested via experiments and observation.
- Statistical hypothesis: utilizes statistical models to draw conclusions about broader populations.

How To: Use Articles for Research: Introduction: Hypothesis/Thesis
Hypothesis or Thesis
The first few paragraphs of a journal article serve to introduce the topic, to provide the author's hypothesis or thesis, and to indicate why the research was done. A thesis or hypothesis is not always clearly labeled; you may need to read through the introductory paragraphs to determine what the authors are proposing.
- Last Updated: Jan 29, 2020 2:27 PM
- URL: https://libguides.cayuga-cc.edu/1ST-PRIORITY/articles
How to Identify a Hypothesis
Pharaba Witt

Identifying a hypothesis allows students to know what is being proven by a particular experiment or paper. Being able to determine the overall point not only makes you a more effective reader but also better at formulating your own theories when writing your own paper. By asking a few simple questions while you read, you should be able to pick out the intent of the author and identify the hypothesis.
1 Read over the beginning of the material
Read over the beginning of the material while asking what the purpose of the introduction is.
2 Look for if-then statements
Look for if-then statements. This type of wording is usually the hypothesis. It lays out a position for the overall paper or project.
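When skimming a long introduction, a quick text search can surface candidate if-then sentences to read more closely. The pattern below is a rough, hypothetical heuristic: it only catches sentences that literally contain both "if" and "then", so a hypothesis phrased any other way will be missed and still has to be found by reading.

```python
import re

# Matches "if <condition>, then <outcome>" within a single sentence.
IF_THEN = re.compile(
    r"\bif\b\s+(?P<condition>[^.?!]+?),?\s+\bthen\b\s+(?P<outcome>[^.?!]+)",
    re.IGNORECASE,
)


def find_if_then(text):
    """Return (condition, outcome) pairs for every if-then sentence found."""
    return [
        (match.group("condition").strip(), match.group("outcome").strip())
        for match in IF_THEN.finditer(text)
    ]


# Invented sample passage for illustration.
passage = (
    "Sleep matters for memory. If students sleep eight hours, then they "
    "will recall more words. We tested this claim with a recall task."
)

candidates = find_if_then(passage)
for condition, outcome in candidates:
    print(f"IF {condition} THEN {outcome}")
```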
3 Ask if the if-then statement
Ask if the if-then statement is testable or provable. Is this the type of statement you could supply evidence for in order to prove? Decide if you agree with the hypothesis. This puts you in a position to be convinced as you read the paper or follow the experiment.
4 Read through the rest of the paper
Read through the rest of the paper to determine if it is going in the direction you suspect. If you get to a point where the words seem to be proving something entirely different, revisit the first paragraph to see if there is another if-then statement.
- Try not to jump to conclusions. Read the paragraph through thoroughly a few times to be certain you have not missed any other potential hypothesis.
- When presented with the information, ask yourself what you would aim to prove. Oftentimes you will formulate a similar question. While your expectations might be different, picking out the hypothesis can be easier.
- Not every hypothesis is accurate. Part of testing a theory is determining if the expectation is accurate. By the end of the paper the writer might draw a new conclusion. The author could even take that space to formulate an entirely new hypothesis.
- Practice writing if-then statements. The more familiar you are with formulating hypothesis statements the better you will be at identifying the hypothesis.
About the Author
Pharaba Witt has worked as a writer in Los Angeles for more than 10 years. She has written for websites such as USA Today, Red Beacon, LIVESTRONG, WiseGeek, Web Series Network, Nursing Daily and major film studios. When not traveling she enjoys outdoor activities such as backpacking, snowboarding, ice climbing and scuba diving. She is constantly researching equipment and seeking new challenges.
© 2020 Leaf Group Ltd. / Leaf Group Media, All Rights Reserved. Based on the Word Net lexical database for the English Language. See disclaimer .
Learn How to Write a Hypothesis in Simple Steps
Published on: Jan 10, 2018
Last updated on: Mar 16, 2023

A hypothesis is one of the most important parts of a scientific research paper. It is an idea that is based on evidence and must then be tested against facts and observations.
In the scientific method, whether the research is in biology, psychology, or any other area, a hypothesis states what you expect the experiment to show.
Since the hypothesis is the foundation for future research, it is important to draft a strong one. In this blog, you will learn how to write a good hypothesis statement in simple steps, with plenty of examples.
People commonly ask the following questions about hypotheses:
- How to write a hypothesis for a research paper?
- How to write a hypothesis in sociology?
- How to write a hypothesis in psychology?
- How to write a hypothesis in a lab report?
- How to write a hypothesis in statistics?
- How to write a hypothesis for a science fair?
If you wonder how to create a hypothesis for any of the above scenarios, keep reading to understand what a hypothesis is and how to write a strong one.
What is a Hypothesis?
A hypothesis is a prediction that is more than just a simple guess. Usually, the hypothesis starts with a question that is explored through in-depth research.
At this point, you need to develop a strong and testable hypothesis. Unless you are writing an exploratory study, your hypothesis should explain what you expect to happen.
For example, if exploring the effects of a particular drug, the hypothesis should be what effects this drug might have on the symptoms of a particular disease.
In conducting psychology research, the hypothesis might be how an environment influences a particular response or behavior.
A hypothesis does not always have to be accurate. While it predicts what the researchers expect, the research aims to determine whether the guess came out right or wrong.
It also establishes a relationship between two or more variables. A dependent variable is what you observe and measure. An independent variable is what you change or control over time.
Different Types of Hypotheses
Before heading towards the writing steps, understand the common types of hypotheses with examples.
Simple Hypothesis
A simple hypothesis predicts the connection between one dependent variable and one independent variable. Here are some simple hypothesis examples.
- Intake of sugary drinks leads to weight gain.
- Smoking is the leading cause of lung cancer.
Complex Hypothesis
A complex hypothesis predicts the relationship between two or more dependent and two or more independent variables. The following examples show how a complex hypothesis is formulated.
- Overweight people who value and seek happiness are more likely to lose weight and enjoy life than those who do not care much.
- Individuals who eat fewer vegetables and more greasy food are at a greater risk of developing heart diseases.
Empirical Hypothesis
An empirical hypothesis is also known as a ‘Working hypothesis.’ This hypothesis comes into play when a theory is being tested through experiment and observation. It is no longer just a wild guess.
Here are some examples that show how to craft an empirical hypothesis.
- Women who take vitamin E grow their hair faster than women who take vitamin K.
- Animals learn faster if food is given immediately after they respond to a command.
Null Hypothesis
A null hypothesis states that there is no relationship between the two variables. It is also the default position when there is insufficient information to state a more specific hypothesis. The following null hypothesis examples show how to state that no effect exists.
- There is no improvement in my health, no matter how healthily I eat or how much sleep I get.
- There is no change in my work habits, whether I get 6 hours or 10 hours of sleep.
Alternative Hypothesis
An alternative hypothesis contradicts the null hypothesis: it states that a relationship between the variables does exist. It is denoted by H1. You can learn more about the alternative hypothesis with these examples.
- My health gets better when I drink green tea daily.
- My work habits get better when I sleep on time and wake up early in the morning.
Logical Hypothesis
A logical hypothesis is a proposed explanation backed by limited evidence. Usually, a logical hypothesis is later turned into an empirical hypothesis, which puts your theories to the test.
Here are some logical hypothesis examples.
- Cacti experience more successful growth rates than tulips on Mars.
- The atmospheric pressure on Mars is less than one-hundredth of that on Earth.
Statistical Hypothesis
A statistical hypothesis is a claim about a population that is examined using a sample of that population. In this type of analysis, you use statistical information collected from and for a specific area.
Below are some statistical hypothesis examples that show how to frame your research using statistical information.
- About 16% of the American population is 65 years old or over.
- 21% of the adults in the United States are classified as illiterate.
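A claim such as "about 16% of the population is 65 or over" can be checked against a sample. The sketch below is one common way to do this, a two-sided z-test for a proportion using the normal approximation. It is only an illustration: the function name `proportion_z_test` and the survey numbers are hypothetical and do not come from any real data set.

```python
import math

def proportion_z_test(successes, n, p0):
    """Two-sided z-test for H0: the true proportion equals p0.

    Uses the normal approximation, which is reasonable for large samples.
    """
    p_hat = successes / n                    # proportion observed in the sample
    se = math.sqrt(p0 * (1 - p0) / n)        # standard error assuming H0 is true
    z = (p_hat - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value

# Hypothetical survey: 180 of 1,000 respondents are 65 or older.
z, p_value = proportion_z_test(successes=180, n=1000, p0=0.16)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value would suggest the sample is hard to reconcile with the stated 16%, while a large one means the sample gives no strong reason to doubt it.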
How to Write a Hypothesis?
Here are the steps that you need to follow for writing a strong hypothesis.
1. Ask a Question
A hypothesis starts with a research question that you need to address. The question should be clear, focused, and researchable within the limitations of the project.
Furthermore, the question needs to be testable, i.e., there should be a hypothesis that can answer the research question.
2. Conduct Some Initial Research
Now collect information and start sketching an answer, based on what is already known about your research paper topic. Take time to review theories and previous studies so that you can formulate better assumptions.
You can create a conceptual framework to identify which variables you will be focusing on. Also, figure out the relationship between those variables.
3. Create Your Hypothesis
With the help of theories and previous studies, you might have an idea of what you expect to find. Make sure you come up with a clear and concise initial answer.
A hypothesis, in research terms, is the portion of a study that can be supported or refuted by testing it. Ask yourself: how does your independent variable affect how participants respond in the experiment?
4. Refine Your Hypothesis
Make sure your hypothesis is to the point and testable. It can be refined in a variety of ways, but every term you use must be clearly defined, and the hypothesis must contain the following elements:
- The required variables
- The group that is being studied
- The predicted outcome of the analysis
5. Compose Your Hypothesis in Three Different Ways
You can formulate a simple prediction in if...then form to identify the variables. The first part of the sentence should state the independent variable, and the second part should state the dependent variable.
In academic research, a hypothesis is commonly phrased in terms of defining relations or showing effects. Here you need to state the relationship between variables.
If you are making a comparison, your hypothesis should state what difference you expect.
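The three phrasings above can be sketched as simple sentence templates. This is only an illustration: the helper names `if_then`, `relationship`, and `comparison` are hypothetical, and the lecture-attendance wording is an assumed running example rather than a prescribed format.

```python
def if_then(cause: str, effect: str) -> str:
    """Simple prediction: independent variable first, dependent variable second."""
    return f"If {cause}, then {effect}."

def relationship(independent: str, dependent: str, direction: str = "positive") -> str:
    """Academic phrasing: state the expected effect of one variable on another."""
    return f"{independent} has a {direction} effect on {dependent}."

def comparison(group_a: str, group_b: str, outcome: str) -> str:
    """Comparison phrasing: state the difference expected between two groups."""
    return f"{group_a} have {outcome} than {group_b}."

print(if_then("a first-year student starts attending more lectures",
              "their exam scores will improve"))
print(relationship("The number of lectures attended", "exam scores"))
print(comparison("First-year students who attend most lectures",
                 "those who attend few lectures",
                 "higher exam scores"))
```

Writing the same idea in all three forms is a quick check that the independent variable, the dependent variable, and the studied group are each stated explicitly.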
6. Formulate a Null Hypothesis
If your research methods cover statistical hypothesis testing, you will need to formulate a null hypothesis. It is denoted by ‘H0.’ A null hypothesis is the default position that shows no connection between the variables.
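To see how a null hypothesis is used in practice, here is a minimal sketch of a permutation test, one of several possible statistical tests. It assumes H0 says the group labels are unrelated to the outcome, so shuffling the labels should produce differences as large as the observed one fairly often. The function name `permutation_test` and the scores are hypothetical, invented purely for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for H0: group labels do not affect the outcome."""
    rng = random.Random(seed)                # fixed seed keeps the sketch reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                  # reassign the group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical work-habit scores after 6 hours and after 10 hours of sleep.
six_hours = [62, 70, 68, 75, 66, 71]
ten_hours = [64, 69, 73, 67, 72, 70]
p_value = permutation_test(six_hours, ten_hours)
print(f"p = {p_value:.3f}")
```

If the p-value falls below the significance level you chose in advance (0.05 is a common convention), you reject H0 in favor of the alternative hypothesis. Otherwise, you fail to reject it.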
Hypothesis Writing Tips
Here are some of the expert tips that you should keep in mind for writing a good hypothesis.
- Do not choose a random topic. Take time and find something interesting to write on
- Keep the hypothesis to the point, clear, and concise
- Make sure to research as it will help you throughout the writing process
- Clearly define your independent and dependent variables
- Know your audience to identify the relationship among phenomena observed in different experiments
A hypothesis is basically a statement of what you expect to find. It determines how you will set up an experiment and how you will analyze the results.
This is why a hypothesis needs to be clearly defined. Once you are done writing your hypothesis, you need to test it and analyze the data to come up with your conclusion.
Frequently Asked Questions
What are the three required parts of a hypothesis?
The three required parts of a hypothesis are as follows:
- If (cause)
- Then (effect)
- Because (rationale)
How do you turn a question into a hypothesis?
A research question can be transformed into a hypothesis by changing it into a statement.
Which statements contain a hypothesis?
A hypothesis is an if/then statement that gives a possibility and explains what may happen because of it. These statements could include the word “may.”
Nova A. (Literature, Marketing)
Nova Allison is a Digital Content Strategist with over eight years of experience. Nova has also worked as a technical and scientific writer. She is mainly involved in developing and reviewing online content plans that engage and resonate with audiences. Nova has a passion for writing that engages and informs her readers.
- Published: 25 September 2023
Distinguishing features of Long COVID identified through immune profiling
- Jon Klein ORCID: orcid.org/0000-0002-3552-7684 1 na1 ,
- Jamie Wood 2 na1 ,
- Jillian Jaycox 1 na1 ,
- Rahul M. Dhodapkar ORCID: orcid.org/0000-0002-2014-7515 1 , 3 na1 ,
- Peiwen Lu ORCID: orcid.org/0000-0001-6118-872X 1 na1 ,
- Jeff R. Gehlhausen 1 , 4 na1 ,
- Alexandra Tabachnikova 1 na1 ,
- Kerrie Greene 1 ,
- Laura Tabacof 2 ,
- Amyn A. Malik 5 ,
- Valter Silva Monteiro ORCID: orcid.org/0000-0003-1785-6713 1 ,
- Julio Silva ORCID: orcid.org/0000-0001-8212-7440 1 ,
- Kathy Kamath 6 ,
- Minlu Zhang ORCID: orcid.org/0000-0003-2347-0569 6 ,
- Abhilash Dhal 6 ,
- Isabel M. Ott 1 ,
- Gabrielee Valle 7 ,
- Mario Peña-Hernandez 1 , 8 ,
- Tianyang Mao ORCID: orcid.org/0000-0001-9251-8592 1 ,
- Bornali Bhattacharjee ORCID: orcid.org/0000-0002-0801-1543 1 ,
- Takehiro Takahashi ORCID: orcid.org/0000-0002-1061-356X 1 ,
- Carolina Lucas ORCID: orcid.org/0000-0003-4590-2756 1 , 11 ,
- Eric Song ORCID: orcid.org/0000-0001-5448-5865 1 ,
- Dayna Mccarthy 2 ,
- Erica Breyman 2 ,
- Jenna Tosto-Mancuso 2 ,
- Yile Dai ORCID: orcid.org/0000-0002-7761-3361 1 ,
- Emily Perotti 1 ,
- Koray Akduman 1 ,
- Tiffany J. Tzeng 1 ,
- Anna C. Geraghty 9 ,
- Michelle Monje ORCID: orcid.org/0000-0002-3547-237X 9 , 10 ,
- Inci Yildirim ORCID: orcid.org/0000-0002-8631-0020 5 , 11 , 12 , 13 ,
- John Shon 6 ,
- Ruslan Medzhitov ORCID: orcid.org/0000-0002-7021-2012 1 , 10 , 11 ,
- Denyse Lutchmansingh 7 ,
- Jennifer D. Possick 7 ,
- Naftali Kaminski ORCID: orcid.org/0000-0001-5917-4601 7 ,
- Saad B. Omer ORCID: orcid.org/0000-0002-5383-3474 5 , 11 , 13 , 14 ,
- Harlan M. Krumholz ORCID: orcid.org/0000-0003-2046-127X 11 , 15 , 16 , 17 ,
- Leying Guan 11 , 18 ,
- Charles S. Dela Cruz ORCID: orcid.org/0000-0002-5258-1797 7 , 11 ,
- David van Dijk ORCID: orcid.org/0000-0003-3911-9925 11 , 19 , 20 ,
- Aaron M. Ring ORCID: orcid.org/0000-0003-3699-2446 1 , 11 ,
- David Putrino ORCID: orcid.org/0000-0002-2232-3324 2 , 21 &
- Akiko Iwasaki ORCID: orcid.org/0000-0002-7824-9856 1 , 10 , 11
Nature (2023)
We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.
- Viral infection
Post-acute infection syndromes (PAIS) may develop after acute viral disease [1]. Infection with SARS-CoV-2 can result in the development of a PAIS known as “Long COVID” (LC). Individuals with LC frequently report unremitting fatigue, post-exertional malaise, and a variety of cognitive and autonomic dysfunctions [2–4]; however, the biological processes associated with the development and persistence of these symptoms are unclear. Here, 273 individuals with or without LC were enrolled in a cross-sectional study that included multi-dimensional immune phenotyping and unbiased machine learning methods to identify biological features associated with LC. Marked differences were noted in circulating myeloid and lymphocyte populations relative to matched controls, as well as evidence of exaggerated humoral responses directed against SARS-CoV-2 among participants with LC. Further, higher antibody responses directed against non-SARS-CoV-2 viral pathogens were observed among individuals with LC, particularly Epstein-Barr virus. Levels of soluble immune mediators and hormones varied among groups, with cortisol levels being lower among participants with LC. Integration of immune phenotyping data into unbiased machine learning models identified key features most strongly associated with LC status. Collectively, these findings may help guide future studies into the pathobiology of LC and aid in developing relevant biomarkers.
Author information
These authors contributed equally: Jon Klein, Jamie Wood, Jillian Jaycox, Rahul M. Dhodapkar, Peiwen Lu, Jeff R. Gehlhausen, Alexandra Tabachnikova
Authors and Affiliations
Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
Jon Klein, Jillian Jaycox, Rahul M. Dhodapkar, Peiwen Lu, Jeff R. Gehlhausen, Alexandra Tabachnikova, Kerrie Greene, Valter Silva Monteiro, Julio Silva, Isabel M. Ott, Mario Peña-Hernandez, Tianyang Mao, Bornali Bhattacharjee, Takehiro Takahashi, Carolina Lucas, Eric Song, Yile Dai, Emily Perotti, Koray Akduman, Tiffany J. Tzeng, Lan Xu, Ruslan Medzhitov, Aaron M. Ring & Akiko Iwasaki
Abilities Research Center, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
Jamie Wood, Laura Tabacof, Dayna Mccarthy, Erica Breyman, Jenna Tosto-Mancuso & David Putrino
Department of Ophthalmology, USC Keck School of Medicine, Los Angeles, CA, USA
Rahul M. Dhodapkar
Department of Dermatology, Yale School of Medicine, New Haven, CT, USA
Jeff R. Gehlhausen
Yale Institute for Global Health, Yale School of Public Health, New Haven, CT, USA
Amyn A. Malik, Inci Yildirim & Saad B. Omer
SerImmune Inc., Goleta, CA, USA
Kathy Kamath, Minlu Zhang, Abhilash Dhal & John Shon
Department of Internal Medicine (Pulmonary, Critical Care, and Sleep Medicine), Yale School of Medicine, New Haven, CT, USA
Gabrielee Valle, Denyse Lutchmansingh, Jennifer D. Possick, Naftali Kaminski & Charles S. Dela Cruz
Department of Microbiology, Yale School of Medicine, New Haven, CT, USA
Mario Peña-Hernandez
Department of Neurology and Neurological Sciences, Stanford University, Palo Alto, CA, USA
Anna C. Geraghty & Michelle Monje
Howard Hughes Medical Institute, Chevy Chase, MD, USA
Michelle Monje, Ruslan Medzhitov & Akiko Iwasaki
Center for Infection and Immunity, Yale School of Medicine, New Haven, CT, USA
Carolina Lucas, Inci Yildirim, Ruslan Medzhitov, Saad B. Omer, Harlan M. Krumholz, Leying Guan, Charles S. Dela Cruz, David van Dijk, Aaron M. Ring & Akiko Iwasaki
Department of Pediatrics (Infectious Diseases), Yale New Haven Hospital, New Haven, CT, USA
Inci Yildirim
Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Inci Yildirim & Saad B. Omer
Department of Internal Medicine (Infectious Diseases), Yale School of Medicine, New Haven, CT, USA
Saad B. Omer
Center for Outcomes Research and Evaluation, Yale New Haven Hospital, New Haven, CT, USA
Harlan M. Krumholz
Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
Department of Health Policy and Management, Yale School of Public Health, New Haven, CT, USA
Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
Leying Guan
Department of Computer Science, Yale University, New Haven, CT, USA
David van Dijk
Department of Internal Medicine (Cardiology), Yale School of Medicine, New Haven, CT, USA
Department of Rehabilitation and Human Performance, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
David Putrino
Corresponding authors
Correspondence to David van Dijk, Aaron M. Ring, David Putrino or Akiko Iwasaki.
Supplementary information
Reporting Summary
Supplementary Table 1
Antibody clones and dilutions used for flow cytometry analysis. Excel file containing a list of antibodies used in flow cytometry analysis.
Supplementary Table 2
Viral antigens included in REAP analysis. Excel file containing a list of viral antigens used in REAP analysis.
Supplementary Table 3
MY-LC Clinical and Immunological Data. Excel file containing various immunological and clinical data used for analyses throughout the manuscript.
Supplementary Table 4
Ext. LC Clinical and Immunological Data. Excel file containing various immunological and clinical data from the external LC group used for analyses throughout the manuscript.
About this article
Cite this article.
Klein, J., Wood, J., Jaycox, J. et al. Distinguishing features of Long COVID identified through immune profiling. Nature (2023). https://doi.org/10.1038/s41586-023-06651-y
Received: 08 August 2022
Accepted: 18 September 2023
Published: 25 September 2023
DOI: https://doi.org/10.1038/s41586-023-06651-y
Step 1. Ask a question Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.
The answer is written in length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research ... a research hypothesis is an educated statement of an expected outcome. ... clarify the background and 2) identify the ...
A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study.
An effective hypothesis in research is clearly and concisely written, and any terms or definitions clarified and defined. Specific language must also be used to avoid any generalities or assumptions. Use the following points as a checklist to evaluate the effectiveness of your research hypothesis: Predicts the relationship and outcome.
If your null hypothesis was rejected, this result is interpreted as "supported the alternate hypothesis." Stating results in a research paper We found a difference in average height between men and women of 14.3cm, with a p-value of 0.002, consistent with our hypothesis that there is a difference in height between men and women.
However, there is a lack of simple and clear recommendations on how to write such scientific articles. To make life easier for new authors, we propose a simple hypothesis-based approach, which ...
Our corpus has focussed on identifying Research Hypotheses and New Knowledge in biomedical abstracts. However, it has been shown elsewhere that full texts contain more information than abstracts alone . Whilst our future goal is to additionally facilitate the recognition of New Knowledge and Research Hypothesis in full papers, our decision to ...
A research hypothesis (also called a scientific hypothesis) is a statement about the expected outcome of a study (for example, a dissertation or thesis). To constitute a quality hypothesis, the statement needs to have three attributes - specificity, clarity and testability. Let's take a look at these more closely.
Step 5: Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.
illustrates a null hypothesis. Designing Research Example 7.3 A Null Hypothesis An investigator might examine three types of reinforcement for children with autism: verbal cues, a reward, and no reinforcement. The investigator collects behavioral measures assessing social interaction of the children with their siblings. A null hypothesis might ...
These are different from standard thesis statements in that they introduce a specific prediction to be supported by the research you will conduct, and they propose an expected or predicted relationship between two or more variables. Below is an example of a research question and its corresponding hypothesis. How do self-paced, asynchronous ...
Hypotheses determine the direction and organization of your subsequent research methods, and that makes them a big part of writing a research paper. Ultimately the reader wants to know whether your hypothesis was proven true or false, so it must be written clearly in the introduction and/or abstract of your paper. 7 examples of hypotheses
Definition Nature of Hypothesis Types How to formulate a Hypotheses in Quantitative Research Qualitative Research Testing and Errors in Hypotheses Summary The research structure helps us create research that is : Quantifiable Verifiable Replicable Defensible Corollaries among the model, common sense & paper format Model
A good research question should specify the population of interest, be of interest to the scientific community and potentially to the public, have clinical relevance and further current knowledge in the field (and of course be compliant with the standards of ethical boards and national research standards). Box 1
Revised on June 22, 2023. The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test: Null hypothesis (H0): There's no effect in the population. Alternative hypothesis (Ha or H1): There's an effect in the population.
Step 1: Introduce your topic Step 2: Describe the background Step 3: Establish your research problem Step 4: Specify your objective (s) Step 5: Map out your paper Research paper introduction examples Frequently asked questions about the research paper introduction Step 1: Introduce your topic
A hypothesis (plural hypotheses) is a precise, testable statement of what the researcher (s) predict will be the outcome of the study. It is stated at the start of the study. This usually involves proposing a possible relationship between two variables: the independent variable (what the researcher changes) and the dependent variable (what the ...
Research hypothesis is a statement that introduces a research question and proposes an expected result. It is an integral part of the scientific method that forms the basis of scientific experiments. Therefore, you need to be careful and thorough when building your research hypothesis.
Hypothesis or Thesis The first few paragraphs of a journal article serve to introduce the topic, to provide the author's hypothesis or thesis, and to indicate why the research was done. A thesis or hypothesis is not always clearly labeled; you may need to read through the introductory paragraphs to determine what the authors are proposing.
Identifying a hypothesis allows students to know what is being proven by a particular experiment or paper. Being able to determine the overall point not only makes you a more effective reader but also better at formulating your own theories when writing your own paper.
Introduction A research article represents a compilation of information by a scientist concerning an original research idea. It is characterized by a wide range of information including, the purpose of the study, the thesis statement, hypothesis, literature review, methodology, results and conclusion.