Information about Null Hypothesis

In statistics, a null hypothesis is a hypothesis set up to be nullified or refuted in order to support an alternative hypothesis. When used, the null hypothesis is presumed true until statistical evidence in the form of a hypothesis test indicates otherwise. In science, the null hypothesis is used to test differences between treatment and control groups, and the assumption at the outset of the experiment is that no difference exists between the two groups for the variable being compared.

Introduction

The null hypothesis proposes something initially presumed true. It is rejected only when it becomes evidently false, that is, when the researcher has a certain degree of confidence, usually 95% to 99%, that the data do not support the null hypothesis.

An example

For example, if we want to compare the test scores of two random samples, one of men and one of women, a null hypothesis would be that the mean score of the male population is the same as the mean score of the female population:

H0 : μ1 = μ2


where:

H0 = the null hypothesis
μ1 = the mean of population 1, and
μ2 = the mean of population 2.


Alternatively, the null hypothesis can postulate that the two samples are drawn from the same population, so that the variance and shape of the distributions are equal, as well as the means.

Formulation of the null hypothesis is a vital step in testing statistical significance. Having formulated such a hypothesis, one can establish the probability of observing the obtained data, or data that differ even more from the prediction of the null hypothesis, if the null hypothesis is true. That probability is the p-value, sometimes loosely called the "significance level" of the results; strictly, the significance level is the threshold, chosen in advance, with which the p-value is compared.

That is, in scientific experimental design, we may predict that a particular factor will produce an effect on our dependent variable — this is our alternative hypothesis. We then consider how often we would expect to observe our experimental results, or results even more extreme, if we were to take many samples from a population where there was no effect (i.e. we test against our null hypothesis). If we find that this happens rarely (up to, say, 5% of the time), we can conclude that our results support our experimental prediction — we accept our alternative hypothesis.
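The procedure described above, asking how often results at least as extreme as ours would arise if there were no effect, can be sketched with a permutation test. This is only an illustration under stated assumptions: the function name permutation_p_value, the resampling scheme, and the measurements are invented here and are not part of the original article.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_resamples=10_000, seed=0):
    """Estimate a one-sided p-value for H1: mean(treated) > mean(control).

    Under the null hypothesis of no effect, group labels are exchangeable,
    so we reshuffle the pooled values and count how often the reshuffled
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical measurements, invented purely for illustration.
treated = [5.1, 6.3, 5.8, 6.9, 6.1, 5.7, 6.4, 6.0]
control = [4.9, 5.2, 5.0, 5.6, 4.7, 5.3, 5.1, 4.8]

p = permutation_p_value(treated, control)
print(f"estimated p-value: {p:.4f}")
```

If the estimated p-value falls below the chosen threshold (say 0.05), the reshuffled "no effect" world rarely produces a difference as large as the observed one, which is the situation in which the null hypothesis is rejected.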

Lack of directionality

The null hypothesis does not have direction. That is, if we formulate a one-tailed alternative hypothesis that application of Drug A will lead to increased growth in patients, the null hypothesis remains that application of Drug A will have no effect on growth in patients. It is not merely the opposite of the alternative hypothesis — that is, it is not that application of Drug A will not lead to increased growth in patients.

To explain why this should be so, it is instructive to consider the nature of the hypotheses outlined above. We are predicting that patients exposed to Drug A will see increased growth compared to a control group who do not receive the drug. That is,

H1: μdrug > μcontrol


where:

μ = the patients' mean growth.


The null hypothesis is H0: μdrug = μcontrol

It is not H0incorrect: μdrug <= μcontrol

This is because, in order to gauge support for the alternative hypothesis, classical hypothesis testing requires us to calculate how often we would have obtained results as extreme as our experimental observations. In order to do this, we need to be able to model the sampling distribution characteristic of the null hypothesis. We are unable to create this model if the null hypothesis is imprecise: as Fisher, who first coined the term "null hypothesis", said, "the null hypothesis must be exact, that is free of vagueness and ambiguity, because it must supply the basis of the 'problem of distribution,' of which the test of significance is the solution."[1]

Thus the null hypothesis must be numerically exact — it must state that a particular quantity or difference is equal to a particular number. In classical science, it is most typically the statement that there is no effect of a particular treatment; in observations, it is typically that there is no difference between the value of a particular measured variable and that of a prediction.

(However, many statisticians believe that it is valid to state direction as part of the null hypothesis (for example, see http://davidmlane.com/hyperstat/A73079.html). The logic is quite simple: if the direction is omitted and the null hypothesis is not rejected, the conclusion is confusing to interpret. Say the null hypothesis is that the population mean = 10, and the one-tailed alternative is that the mean > 10. If the sample mean (x-bar) equals -200 and the corresponding t-test statistic equals -50, what is the conclusion? Not enough evidence to reject the null hypothesis? Surely not! But we cannot accept the one-sided alternative in this case either. Therefore, to overcome this ambiguity, it is better to include the direction of the effect in the null hypothesis if the test is one-sided.)
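The numerical example in the preceding paragraph can be checked with a short sketch. It assumes a normal (z) approximation in place of the t distribution, and the sample standard deviation (42) and sample size (100) are hypothetical values chosen only so that the statistic comes out at -50; neither figure appears in the original text.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p_value(sample_mean, null_mean, sample_sd, n):
    """Return (z, p) for the one-sided alternative 'mean > null_mean'.

    Uses a normal approximation; for small samples a t distribution
    would be the usual choice.
    """
    standard_error = sample_sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    p = 1.0 - NormalDist().cdf(z)  # probability of a statistic at least this large
    return z, p


# Null: mean = 10; alternative: mean > 10; observed sample mean = -200.
# sample_sd and n are assumptions made for this illustration.
z, p = upper_tail_p_value(sample_mean=-200, null_mean=10, sample_sd=42, n=100)
print(f"z = {z:.1f}, upper-tail p-value = {p:.3f}")
```

The upper-tail p-value is essentially 1, so a test of "mean = 10" against "mean > 10" does not reject the null hypothesis even though the data lie far from 10, which is the awkward conclusion the paragraph describes.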

Limitations

A null hypothesis is only useful if it is possible to calculate the probability of observing a data set with particular parameters from it. In general it is much harder to be precise about how probable the data would be if the alternative hypothesis were true.

If experimental observations contradict the prediction of the null hypothesis, then either the null hypothesis is false or a very improbable event has occurred. This gives us high confidence in the falsehood of the null hypothesis, and that confidence can be increased by conducting more trials. However, accepting the alternative hypothesis only commits us to a difference in observed parameters; it does not prove that the theory or principles that predicted such a difference are true, since it is always possible that the difference is due to additional factors not recognized by the theory.

For example, rejecting a null hypothesis that predicts that the rates of symptom relief in a sample of patients who received a placebo and a sample who received a medicinal drug will be equal allows us to make a non-null statement (that the rates differed); it does not prove that the drug relieved the symptoms, though it gives us more confidence in that hypothesis.

The formulation, testing, and rejection of null hypotheses is methodologically consistent with the falsifiability model of scientific discovery formulated by Karl Popper and widely believed to apply to most kinds of empirical research. However, concerns regarding the high power of statistical tests to detect differences in large samples have led to suggestions for re-defining the null hypothesis, for example as a hypothesis that an effect falls within a range considered negligible. This is an attempt to address the confusion among non-statisticians between significant and substantial, since large enough samples are likely to be able to indicate differences however minor.
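The point about large samples can be illustrated with a small calculation. The sketch below assumes a two-sample z-test with a known, equal standard deviation in both groups; the function name two_sided_p_value and the numbers (a true difference of 0.01 standard deviations) are hypothetical and chosen only to show the trend.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(mean_difference, sd, n_per_group):
    """Two-sided p-value for H0: no difference in means (z-test, known sd)."""
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = mean_difference / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# The same negligible observed difference, tested at increasing sample sizes.
for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(mean_difference=0.01, sd=1.0, n_per_group=n)
    print(f"n per group = {n:>9,}: p = {p:.3g}")
```

The observed difference is identical in every row; only the sample size changes. With enough observations even this negligible difference becomes statistically significant, which is why significance should not be confused with substantive importance.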

The theory underlying the idea of a null hypothesis is closely associated with the frequency theory of probability, in which probabilistic statements can only be made about the relative frequencies of events in arbitrarily large samples. A failure to reject the null hypothesis is meaningful only in relation to an arbitrarily large population from which the observed sample is supposed to be drawn.

Publication bias

In 2002, a group of psychologists launched a new journal dedicated to experimental studies in psychology that support the null hypothesis. The Journal of Articles in Support of the Null Hypothesis (JASNH) was founded to address a scientific publishing bias against such articles. According to the editors,

"other journals and reviewers have exhibited a bias against articles that did not reject the null hypothesis. We plan to change that by offering an outlet for experiments that do not reach the traditional significance levels (p < 0.05). Thus, reducing the file drawer problem, and reducing the bias in psychological literature. Without such a resource researchers could be wasting their time examining empirical questions that have already been examined. We collect these articles and provide them to the scientific community free of cost."


The "file drawer problem" exists because academics tend not to publish results that indicate the null hypothesis could not be rejected. That is, they obtained a statistically non-significant result that failed to find the relationship they were looking for. Even though these papers can often be interesting, they tend to end up unpublished, in "file drawers."

Ioannidis has inventoried factors that should alert readers to risks of publication bias [2].

Controversy

Null hypothesis testing is controversial when the alternative hypothesis is suspected to be true at the outset of the experiment, making the null hypothesis the reverse of what the experimenter actually believes; it is put forward only to allow the data to contradict it. Many statisticians have pointed out that rejecting the null hypothesis says nothing or very little about the likelihood that the null is true. Under traditional null hypothesis testing, the null is rejected when P(Data | Null) (where P(x|y) denotes the probability of x given y) is very small, say below 0.05. However, researchers are really interested in P(Null | Data), which cannot be inferred from a p-value. In some cases, P(Null | Data) approaches 1 while P(Data | Null) approaches 0; in other words, we can reject the null when it is virtually certain to be true. For this and other reasons, Gerd Gigerenzer has called null hypothesis testing "mindless statistics", while Jacob Cohen describes it as a ritual conducted to convince ourselves that we have the evidence needed to confirm our theories.
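The gap between P(Data | Null) and P(Null | Data) follows directly from Bayes' theorem, and a toy calculation makes it concrete. The sketch below treats P(Data | Null) as a single number, as the paragraph above does, and considers only two competing hypotheses; the function name posterior_null and all the probabilities are hypothetical values chosen for illustration.

```python
def posterior_null(prior_null, p_data_given_null, p_data_given_alt):
    """P(Null | Data) from Bayes' theorem with two competing hypotheses."""
    prior_alt = 1.0 - prior_null
    evidence = prior_null * p_data_given_null + prior_alt * p_data_given_alt
    return prior_null * p_data_given_null / evidence


# Hypothetical inputs: the data are unlikely under the null (0.04, below the
# usual 0.05 threshold) but far less likely still under the alternative.
result = posterior_null(prior_null=0.5,
                        p_data_given_null=0.04,
                        p_data_given_alt=0.001)
print(f"P(Null | Data) = {result:.3f}")
```

With these invented inputs a significance test would reject the null hypothesis, yet the null remains by far the more probable of the two hypotheses, because the data are even harder to explain under the alternative.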

Bayesian statisticians normally reject the idea of null hypothesis testing. Given a prior probability distribution for one or more parameters, sample evidence can be used to generate an updated posterior distribution. In this framework, but not in the null hypothesis testing framework, it is meaningful to make statements of the general form "the probability that the true value of the parameter is greater than 0 is p".
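A minimal sketch of such an update, assuming a normal prior on the parameter and normally distributed observations with known standard deviation (the conjugate normal-normal model). The function name posterior_prob_positive and every numerical value are assumptions made for this illustration; real analyses usually call for more careful modelling.

```python
from math import sqrt
from statistics import NormalDist


def posterior_prob_positive(prior_mean, prior_sd, sample_mean, sigma, n):
    """Posterior probability that the parameter exceeds 0.

    Assumes a Normal(prior_mean, prior_sd) prior and n independent
    observations from Normal(parameter, sigma) with sigma known.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    # Posterior mean is the precision-weighted average of prior and data.
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    posterior = NormalDist(post_mean, sqrt(post_var))
    return 1.0 - posterior.cdf(0.0)


# Hypothetical prior centred on "no effect" and hypothetical data.
prob = posterior_prob_positive(prior_mean=0.0, prior_sd=1.0,
                               sample_mean=0.5, sigma=1.0, n=16)
print(f"P(parameter > 0 | data) = {prob:.3f}")
```

The output is a direct probability statement about the parameter itself, which is the kind of statement the paragraph above says is meaningful in the Bayesian framework but not in null hypothesis testing.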

References

HyperStat Online
1. Fisher, R.A. (1966). The design of experiments. 8th edition. Hafner:Edinburgh.
2. Ioannidis J (2005). "Why most published research findings are false". PLoS Med 2 (8): e124. DOI:10.1371/journal.pmed.0020124. PMID 16060722.

This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.