Information about Chi Square Distribution
This article is about the mathematics of the chi-square distribution. For its uses in statistics, see chi-square test.
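| Parameters | $k > 0$ degrees of freedom |
|---|---|
| Support | $x \in [0, +\infty)$ |
| Probability density function (pdf) | $\frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} e^{-x/2}$ |
| Cumulative distribution function (cdf) | $\frac{\gamma(k/2,\, x/2)}{\Gamma(k/2)}$ |
| Mean | $k$ |
| Median | approximately $k - 2/3$ |
| Mode | $k - 2$ if $k \ge 2$ |
| Variance | $2k$ |
| Skewness | $\sqrt{8/k}$ |
| Excess kurtosis | $12/k$ |
| Entropy | $\frac{k}{2} + \ln\left(2\,\Gamma(k/2)\right) + \left(1 - \frac{k}{2}\right)\psi(k/2)$ |
| Moment-generating function (mgf) | $(1 - 2t)^{-k/2}$ for $2t < 1$ |
| Characteristic function | $(1 - 2it)^{-k/2}$ |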
In probability theory and statistics, the chi-square distribution (also chi-squared or $\chi^2$ distribution) is one of the most widely used theoretical probability distributions in inferential statistics, i.e. in statistical significance tests. It is useful because, under reasonable assumptions, easily calculated quantities can be proven to have distributions that approximate the chi-square distribution if the null hypothesis is true.
If $X_1, \ldots, X_k$ are $k$ independent, normally distributed random variables with mean 0 and variance 1, then the random variable
$$Q = \sum_{i=1}^{k} X_i^2$$
is distributed according to the chi-square distribution. This is usually written
$$Q \sim \chi^2_k.$$
The chi-square distribution has one parameter:
- $k$, a positive integer that specifies the number of degrees of freedom (i.e. the number of $X_i$)
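The definition can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function names `chi_square_sample` and `sample_mean_and_variance` are hypothetical helpers introduced for illustration. It draws sums of $k$ squared standard normal variates and compares the sample mean and variance with the theoretical values $k$ and $2k$.

```python
import random


def chi_square_sample(k, rng):
    """Draw one chi-square(k) variate as a sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def sample_mean_and_variance(k, n_draws=50_000, seed=0):
    """Return the sample mean and unbiased sample variance of n_draws variates."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = [chi_square_sample(k, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((d - mean) ** 2 for d in draws) / (n_draws - 1)
    return mean, var


if __name__ == "__main__":
    for k in (1, 3, 10):
        m, v = sample_mean_and_variance(k)
        print(f"k={k}: mean {m:.3f} (theory {k}), variance {v:.3f} (theory {2 * k})")
```

The estimates should land close to $k$ and $2k$, with the usual Monte Carlo error that shrinks as `n_draws` grows.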
The chi-square distribution is a special case of the gamma distribution.
The best-known situations in which the chi-square distribution is used are the common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and of the independence of two criteria of classification of qualitative data. However, many other statistical tests lead to a use of this distribution. One example is Friedman's analysis of variance by ranks.
Characteristics
Probability density function
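The probability density function of the chi-square distribution is
$$f(x; k) = \begin{cases} \dfrac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} e^{-x/2} & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}$$
where $\Gamma$ denotes the Gamma function, which has closed-form values at the half-integers.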
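As an illustration, the density $f(x; k) = x^{k/2-1} e^{-x/2} / (2^{k/2}\,\Gamma(k/2))$ for $x > 0$ can be evaluated with the standard library alone. This is a sketch, not a reference implementation; `chi2_pdf` is a hypothetical name, and the computation is done in log space (via `math.lgamma`) only to avoid overflow for large $k$.

```python
import math


def chi2_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if k <= 0:
        raise ValueError("k must be positive")
    if x <= 0:
        return 0.0  # the density is taken to be zero outside x > 0
    half_k = k / 2.0
    log_density = (
        (half_k - 1.0) * math.log(x)
        - x / 2.0
        - half_k * math.log(2.0)
        - math.lgamma(half_k)  # log of Gamma(k/2)
    )
    return math.exp(log_density)
```

For $k = 2$ the expression reduces to $\tfrac{1}{2} e^{-x/2}$, which gives a quick sanity check.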
Cumulative distribution function
Its cumulative distribution function is
$$F(x; k) = \frac{\gamma(k/2,\, x/2)}{\Gamma(k/2)} = P(k/2,\, x/2),$$
where $\gamma(s, z)$ is the lower incomplete Gamma function and $P(s, z)$ is the regularized Gamma function.
Tables of this distribution (usually in its cumulative form) are widely available, and the function is included in many spreadsheets and statistical packages.
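Where no statistical package is at hand, the cumulative distribution function $F(x; k) = P(k/2, x/2)$ can be approximated directly. The sketch below assumes the standard power series $\gamma(a, z) = z^a e^{-z} \sum_{n \ge 0} z^n / \bigl(a (a+1) \cdots (a+n)\bigr)$ for the lower incomplete Gamma function; the names `regularized_lower_gamma` and `chi2_cdf` are hypothetical. The series converges for every $z$, but it is only practical for moderate arguments.

```python
import math


def regularized_lower_gamma(a, z, tol=1e-15, max_terms=10_000):
    """Regularized lower incomplete Gamma function P(a, z) by power series."""
    if z <= 0:
        return 0.0
    term = 1.0 / a  # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            break
    # Multiply by z^a * exp(-z) / Gamma(a), evaluated in log space.
    value = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return min(1.0, value)


def chi2_cdf(x, k):
    """CDF of the chi-square distribution with k degrees of freedom."""
    return regularized_lower_gamma(k / 2.0, x / 2.0)
```

Two closed forms are useful for checking: for $k = 2$ the CDF is $1 - e^{-x/2}$, and for $k = 1$ it is $\operatorname{erf}\bigl(\sqrt{x/2}\bigr)$.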
Characteristic function
The characteristic function of the chi-square distribution is
$$\varphi(t; k) = (1 - 2it)^{-k/2}.$$
Properties
The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-squared random variables divided by their respective degrees of freedom.
Normal approximation
If $X \sim \chi^2_k$, then as $k$ tends to infinity, the distribution of $X$ tends to normality.
However, the tendency is slow (the skewness is $\sqrt{8/k}$ and the kurtosis excess is $12/k$) and two transformations are commonly considered, each of which approaches normality faster than $X$ itself:
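Fisher empirically showed that $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k - 1}$ and unit variance. It is possible to arrive at the same normal approximation result by using moment matching. To see this, consider the mean and the variance of a chi-distributed random variable $z = \sqrt{X}$, which are given by $\mu_z = \sqrt{2}\, \frac{\Gamma(k/2 + 1/2)}{\Gamma(k/2)}$ and $\sigma_z^2 = k - \mu_z^2$, where $\Gamma$ is the Gamma function. The particular ratio of the Gamma functions in $\mu_z$ has the following series expansion [1]:
$$\frac{\Gamma(N + 1/2)}{\Gamma(N)} = \sqrt{N} \left( 1 - \frac{1}{8N} + \frac{1}{128 N^2} + \cdots \right).$$
When $N \gg 1$, this ratio can be approximated as follows:
$$\frac{\Gamma(N + 1/2)}{\Gamma(N)} \approx \sqrt{N} \left( 1 - \frac{1}{8N} \right) \approx \sqrt{N} \left( 1 - \frac{1}{4N} \right)^{1/2} = \sqrt{N - 1/4}.$$
Then, with $N = k/2$, simple moment matching results in the following approximation of $z$:
$$z \sim N\left( \sqrt{k - 1/2},\ \tfrac{1}{2} \right),$$
from which it follows that $\sqrt{2X} \sim N\left( \sqrt{2k - 1},\ 1 \right)$.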
Wilson and Hilferty showed in 1931 that $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1 - \frac{2}{9k}$ and variance $\frac{2}{9k}$.
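The two transformations can be compared numerically. The sketch below takes Fisher's approximation as $P(X \le x) \approx \Phi\bigl(\sqrt{2x} - \sqrt{2k - 1}\bigr)$ and the Wilson and Hilferty approximation as $P(X \le x) \approx \Phi\bigl(((x/k)^{1/3} - (1 - \tfrac{2}{9k})) / \sqrt{2/(9k)}\bigr)$. For the exact value it assumes an even $k$, where the CDF has the finite closed form $1 - e^{-x/2} \sum_{j=0}^{k/2-1} (x/2)^j / j!$. All function names are hypothetical, and $x \ge 0$ is assumed throughout.

```python
import math
from statistics import NormalDist


def chi2_cdf_even(x, k):
    """Exact chi-square CDF for even k, using the finite-sum closed form."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form needs a positive even k")
    if x <= 0:
        return 0.0
    half_x = x / 2.0
    term, total = 1.0, 1.0  # j = 0 term of the sum
    for j in range(1, k // 2):
        term *= half_x / j  # (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half_x) * total


def fisher_cdf(x, k):
    """Fisher: sqrt(2X) is roughly normal with mean sqrt(2k - 1), variance 1."""
    return NormalDist().cdf(math.sqrt(2.0 * x) - math.sqrt(2.0 * k - 1.0))


def wilson_hilferty_cdf(x, k):
    """Wilson-Hilferty: (X/k)^(1/3) is roughly normal, mean 1 - 2/(9k), variance 2/(9k)."""
    mean = 1.0 - 2.0 / (9.0 * k)
    std = math.sqrt(2.0 / (9.0 * k))
    return NormalDist().cdf(((x / k) ** (1.0 / 3.0) - mean) / std)


if __name__ == "__main__":
    k, x = 10, 18.307  # x is close to the 95th percentile for k = 10
    print("exact          ", chi2_cdf_even(x, k))
    print("Fisher         ", fisher_cdf(x, k))
    print("Wilson-Hilferty", wilson_hilferty_cdf(x, k))
```

At this point the cube-root transformation is noticeably closer to the exact value than the square-root one, which is the usual reason it is preferred for small $k$.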
The expected value of a random variable having chi-square distribution with $k$ degrees of freedom is $k$ and the variance is $2k$.
The median is given approximately by
$$k - \frac{2}{3} + \frac{4}{27k} - \frac{8}{729k^2}.$$
Note that 2 degrees of freedom lead to an exponential distribution.
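The median has no simple closed form in general, but it can be located numerically and compared with the approximation $k - \tfrac{2}{3} + \tfrac{4}{27k} - \tfrac{8}{729k^2}$, which is what the Wilson and Hilferty cube-root transformation yields. The sketch below is restricted to even $k$, where the CDF is a finite sum; the names `chi2_cdf_even`, `chi2_median_even` and `median_approx` are hypothetical. For $k = 2$ the distribution is exponential with rate $1/2$, so the exact median is $2 \ln 2$.

```python
import math


def chi2_cdf_even(x, k):
    """Exact chi-square CDF for positive even k: 1 - exp(-x/2) * sum (x/2)^j / j!."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form needs a positive even k")
    if x <= 0:
        return 0.0
    half_x = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half_x / j
        total += term
    return 1.0 - math.exp(-half_x) * total


def chi2_median_even(k, iterations=100):
    """Median by bisection on [0, k]; the median lies below the mean k."""
    lo, hi = 0.0, float(k)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, k) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def median_approx(k):
    """Approximate median: k - 2/3 + 4/(27k) - 8/(729 k^2)."""
    return k - 2.0 / 3.0 + 4.0 / (27.0 * k) - 8.0 / (729.0 * k * k)


if __name__ == "__main__":
    for k in (2, 4, 10, 50):
        print(k, chi2_median_even(k), median_approx(k))
```

The approximation improves as $k$ grows; for small $k$ the gap is visible in the second decimal place.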
Information entropy
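The information entropy is given by
$$H = -\int_{-\infty}^{\infty} f(x; k) \ln f(x; k)\, dx = \frac{k}{2} + \ln\left( 2\,\Gamma(k/2) \right) + \left( 1 - \frac{k}{2} \right) \psi(k/2),$$
where $\psi(x)$ is the Digamma function.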
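The entropy expression $H = \tfrac{k}{2} + \ln(2\,\Gamma(k/2)) + (1 - \tfrac{k}{2})\,\psi(k/2)$ can be evaluated with the standard library if a Digamma routine is supplied. The sketch below assumes a common numerical recipe for $\psi$: shift the argument upward with the recurrence $\psi(x) = \psi(x + 1) - 1/x$, then apply the asymptotic expansion $\ln x - \tfrac{1}{2x} - \tfrac{1}{12x^2} + \tfrac{1}{120x^4} - \tfrac{1}{252x^6}$. The names `digamma` and `chi2_entropy` are hypothetical, and the result is in nats.

```python
import math


def digamma(x):
    """Digamma function for x > 0 via recurrence plus asymptotic expansion."""
    if x <= 0:
        raise ValueError("x must be positive in this sketch")
    result = 0.0
    while x < 6.0:  # shift upward until the expansion is accurate
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def chi2_entropy(k):
    """Differential entropy (in nats) of the chi-square distribution."""
    half_k = k / 2.0
    return (
        half_k
        + math.log(2.0)
        + math.lgamma(half_k)  # ln(2 * Gamma(k/2)) split into two logs
        + (1.0 - half_k) * digamma(half_k)
    )
```

For $k = 2$ the Digamma term vanishes and the entropy is $1 + \ln 2$, matching the entropy of an exponential distribution with rate $1/2$.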
Related distributions
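- $X \sim \mathrm{Exponential}(\lambda = \tfrac{1}{2})$ is an exponential distribution if $X \sim \chi^2_2$ (with 2 degrees of freedom).
- $Y \sim \chi^2_k$ is a chi-square distribution if $Y = \sum_{m=1}^{k} X_m^2$ for independent $X_m \sim N(0, 1)$ that are normally distributed.
- If the $X_m \sim N(\mu_m, 1)$ have nonzero means, then $Y = \sum_{m=1}^{k} X_m^2$ is drawn from a noncentral chi-square distribution.
- The chi-square distribution $X \sim \chi^2_\nu$ is a special case of the gamma distribution, in that $X \sim \Gamma(\nu/2,\ \theta = 2)$.
- $Y \sim F(\nu_1, \nu_2)$ is an F-distribution if $Y = \frac{X_1 / \nu_1}{X_2 / \nu_2}$ where $X_1 \sim \chi^2_{\nu_1}$ and $X_2 \sim \chi^2_{\nu_2}$ are independent with their respective degrees of freedom.
- $Y \sim \chi^2_{\bar{\nu}}$ is a chi-square distribution if $Y = \sum_{m=1}^{N} X_m$ where $X_m \sim \chi^2_{\nu_m}$ are independent and $\bar{\nu} = \sum_{m=1}^{N} \nu_m$.
- if $X$ is chi-square distributed, then $\sqrt{X}$ is chi distributed.
- in particular, if $X \sim \chi^2_2$ (chi-square with 2 degrees of freedom), then $\sqrt{X}$ is Rayleigh distributed.
- if $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \sigma^2)$ random variables, then $\sum_{i=1}^{n} (X_i - \bar{X})^2 \sim \sigma^2 \chi^2_{n-1}$ where $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$.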
- if $X \sim \mathrm{SkewLogistic}(\tfrac{1}{2})$, then $\log(1 + e^{-X}) \sim \chi^2_2$.
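One of these relationships is easy to probe by simulation: for i.i.d. $N(\mu, \sigma^2)$ observations $X_1, \ldots, X_n$, the scaled sum of squared deviations $\sum_i (X_i - \bar{X})^2 / \sigma^2$ follows a chi-square distribution with $n - 1$ degrees of freedom, so its mean should be near $n - 1$ and its variance near $2(n - 1)$. The sketch below uses only the standard library; the function names and the particular values of $n$, $\mu$ and $\sigma$ are arbitrary choices made for illustration.

```python
import random


def scaled_sum_of_squares(n, mu, sigma, rng):
    """Return sum((x - x_bar)^2) / sigma^2 for one sample of n normal draws."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / sigma ** 2


def simulate(n=5, mu=3.0, sigma=2.0, trials=40_000, seed=1):
    """Sample mean and variance of the statistic over many repeated samples."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    values = [scaled_sum_of_squares(n, mu, sigma, rng) for _ in range(trials)]
    mean = sum(values) / trials
    var = sum((v - mean) ** 2 for v in values) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = simulate()
    print(f"mean {mean:.3f} (theory 4), variance {var:.3f} (theory 8)")
```

The loss of one degree of freedom comes from estimating the mean with $\bar{X}$; replacing $\bar{X}$ by the true $\mu$ in the code would move the mean of the statistic to $n$.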
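| Name | Statistic (for independent $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \ldots, k$) |
|---|---|
| chi-square distribution | $\sum_{i=1}^{k} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2$ |
| noncentral chi-square distribution | $\sum_{i=1}^{k} \left( \frac{X_i}{\sigma_i} \right)^2$ |
| chi distribution | $\sqrt{\sum_{i=1}^{k} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2}$ |
| noncentral chi distribution | $\sqrt{\sum_{i=1}^{k} \left( \frac{X_i}{\sigma_i} \right)^2}$ |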
See also
- Cochran's theorem
- Inverse-chi-square distribution
- Degrees of freedom (statistics)
- Fisher's method for combining independent tests of significance
- Noncentral chi-square distribution
External links
- On-line calculator for the significance of chi-square, in Richard Lowry's statistical website at Vassar College.
- Distribution Calculator Calculates probabilities and critical values for normal, t-, chi2- and F-distribution
- Chi-Square Calculator for critical values of Chi-Square in R. Webster West's applet website at University of South Carolina
- Chi-Square Calculator from GraphPad
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
