Information about Marginal Likelihood

In Bayesian probability theory, a marginal likelihood function is a likelihood function integrated over some variables, typically model parameters. "Integrated likelihood" is a synonym for marginal likelihood. "Evidence" is also sometimes used as a synonym, but this usage is somewhat idiosyncratic. "Marginal likelihood" is the most commonly used of these three terms.

For any likelihood function of two or more variables, marginal likelihoods with respect to any subset of the variables can be defined. Let a denote the subset of variables marginalized (i.e., integrated out). Let b denote the other variables. Let x denote observed data. Given the likelihood function p(x|a, b), the marginal likelihood of b is

$$p(x \mid b) = \int p(x \mid a, b)\, p(a \mid b)\, da$$

where p(a|b) is the distribution of a conditional on b. The marginal likelihood of a is computed in an analogous way, by exchanging the roles of a and b.
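
The marginalization over a can be checked numerically in a simple case. The sketch below is illustrative only and assumes a toy model that is not part of the article: x given a is Normal with mean a and variance 1, and a given b is Normal with mean b and variance 1. Under that assumption the marginal likelihood of b has the closed form Normal(x; b, 2), which the trapezoidal approximation of the integral should reproduce. The function names and the integration limits are hypothetical choices.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def marginal_likelihood_b(x, b, lo=-20.0, hi=20.0, n=4000):
    """Approximate p(x|b) = integral of p(x|a,b) p(a|b) da (trapezoidal rule).

    Assumed toy model: p(x|a,b) = Normal(x; a, 1), p(a|b) = Normal(a; b, 1).
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        a = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # half weight at the endpoints
        total += weight * normal_pdf(x, a, 1.0) * normal_pdf(a, b, 1.0)
    return total * h


x, b = 1.3, 0.5
numeric = marginal_likelihood_b(x, b)
exact = normal_pdf(x, b, 2.0)  # closed form for this toy model
print(numeric, exact)
```

The two printed values should agree closely; a has been integrated out, so the result depends only on x and b.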

In a widely used application, the marginalized variables are parameters for a particular type of model, and the remaining variable is the identity of the model itself. In this case, the marginal likelihood is the probability of the data given the model type, not assuming any particular model parameters. Writing θ for the model parameters, the marginal likelihood for the model M is

$$p(x \mid M) = \int p(x \mid \theta, M)\, p(\theta \mid M)\, d\theta$$

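This quantity is important because the posterior odds ratio for a model M1 against another model M2 involves a ratio of marginal likelihoods, the so-called Bayes factor:

$$\frac{p(M_1 \mid x)}{p(M_2 \mid x)} = \frac{p(M_1)}{p(M_2)} \cdot \frac{p(x \mid M_1)}{p(x \mid M_2)}$$
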
which can be stated schematically as

$$\text{posterior odds} = \text{prior odds} \times \text{Bayes factor}$$

The Bayes factor is an object of central importance in Bayesian model comparison.
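
As a small worked illustration of a Bayes factor, the sketch below compares two hypothetical models for a sequence of coin flips; neither model comes from the article. M1 assumes a fair coin (θ fixed at 0.5, so there is nothing to integrate), while M2 places a uniform prior on θ, in which case the marginal likelihood of a specific sequence with k heads in n flips is the Beta function B(k + 1, n - k + 1). Equal prior model probabilities are also an assumption.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def evidence_fair(n, k):
    """p(x|M1): theta is fixed at 0.5, so the likelihood is the evidence."""
    return 0.5 ** n


def evidence_uniform(n, k):
    """p(x|M2): integral of theta^k (1-theta)^(n-k) over a Uniform(0,1) prior."""
    return math.exp(log_beta(k + 1, n - k + 1))


n, k = 10, 8  # hypothetical data: 8 heads in 10 flips
bayes_factor_12 = evidence_fair(n, k) / evidence_uniform(n, k)
prior_odds_12 = 1.0  # assumption: both models equally probable a priori
posterior_odds_12 = prior_odds_12 * bayes_factor_12
print(bayes_factor_12, posterior_odds_12)
```

For these data the Bayes factor for M1 against M2 is about 0.48, so the data mildly favor the model with a free θ. With equal prior odds, the posterior odds equal the Bayes factor.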

Unfortunately, marginal likelihoods are generally difficult to compute. Exact solutions are known only for a small class of distributions. In general, some kind of numerical integration method is needed: either a general method such as Gaussian quadrature or a Monte Carlo method, or a method specialized to statistical problems such as the Laplace approximation or Gibbs sampling.
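
One of the simplest Monte Carlo approaches is to draw parameter values from the prior and average the likelihood over the draws. The sketch below is a minimal illustration under assumed conditions, not a recommended general method: the model is a hypothetical coin-flip model with a Uniform(0, 1) prior on θ, chosen because its exact marginal likelihood is known and can serve as a check. The function names, sample size, and seed are arbitrary.

```python
import math
import random


def likelihood(theta, n, k):
    """Likelihood of a specific sequence with k heads in n flips."""
    return theta ** k * (1.0 - theta) ** (n - k)


def mc_evidence(n, k, num_samples=100_000, seed=0):
    """Estimate p(x|M) by averaging the likelihood over draws from the prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        theta = rng.random()  # draw from the assumed Uniform(0, 1) prior
        total += likelihood(theta, n, k)
    return total / num_samples


n, k = 10, 8  # hypothetical data: 8 heads in 10 flips
estimate = mc_evidence(n, k)
# Exact value for this model: B(k + 1, n - k + 1)
exact = math.exp(math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2))
print(estimate, exact)
```

This estimator is unbiased, but its variance can become large when the likelihood is concentrated in a small region of the prior, which is one reason more specialized methods are often used in practice.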

This article is copied from an article on Wikipedia.org, the free encyclopedia created and edited by an online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of Wikipedia articles provide accurate and timely information, please do not assume the accuracy of any particular article. This article is distributed under the terms of the GNU Free Documentation License.