Admissible decision rule
In classical (frequentist) decision theory, an admissible decision rule is a rule for making a decision that is not dominated by any other rule: no competing rule performs at least as well in every state of nature and strictly better in some, in a sense made precise below. In most decision problems the set of admissible rules is large, even infinite, so admissibility alone does not single out one rule, but as will be seen there are good reasons to favor admissible rules.
Definition
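Define sets $\Theta$, $\mathcal{X}$ and $\mathcal{A}$, where $\Theta$ are the states of nature, $\mathcal{X}$ the possible observations and $\mathcal{A}$ the actions that may be taken. A decision rule is a function $\delta : \mathcal{X} \to \mathcal{A}$, i.e., upon observing $x \in \mathcal{X}$, we choose to take action $\delta(x) \in \mathcal{A}$.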
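In addition, we define a loss function $L : \Theta \times \mathcal{A} \to \mathbb{R}$, which measures the loss we incur by taking action $a \in \mathcal{A}$ when the true state of nature is $\theta \in \Theta$. Usually we will take this action after observing data $x \in \mathcal{X}$, so that the loss will be $L(\theta, \delta(x))$.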
It is possible to recast the theory in terms of a utility function, the negative of the loss. However, admissibility is usually defined in terms of a loss function, and we shall follow this convention.
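Let $x$ have cumulative distribution function $F(x \mid \theta)$. Define the risk function as the expectation
$$R(\theta, \delta) = \operatorname{E}_{F(x \mid \theta)}[L(\theta, \delta(x))].$$
A decision rule $\delta^*$ dominates a decision rule $\delta$ if and only if $R(\theta, \delta^*) \le R(\theta, \delta)$ for all $\theta$, and the inequality is strict for some $\theta$.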
A decision rule is admissible if and only if no other rule dominates it; otherwise it is inadmissible. An inadmissible rule should not be preferred, since by definition some other rule performs at least as well for all states of nature and strictly better for some. Admissibility by itself, however, does not guarantee that a rule is a good one to use.
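The definitions of risk, dominance and admissibility can be checked by brute force on a small finite problem. The following is a minimal sketch in Python, not a general implementation: the two states, two observations, two actions, the sampling probabilities and the 0-1 loss are all illustrative assumptions, and only deterministic rules are compared.

```python
from itertools import product

# Hypothetical toy problem; every name and number here is an assumption.
states = ["theta0", "theta1"]
observations = [0, 1]
actions = ["a0", "a1"]

# P(x | theta): distribution of the observation under each state of nature.
likelihood = {
    "theta0": {0: 0.8, 1: 0.2},
    "theta1": {0: 0.3, 1: 0.7},
}

# 0-1 loss: no loss for the action matching the state, unit loss otherwise.
correct_action = {"theta0": "a0", "theta1": "a1"}

def loss(theta, action):
    return 0.0 if correct_action[theta] == action else 1.0

def risk(theta, rule):
    """Expected loss of `rule` (a dict x -> action) when the state is theta."""
    return sum(likelihood[theta][x] * loss(theta, rule[x]) for x in observations)

def dominates(rule_a, rule_b):
    """True if rule_a is never worse than rule_b and strictly better somewhere."""
    ra = [risk(t, rule_a) for t in states]
    rb = [risk(t, rule_b) for t in states]
    return all(a <= b for a, b in zip(ra, rb)) and any(a < b for a, b in zip(ra, rb))

# Enumerate every deterministic rule: one action per possible observation.
all_rules = [dict(zip(observations, choice))
             for choice in product(actions, repeat=len(observations))]

# A rule is admissible (among these rules) if no other rule dominates it.
admissible = [r for r in all_rules
              if not any(dominates(other, r) for other in all_rules)]

for r in all_rules:
    status = "admissible" if r in admissible else "inadmissible"
    print(r, [risk(t, r) for t in states], status)
```

In this toy problem the rule that always acts against the evidence is dominated by the rule that follows it, while the two constant rules remain admissible even though each has risk 1 in one state. This illustrates why admissibility is a minimal requirement and not a guarantee of good performance.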
Bayes rules
Let $\pi(\theta)$ be a probability distribution on the states of nature. From a Bayesian point of view, we would regard it as a prior distribution, that is, our believed probability distribution on the states of nature before observing data. For a frequentist, it is merely a function on $\Theta$ with no such special interpretation. The Bayes risk of the decision rule $\delta$ with respect to $\pi(\theta)$ is the expectation
$$r(\pi, \delta) = \operatorname{E}_{\pi(\theta)}[R(\theta, \delta)].$$
If the Bayes risk is finite, we can minimize $r(\pi, \delta)$ with respect to $\delta$ to obtain $\delta^\pi(x)$, a Bayes rule with respect to $\pi(\theta)$. There may be more than one Bayes rule. If the Bayes risk is infinite for every decision rule, then no Bayes rule is defined.
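For a finite problem, the Bayes risk is a prior-weighted average of the risk function, and a Bayes rule can be found by direct search. The sketch below assumes a hypothetical two-state problem with 0-1 loss and a uniform prior; all names and numbers are illustrative, and the search covers deterministic rules only.

```python
from itertools import product

# Hypothetical toy problem; every name and number here is an assumption.
states = ["theta0", "theta1"]
observations = [0, 1]
actions = ["a0", "a1"]
likelihood = {
    "theta0": {0: 0.8, 1: 0.2},
    "theta1": {0: 0.3, 1: 0.7},
}
prior = {"theta0": 0.5, "theta1": 0.5}  # assumed proper prior on the states
correct_action = {"theta0": "a0", "theta1": "a1"}

def loss(theta, action):
    """0-1 loss: zero for the matching action, one otherwise."""
    return 0.0 if correct_action[theta] == action else 1.0

def risk(theta, rule):
    """Frequentist risk: average the loss over observations for a fixed state."""
    return sum(likelihood[theta][x] * loss(theta, rule[x]) for x in observations)

def bayes_risk(rule, prior):
    """Bayes risk: average the risk function over the prior on the states."""
    return sum(prior[t] * risk(t, rule) for t in states)

# Enumerate every deterministic rule and keep one with the smallest Bayes risk.
all_rules = [dict(zip(observations, choice))
             for choice in product(actions, repeat=len(observations))]
bayes_rule = min(all_rules, key=lambda r: bayes_risk(r, prior))

print(bayes_rule, bayes_risk(bayes_rule, prior))
```

Because `min` returns only the first minimizer it meets, this sketch reports a single Bayes rule even when several rules tie for the smallest Bayes risk.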
Admissible rules and Bayes rules
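In the Bayesian approach to decision theory, the observed $x$ is considered fixed. Instead of averaging over $x$ as in the frequentist approach, the Bayesian would average over $\theta$. Thus, we would be interested in computing for our observed $x$ the expected loss
$$\rho(\pi, \delta \mid x) = \operatorname{E}_{\pi(\theta \mid x)}[L(\theta, \delta(x))],$$
where the expectation is taken over the posterior distribution of $\theta$ given $x$.
Since $x$ is considered fixed and known, we can choose $\delta(x)$ to minimize the expected loss for any $x$; by varying $x$ over its range, we can define a function $\delta^\pi(x)$, which is known as a generalized Bayes rule.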
A generalized Bayes rule will be the same as some Bayes rule (relative to $\pi$), provided that the Bayes risk is finite. Since more than one decision rule may minimize the expected loss, there may not be a unique generalized Bayes rule.
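The construction of a generalized Bayes rule, one observation at a time, can be sketched as follows. This is an illustrative Python example under assumed values (a two-state problem, 0-1 loss, uniform prior), not a general implementation. For each observation it forms the posterior over states and picks the action with the smallest posterior expected loss.

```python
# Hypothetical toy problem; every name and number here is an assumption.
states = ["theta0", "theta1"]
observations = [0, 1]
actions = ["a0", "a1"]
likelihood = {
    "theta0": {0: 0.8, 1: 0.2},
    "theta1": {0: 0.3, 1: 0.7},
}
prior = {"theta0": 0.5, "theta1": 0.5}
correct_action = {"theta0": "a0", "theta1": "a1"}

def loss(theta, action):
    """0-1 loss: zero for the matching action, one otherwise."""
    return 0.0 if correct_action[theta] == action else 1.0

def posterior(x, prior):
    """Posterior over states given x, from Bayes' theorem."""
    joint = {t: prior[t] * likelihood[t][x] for t in states}
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items()}

def posterior_expected_loss(x, action, prior):
    """Average the loss over states, weighted by the posterior given x."""
    post = posterior(x, prior)
    return sum(post[t] * loss(t, action) for t in states)

def generalized_bayes_rule(prior):
    """For each observation, choose an action minimizing posterior expected loss."""
    return {
        x: min(actions, key=lambda a: posterior_expected_loss(x, a, prior))
        for x in observations
    }

rule = generalized_bayes_rule(prior)
print(rule)
```

With this proper prior and finite Bayes risk, the rule built observation by observation agrees with the rule that minimizes the Bayes risk over whole decision rules. The normalizing constant in `posterior` does not affect which action attains the minimum, which is why the same construction can still be carried out with unnormalized (improper) prior weights whenever the posterior expected loss is finite.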
According to the complete class theorems, under mild conditions every admissible rule is a (generalized) Bayes rule (with respect to some, possibly improper, prior). Thus, in frequentist decision theory it is sufficient to consider only (generalized) Bayes rules.
Bayes rules with respect to proper priors are admissible under mild conditions (for example, a unique Bayes rule is always admissible), but generalized Bayes rules corresponding to improper priors need not yield admissible procedures. Stein's example is a famous case of this kind.
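Stein's example can be illustrated numerically. When estimating the mean vector of a multivariate normal observation with identity covariance under squared error loss, the observation itself is the generalized Bayes estimate under a flat (improper) prior, yet in three or more dimensions it is dominated. The sketch below is a Monte Carlo estimate, not a proof. The James-Stein estimator used for the comparison is not named in the text above; it is the standard shrinkage estimator for this setting. The dimension, true mean, trial count and seed are assumptions chosen for illustration.

```python
import random

def james_stein(x):
    """James-Stein shrinkage toward the origin for X ~ N(theta, I_p), p >= 3."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) / norm_sq
    return [shrink * v for v in x]

def squared_error(estimate, theta):
    """Squared error loss between an estimate and the true mean vector."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))

def estimate_risks(theta, trials=20000, seed=0):
    """Monte Carlo estimates of the risk of X itself and of James-Stein at theta."""
    rng = random.Random(seed)
    raw_total = 0.0
    js_total = 0.0
    for _ in range(trials):
        # Draw one observation with independent unit-variance components.
        x = [rng.gauss(t, 1.0) for t in theta]
        raw_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x), theta)
    return raw_total / trials, js_total / trials

# Assumed setting: five dimensions, true mean at the origin.
raw_risk, js_risk = estimate_risks([0.0] * 5)
print(raw_risk, js_risk)
```

The risk of the raw observation equals the dimension for every true mean, so the first estimate should be close to 5, while the shrinkage estimator's estimated risk should come out clearly lower at this value of the mean. Repeating the call with other mean vectors gives a rough picture of the two risk functions, although a simulation at finitely many points cannot establish dominance for all states of nature.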
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.