Information about Bellman Equation
A Bellman equation (also known as a dynamic programming equation), named after its discoverer, Richard Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. Almost any problem which can be solved using optimal control theory can also be solved by analyzing the appropriate Bellman equation. The Bellman equation was first applied to engineering control theory and to other topics in applied mathematics, and subsequently became an important tool in economic theory.
Principle of optimality
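Given an appropriate initial condition x_0, the canonical infinite-horizon dynamic programming problem, in the notation of Stokey & Lucas, is:
\[ \max_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t} F(x_t, x_{t+1}) \]
subject to the constraints
\[ x_{t+1} \in \Gamma(x_t), \quad t = 0, 1, 2, \dots \]
In this problem, x is a vector of state and control variables, indexed by discrete time t; F is the one-period return function; Γ(x) is the set of states that can be reached from x; and β is the discount factor, with 0 < β < 1 in the standard case so that the discounted sum converges when returns are bounded.
The recursive restatement of this problem as a Bellman equation is:
\[ V(x) = \max_{y \in \Gamma(x)} \left[ F(x, y) + \beta V(y) \right] \quad \text{for all } x. \]
The function V(x) that solves the Bellman equation is called the value function. The value function describes the optimized value of the problem as a function of the state variable x.
The function y(x) that describes the optimal choice as a function of the state is called the policy function.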
Stokey & Lucas (1989: 67-77) called the equivalence between these two forms of the problem the principle of optimality (a term taken from Bellman's 1952 paper[1]). The principle asserts that if the policy function is optimal for the infinite summation, then it must be the case that whatever the initial state and decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from that first decision (as expressed by the Bellman equation). The principle of optimality is related to the concept of optimal substructure, and problems that exhibit optimal substructure can often be solved with dynamic programming.
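As an illustration of how a value function and a policy function can be computed, the following is a minimal sketch, assuming the Bellman equation has the form V(x) = max over feasible y of [F(x, y) + βV(y)]. The "cake-eating" setup, the square-root return function, the integer grid and the name value_iteration are all hypothetical choices made for this example and do not come from the sources cited above. The sketch repeatedly applies the right-hand side of the Bellman equation until V stops changing.

```python
import math

def value_iteration(n_states=21, beta=0.9, tol=1e-10, max_iter=10_000):
    """Iterate on the Bellman equation for a toy cake-eating problem.

    State x  = units of cake left (0 .. n_states - 1).
    Choice y = units carried into the next period, with 0 <= y <= x.
    Return   = sqrt(x - y), a hypothetical utility of what is eaten today.
    """
    V = [0.0] * n_states          # initial guess for the value function
    policy = [0] * n_states       # policy[x] = optimal y given state x
    for _ in range(max_iter):
        V_new = [0.0] * n_states
        for x in range(n_states):
            best_val, best_y = -math.inf, 0
            for y in range(x + 1):            # feasible set: 0 <= y <= x
                val = math.sqrt(x - y) + beta * V[y]
                if val > best_val:
                    best_val, best_y = val, y
            V_new[x] = best_val
            policy[x] = best_y
        diff = max(abs(a - b) for a, b in zip(V, V_new))
        V = V_new
        if diff < tol:            # V is (approximately) a fixed point
            break
    return V, policy

V, policy = value_iteration()
print(V[2], policy[2])
```

In this toy problem, with two units of cake it is best to eat one unit now and one later, so policy[2] is 1 and V[2] is 1 + 0.9 × 1 = 1.9. Because β < 1, each pass shrinks the distance to the fixed point, which is why the loop terminates.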
Example
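In reinforcement learning, a Bellman equation refers to a recursion for expected rewards. For example, the expected reward for being in a particular state s and following some fixed policy π has the Bellman equation:
\[ V^{\pi}(s) = R(s) + \gamma \sum_{s'} P(s' \mid s, \pi(s)) \, V^{\pi}(s') \]
Here R(s) is the immediate reward in state s, P(s' | s, a) is the probability of moving to state s' after taking action a in state s, and γ is the discount factor (the counterpart of β in the economic formulation). This equation describes the expected reward for taking the action prescribed by the policy π.
The equation for the optimal policy is referred to as the Bellman optimality equation:
\[ V^{*}(s) = R(s) + \max_{a} \gamma \sum_{s'} P(s' \mid s, a) \, V^{*}(s') \]
It describes the reward for taking the action giving the highest expected return.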
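The two recursions can be sketched in code for a small finite problem. This is a minimal sketch under stated assumptions: the reward depends only on the state (R[s]), transition probabilities are stored as P[a][s][t] for action a, current state s and next state t, and gamma is the discount factor. The two-state example and the function names evaluate_policy and optimal_values are hypothetical.

```python
def evaluate_policy(P, R, policy, gamma, sweeps=1000):
    """Approximate V^pi by repeatedly applying the Bellman equation."""
    n = len(R)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [R[s] + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
             for s in range(n)]
    return V

def optimal_values(P, R, gamma, sweeps=1000):
    """Approximate V^* by applying the Bellman optimality equation."""
    n, n_actions = len(R), len(P)

    def expected_next(a, s, V):
        # Expected value of next period's state after action a in state s.
        return sum(P[a][s][t] * V[t] for t in range(n))

    V = [0.0] * n
    for _ in range(sweeps):
        V = [R[s] + gamma * max(expected_next(a, s, V) for a in range(n_actions))
             for s in range(n)]
    # Greedy policy: in each state pick the action with the highest expected return.
    policy = [max(range(n_actions), key=lambda a: expected_next(a, s, V))
              for s in range(n)]
    return V, policy

# Hypothetical example: state 1 pays a reward of 1, state 0 pays nothing.
# Action 0 stays put; action 1 switches state with probability 0.8.
R = [0.0, 1.0]
P = [
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
    [[0.2, 0.8], [0.8, 0.2]],   # action 1: try to switch
]
V_stay = evaluate_policy(P, R, policy=[0, 0], gamma=0.5)
V_star, best = optimal_values(P, R, gamma=0.5)
print(V_stay, V_star, best)
```

In this example the optimal policy tries to leave state 0 and stays in state 1, so V^*(1) = 1 / (1 − 0.5) = 2 and V^*(0) = 8/9, while the "always stay" policy is worth 0 in state 0.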
Solutions
- The method of undetermined coefficients, also known as 'guess and verify', can be used to solve some infinite-horizon, autonomous Bellman equations.
- The Bellman equation can be solved by backwards induction, either analytically in a few special cases, or numerically on a computer. Numerical backwards induction is applicable to a wide variety of problems, but may be infeasible when there are many state variables, due to the curse of dimensionality.
- By calculating the first-order conditions associated with the Bellman equation, and then using the envelope theorem to eliminate the derivatives of the value function, it is possible to obtain a system of difference equations or differential equations called the 'Euler equations'. Standard techniques for the solution of difference or differential equations can then be used to calculate the dynamics of the state variables and the control variables of the optimization problem.
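Numerical backwards induction, the second method in the list above, can be sketched as follows for a finite-horizon problem. This is an illustrative sketch only: the toy cake-eating setup (integer state x, choice y with 0 ≤ y ≤ x, period return sqrt(x − y)), the terminal condition V_T = 0 and the function name backward_induction are assumptions made for the example.

```python
import math

def backward_induction(horizon, n_states, beta):
    """Solve V_t(x) = max_{0<=y<=x} [sqrt(x - y) + beta * V_{t+1}(y)], V_T = 0.

    Returns (values, policies), where values[t][x] is the value of state x
    at date t and policies[t][x] is the optimal y at date t.
    """
    V_next = [0.0] * n_states            # terminal condition V_T(x) = 0
    values, policies = [], []
    for _ in range(horizon):             # dates T-1, T-2, ..., 0
        V_t, pol_t = [], []
        for x in range(n_states):
            best_val, best_y = -math.inf, 0
            for y in range(x + 1):
                val = math.sqrt(x - y) + beta * V_next[y]
                if val > best_val:
                    best_val, best_y = val, y
            V_t.append(best_val)
            pol_t.append(best_y)
        values.append(V_t)
        policies.append(pol_t)
        V_next = V_t                     # step one period further back
    values.reverse()                     # reorder so index 0 is date 0
    policies.reverse()
    return values, policies

values, policies = backward_induction(horizon=2, n_states=3, beta=0.9)
print(values[0], policies[0])
```

The cost of each step grows with the number of grid points in the state space, which is where the curse of dimensionality bites: a grid with k points per state variable and d state variables has k^d points.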
Applications in Economics
One of the best-known early economic applications of a Bellman equation is Merton's seminal 1973 article on the intertemporal capital asset pricing model.[2] The solution to Merton's theoretical model, in which investors choose between income today and future income or capital gains, is a form of Bellman's equation. Because economic applications of dynamic programming usually result in a Bellman equation that is a difference equation, economists refer to dynamic programming as a "recursive method."
Stokey, Lucas & Prescott describe stochastic and nonstochastic dynamic programming in considerable detail, giving many examples of how to employ dynamic programming to solve problems in economic theory.[3] This book led to dynamic programming being employed to solve a wide range of theoretical problems in economics, including optimal economic growth, resource extraction, principal-agent problems, public finance, business investment, asset pricing, factor supply, and industrial organization. Ljungqvist & Sargent apply dynamic programming to study a variety of theoretical questions in monetary policy, fiscal policy, taxation, economic growth, search theory, and labor economics.[4] Dixit & Pindyck showed the value of the method for thinking about capital budgeting.[5] Patrick L. Anderson used dynamic programming to develop methods to value closely held firms.[6]
Using dynamic programming to solve concrete problems is complicated by informational difficulties, such as choosing the unobservable discount rate. There are also computational issues, the main one being the curse of dimensionality arising from the vast number of possible actions and potential state variables that must be considered before an optimal strategy can be selected. For an extensive discussion of computational issues, see Miranda & Fackler.[7]
References
1. Richard Bellman, 1952. "On the Theory of Dynamic Programming," Proceedings of the National Academy of Sciences.
2. Robert C. Merton, 1973. "An Intertemporal Capital Asset Pricing Model," Econometrica 41: 867-887.
3. Nancy Stokey and Robert E. Lucas, with Edward Prescott, 1989. Recursive Methods in Economic Dynamics. Harvard Univ. Press.
4. Lars Ljungqvist & Thomas Sargent, 2004. Recursive Macroeconomic Theory. MIT Press.
5. Avinash Dixit & Robert Pindyck, 1994. Investment Under Uncertainty. Princeton Univ. Press.
6. Patrick L. Anderson, 2004. Business Economics and Finance. CRC Press.
7. Mario Miranda & Paul Fackler, 2002. Applied Computational Economics and Finance. MIT Press.
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.




