The problem of controlling a stochastic process with unknown parameters over an infinite horizon with discounting is considered. The possibility of sacrificing current period expected reward for information leading to possible increases in future reward is examined. Agents express beliefs about unknown parameters in terms of distributions. Under general conditions the sequence of beliefs converges to a limit distribution. The limit distribution may or may not be concentrated at the true parameter value. In some cases complete learning is optimal; in others the optimal strategy does not imply complete learning. The paper concludes with examination of some special cases including high and low discount rates, discrete parameter and action spaces (the n-armed bandit with correlated arms), and a class of examples in which incomplete learning is optimal.
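A minimal sketch of how beliefs can stop short of the true parameter value, under assumptions that are not from the paper: a two-armed bandit with one safe arm of known payoff and one risky Bernoulli arm with a Beta prior, played by a purely myopic agent rather than the optimal discounted policy the paper studies. The function name `myopic_two_armed` and its parameters are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def myopic_two_armed(outcomes, safe_reward=Fraction(1, 2), a=1, b=1):
    """Play greedily against a Beta(a, b) belief about the risky arm.

    `outcomes` is a fixed list of 0/1 draws from the risky arm. A draw is
    consumed only in periods where the risky arm is actually pulled.
    Returns the final Beta parameters and the list of arms chosen.
    """
    draws = iter(outcomes)
    history = []
    for _ in range(len(outcomes)):
        posterior_mean = Fraction(a, a + b)
        if posterior_mean >= safe_reward:
            # Pull the risky arm and update the belief with the observed draw.
            x = next(draws)
            a += x
            b += 1 - x
            history.append("risky")
        else:
            # Pull the safe arm: nothing is observed, so the belief is frozen.
            history.append("safe")
    return a, b, history


# One early failure pushes the posterior mean below the safe payoff, so the
# agent never samples the risky arm again, whatever its true success rate.
a, b, history = myopic_two_armed([0, 1, 1, 1, 1])
print(a, b, history)
```

In this run the belief stays at Beta(1, 2) after the first period because the safe arm generates no information about the risky one. A forward-looking agent with discounting might keep experimenting for longer; the sketch only illustrates the mechanism by which a limit belief can fail to concentrate at the truth.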
MLA
Easley, David, and Nicholas M. Kiefer. “Controlling a Stochastic Process with Unknown Parameters.” Econometrica, vol. 56, no. 5, Econometric Society, 1988, pp. 1045-1064, https://www.jstor.org/stable/1911358
Chicago
Easley, David, and Nicholas M. Kiefer. “Controlling a Stochastic Process with Unknown Parameters.” Econometrica, 56, no. 5, (Econometric Society: 1988), 1045-1064. https://www.jstor.org/stable/1911358
APA
Easley, D., & Kiefer, N. M. (1988). Controlling a Stochastic Process with Unknown Parameters. Econometrica, 56(5), 1045-1064. https://www.jstor.org/stable/1911358