000 03747nam a22002417a 4500
005 20251126172514.0
008 180327b xxu||||| |||| 00| 0 eng d
020 _a978-1886529441
041 _aeng
082 _a519.703
_bBerD4 Vol.2
100 _aBertsekas, Dimitri P.
245 _aDynamic Programming And Optimal Control :
_bApproximate Dynamic Programming [4th ed.] [Vol.2] /
_cDimitri P. Bertsekas
250 _a4th ed.
260 _aUSA:
_bAthena Scientific;
_c©2012
300 _a694p.
505 _aContents: 1. Discounted Problems - Theory. 2. Discounted Problems - Computational Methods. 3. Stochastic Shortest Path Problems. 4. Undiscounted Problems. 5. Average Cost per Stage Problems. 6. Approximate Dynamic Programming - Discounted Models. 7. Approximate Dynamic Programming - Nondiscounted Models and Generalizations.
520 _aThis 4th edition is a major revision of Vol. II of the leading two-volume dynamic programming textbook by Bertsekas, and contains a substantial amount of new material, as well as a reorganization of old material. The length has increased by more than 60% from the third edition, and most of the old material has been restructured and/or revised. Volume II now numbers more than 700 pages and is larger in size than Vol. I. It can arguably be viewed as a new book! Approximate DP has become the central focal point of Vol. II, and occupies more than half of the book (the last two chapters, and large parts of Chapters 1-3). Thus one may also view Vol. II as a follow-up to the author's 1996 book "Neuro-Dynamic Programming" (coauthored with John Tsitsiklis). The present book focuses to a great extent on new research that became available after 1996. On the other hand, the textbook style of the book has been preserved, and some material has been explained at an intuitive or informal level, while referring to the journal literature or the Neuro-Dynamic Programming book for a more mathematical treatment. As the book's focus shifted, increased emphasis was placed on new or recent research in approximate DP and simulation-based methods, as well as on asynchronous iterative methods, in view of the central role of simulation, which is by nature asynchronous. A lot of this material is an outgrowth of research conducted in the six years since the previous edition. Some of the highlights, in the order appearing in the book, are: (a) A broad spectrum of simulation-based, approximate value iteration, policy iteration, and Q-learning methods based on projected equations and aggregation. (b) New policy iteration and Q-learning algorithms for stochastic shortest path problems with improper policies. (c) Reliable Q-learning algorithms for optimistic policy iteration.
(d) New simulation techniques for multistep methods, such as geometric and free-form sampling, based on generalized weighted Bellman equations. (e) Computational methods for generalized/abstract discounted DP, including convergence analysis and error bounds for approximations. (f) Monte Carlo linear algebra methods, which extend the approximate DP methodology to broadly applicable problems involving large-scale regression and systems of linear equations. The book includes a substantial number of examples and exercises; detailed solutions to many of the exercises are posted on the internet. It was developed through teaching graduate courses at M.I.T., and is supported by a large amount of educational material, such as slides and videos, posted at MIT OpenCourseWare and on the author's and the publisher's web sites.
650 _aDynamic Programming
650 _aApproximate Dynamic Programming
650 _aControl Theory
650 _aMathematical Optimization Techniques
942 _cBK
999 _c2546
_d2546