Improving the Scalability of Stochastic Planning Algorithms
Sponsored by:
National Science Foundation,
Division of Information & Intelligent Systems
Shlomo Zilberstein, PI
Project Description
This project addresses computational barriers that have limited the usefulness of partially observable Markov decision process (POMDP) algorithms, which play an important role in planning under uncertainty. The Markov decision process (MDP) has emerged as a powerful and elegant framework for solving problems in a wide range of application domains, such as mobile robot control, machine maintenance, gesture recognition, medical diagnosis and treatment, and policy making. When the decision maker can fully observe the intermediate state of the domain, many effective algorithms can solve large problems. In many applications, however, it is unrealistic to assume that perfect state information is available. The more general POMDP model addresses this difficulty, but its computational complexity makes it hard to apply existing solution techniques to practical applications.
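To make the source of this complexity concrete: a POMDP agent cannot observe the state directly, so it must plan over belief states, i.e., probability distributions over states updated by Bayes' rule after each action and observation. The sketch below illustrates a standard belief update on a hypothetical two-state, one-action toy model; the numbers and names are made up for illustration and are not from this project.

```python
def belief_update(belief, T, O, action, obs):
    """Bayes-filter update of a POMDP belief state:
    b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    n = len(belief)
    unnormalized = [
        O[action][s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    norm = sum(unnormalized)  # probability of receiving `obs`; assumed > 0
    return [p / norm for p in unnormalized]

# Hypothetical toy model: two hidden states, one action, two observations.
T = {0: [[0.9, 0.1], [0.2, 0.8]]}  # T[a][s][s'] = P(s' | s, a)
O = {0: [[0.7, 0.3], [0.4, 0.6]]}  # O[a][s'][o] = P(o | s', a)

b = belief_update([0.5, 0.5], T, O, action=0, obs=0)
```

Because the belief space is a continuous simplex rather than a finite state set, each planning step must reason over (pieces of) this infinite space, which is the bottleneck the project targets.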
The project covers several new ways to overcome the key computational bottlenecks in POMDP algorithms. The main research thrusts are to (a) identify and examine new types of belief-space structure that can significantly accelerate each of the key components of POMDP solution techniques; (b) evaluate the impact of these improvements on a wide range of exact and approximate algorithms, with the goal of demonstrating exponential acceleration; (c) integrate the new approach with previously identified methods for accelerating MDP and POMDP algorithms using search, symbolic representations, and parallelization; and (d) develop a new set of challenging test problems and benchmarks that are significantly harder than the existing toy problems, and perform a rigorous evaluation and comparison of the developed techniques.
Related Publications
- Learning to Communicate in a Decentralized Environment.
C.V. Goldman, M. Allen, and S. Zilberstein. Autonomous Agents and Multi-Agent Systems, 15(1):47-90, 2007.
- Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs.
S. Seuken and S. Zilberstein. Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI), Vancouver, British Columbia, 2007.
- Bounded Dynamic Programming for Decentralized POMDPs.
C. Amato, A. Carlin, and S. Zilberstein. AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains, Honolulu, Hawaii, May, 2007.
- Memory-Bounded Dynamic Programming for DEC-POMDPs.
S. Seuken and S. Zilberstein. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 2009-2015, Hyderabad, India, 2007.
- Finding Optimal POMDP Controllers Using Quadratically Constrained Linear Programs.
C. Amato, D.S. Bernstein, and S. Zilberstein. Proceedings of the Ninth International Symposium on Artificial Intelligence and Mathematics (ISAIM), Ft. Lauderdale, Florida, 2006.
- Optimal Fixed-Size Controllers for Decentralized POMDPs.
C. Amato, D.S. Bernstein, and S. Zilberstein. AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains, Hakodate, Japan, May, 2006.
- Efficient Maximization in Solving POMDPs.
Z. Feng and S. Zilberstein. Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI), 975-980, Pittsburgh, Pennsylvania, 2005.
- Region-Based Incremental Pruning for POMDPs.
Z. Feng and S. Zilberstein. Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI), 146-153, Banff, Canada, 2004.
- Symbolic Generalization for On-line Planning.
Z. Feng, E.A. Hansen, and S. Zilberstein. Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI), 209-216, Acapulco, Mexico, 2003.