Improving the Scalability of Stochastic Planning Algorithms
Shlomo Zilberstein, PI
This project addresses computational barriers that have limited the usefulness of partially observable Markov decision process (POMDP) algorithms. These algorithms play an important role in planning under uncertainty. The Markov decision process (MDP) has emerged as a powerful and elegant framework for solving problems in a wide range of application domains such as mobile robot control, machine maintenance, gesture recognition, medical diagnosis and treatment, and policy making. When the decision maker can fully observe the intermediate state of the domain, many effective algorithms can solve large problems. In many applications, however, it is unrealistic to assume that perfect state information is available. The more general POMDP framework addresses this difficulty, but its computational complexity makes it hard to apply existing solution techniques to practical applications.
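The core object that POMDP algorithms manipulate is the belief state: a probability distribution over the hidden states, updated after each action and observation via Bayes' rule. As a minimal sketch (not part of this project's contributions), the update can be illustrated on the classic two-state "tiger" toy problem, where a "listen" action leaves the state unchanged and yields an observation that is correct with probability 0.85; the transition matrix `T`, observation matrix `O`, and accuracy value here are illustrative assumptions.

```python
import numpy as np

def belief_update(b, T, O, obs):
    """Bayesian belief update: b'(s') ∝ O[s', obs] * sum_s T[s, s'] * b[s]."""
    b_next = O[:, obs] * (T.T @ b)   # predict under T, then weight by observation likelihood
    return b_next / b_next.sum()     # renormalize to a probability distribution

# Toy "tiger" problem: states = (tiger-left, tiger-right)
T = np.eye(2)                        # "listen" does not move the tiger
O = np.array([[0.85, 0.15],          # P(hear-left | tiger-left), P(hear-right | tiger-left)
              [0.15, 0.85]])         # P(hear-left | tiger-right), P(hear-right | tiger-right)
b = np.array([0.5, 0.5])             # uniform initial belief

b = belief_update(b, T, O, obs=0)    # after hearing the tiger on the left: [0.85, 0.15]
```

Because planning happens over this continuous belief simplex rather than the discrete state space, even small POMDPs induce hard optimization problems, which is the bottleneck the project targets.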
The project covers several new ways to overcome the key computational bottlenecks in POMDP algorithms. The main research thrusts are to (a) identify and examine new types of belief-space structure that can be used to significantly accelerate each of the key components of POMDP solution techniques; (b) evaluate the impact of these improvements on a wide range of exact and approximate algorithms, with the goal of demonstrating exponential acceleration; (c) integrate the new approach with previously identified methods for accelerating MDP and POMDP algorithms using search, symbolic representations, and parallelization; and (d) develop a new set of challenging test problems and benchmarks that are significantly harder than the existing toy problems, and perform a rigorous evaluation and comparison of the developed techniques.
- Learning to Communicate in a Decentralized Environment.
- Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs.
- Bounded Dynamic Programming for Decentralized POMDPs.
- Memory-Bounded Dynamic Programming for DEC-POMDPs.
- Finding Optimal POMDP Controllers Using Quadratically Constrained Linear Programs.
- Optimal Fixed-Size Controllers for Decentralized POMDPs.
- Efficient Maximization in Solving POMDPs.
- Region-Based Incremental Pruning for POMDPs.
- Symbolic Generalization for On-line Planning.