Solving POMDPs Using Quadratically Constrained Linear Programs

Christopher Amato, Daniel S. Bernstein, and Shlomo Zilberstein. Solving POMDPs Using Quadratically Constrained Linear Programs. Proceedings of the Ninth International Symposium on Artificial Intelligence and Mathematics (ISAIM), Ft. Lauderdale, Florida, 2006.

Abstract

Developing scalable algorithms for solving partially observable Markov decision processes (POMDPs) is an important challenge. One promising approach is based on representing POMDP policies as finite-state controllers. This method has been used successfully to address the intractable memory requirements of POMDP algorithms. We illustrate some fundamental theoretical limitations of existing techniques that use controllers. We then propose a new approach that formulates the problem as a quadratically constrained linear program (QCLP), the solution of which provides an optimal controller of a desired size. We evaluate several optimization methods for solving QCLPs and compare their performance with existing POMDP optimization methods. While the optimization algorithms used in this paper only guarantee locally optimal solutions, the results show consistent improvement of solution quality over the state-of-the-art techniques. The results show that powerful nonlinear programming algorithms can be used effectively to improve the performance and scalability of POMDP algorithms.

Bibtex entry:

@inproceedings{ABZisaim06,
  author	= {Christopher Amato and Daniel S. Bernstein and Shlomo Zilberstein},
  title		= {Solving {POMDP}s Using Quadratically Constrained Linear Programs},
  booktitle     = {Proceedings of the Ninth International Symposium on Artificial
                   Intelligence and Mathematics},
  year		= {2006},
  pages		= {},
  address       = {Ft. Lauderdale, Florida},
  url		= {http://rbr.cs.umass.edu/shlomo/papers/ABZisaim06.html}
}

shlomo@cs.umass.edu
UMass Amherst