Point-Based Policy Generation for Decentralized POMDPs

Feng Wu, Shlomo Zilberstein, and Xiaoping Chen. Point-Based Policy Generation for Decentralized POMDPs. Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1307-1314, Toronto, Canada, 2010.

Abstract

Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gain can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle only a relatively small pool of candidate policies, which limits solution quality on some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids searching the entire joint policy space altogether. The key observation is that the best joint policy for each reachable belief state can be constructed directly, instead of first producing a large set of candidates. We also provide an efficient approximate implementation of this operation. The experimental results show that our solution technique significantly improves performance in terms of both runtime and solution quality.
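
To make the key step concrete, below is a minimal Python sketch of a point-based backup for a two-agent problem. The toy model and the names (backup_at_belief, subpolicy_values, retained) are illustrative assumptions, not the paper's implementation; in particular, this simplified version selects the retained subpolicy per joint observation, whereas the actual algorithm enforces the decentralization constraint that each agent's choice may depend only on its own observation, using an approximate per-agent construction.

import itertools

# Toy two-state, two-agent DEC-POMDP (purely illustrative).
STATES = [0, 1]
ACTIONS = ["a", "b"]                       # per-agent actions
OBSERVATIONS = ["x", "y"]                  # per-agent observations
JOINT_ACTIONS = list(itertools.product(ACTIONS, repeat=2))
JOINT_OBS = list(itertools.product(OBSERVATIONS, repeat=2))

def trans(s, ja):
    # P(s' | s, joint action): the state persists with prob 0.9 if the
    # agents choose the same action, otherwise 0.5.
    stay = 0.9 if ja[0] == ja[1] else 0.5
    return {s: stay, 1 - s: 1.0 - stay}

def obs_prob(ja, s2, jo):
    # P(joint observation | joint action, next state): each agent observes
    # the state correctly with probability 0.8, independently.
    p = 1.0
    for o in jo:
        correct = (o == "x") == (s2 == 0)
        p *= 0.8 if correct else 0.2
    return p

def reward(s, ja):
    # Joint reward: +1 for matching actions in state 0, -1 otherwise.
    return 1.0 if (s == 0 and ja[0] == ja[1]) else -1.0

def belief_update(b, ja, jo):
    # Bayes update of the joint belief; also returns P(jo | b, ja).
    new_b = {s2: obs_prob(ja, s2, jo) * sum(b[s] * trans(s, ja)[s2] for s in STATES)
             for s2 in STATES}
    norm = sum(new_b.values())
    if norm == 0.0:
        return None, 0.0
    return {s2: p / norm for s2, p in new_b.items()}, norm

def backup_at_belief(b, subpolicy_values):
    # Construct the best depth-(t+1) joint policy for belief b directly.
    # subpolicy_values maps each retained joint subpolicy to its value
    # vector {state: value}.  Instead of enumerating all combinations of
    # actions and observation-to-subpolicy mappings (the exhaustive backup),
    # we pick, per joint action and joint observation, the best retained
    # subpolicy at the updated belief.
    best = None
    for ja in JOINT_ACTIONS:
        value = sum(b[s] * reward(s, ja) for s in STATES)
        obs_map = {}
        for jo in JOINT_OBS:
            b2, p_o = belief_update(b, ja, jo)
            if p_o == 0.0:
                continue
            q, q_val = max(subpolicy_values.items(),
                           key=lambda item: sum(b2[s] * item[1][s] for s in STATES))
            obs_map[jo] = q
            value += p_o * sum(b2[s] * q_val[s] for s in STATES)
        if best is None or value > best[0]:
            best = (value, ja, obs_map)
    return best  # (value, joint action, joint observation -> subpolicy)

if __name__ == "__main__":
    # Two retained joint subpolicies with made-up value vectors.
    retained = {"q0": {0: 1.0, 1: 0.0}, "q1": {0: 0.0, 1: 1.0}}
    value, ja, obs_map = backup_at_belief({0: 0.5, 1: 0.5}, retained)
    print("best joint action:", ja, "value:", round(value, 3))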

Bibtex entry:

@inproceedings{WZCaamas10,
  author    = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},
  title     = {Point-Based Policy Generation for Decentralized {POMDP}s},
  booktitle = {Proceedings of the Ninth International Conference on Autonomous
               Agents and Multiagent Systems},
  year      = {2010},
  pages     = {1307--1314},
  address   = {Toronto, Canada},
  url       = {http://rbr.cs.umass.edu/shlomo/papers/WZCaamas10.html}
}

shlomo@cs.umass.edu
UMass Amherst