DecisionTheoretic Foundations for MultiAgent Planning
Sponsored by:
National Science Foundation,
Division of Information & Intelligent Systems
Air Force Office of Scientific Research,
Systems and Software Program
Shlomo Zilberstein, PI
Research Assistants: Christopher Amato, Alan Carlin, Akshat Kumar, Marek Petrik
Collaborators: Martin Allen, Xiaoping Chen, Claudia Goldman, Feng Wu
Project Description
A fundamental challenge in artificial intelligence is how to achieve intelligent coordination of a group of decision makers in spite of stochasticity and limited information. Decision theory offers a normative framework for optimizing decisions under uncertainty. But due to computation complexity barriers, developing decisiontheoretic reasoning algorithms for multiagent systems is a serious challenge. In this project we develops a comprehensive approach to accomplish this using the framework of decentralized Markov decision processes. Target applications of this work involve a group of decision makers or agents that operate under uncertainty and rely on noisy sensors. Each agent may obtain different partial information about the global state. Such decentralized planning and coordination problems abound in practice. Examples include autonomous mobile robots, distributed management of servers or a power grid, as well as the operation of complex human organizations. In such problems, centralized planning is inadequate, because agents cannot share all their information all the time.
Despite rapid progress in recent years in decentralized decision making, existing algorithms can only solve small "toy" problems. Compared with singleagent MDPs, research of decentralized MDPs is in its infancy. Classical MDPs and POMDPs solution techniques are often inapplicable because of some fundamental differences between centralized and decentralized operation. The goal of this project is to develop wellfounded coordination strategies that are widely applicable and scalable to large domains. This requires the development of approximation algorithms that alleviate the large memory requirements of current solution techniques and scale up to larger problems. It is also important to derive error bounds for these algorithms that would make it possible to decide when a satisfactory plan is available. To evaluate these algorithms, we are also developing a new set of challenging test problems and benchmarks.
The resulting solution techniques are based on several principles: (a) combining dynamic programming with heuristic search to produce scalable solution techniques, (b) developing observation compression methods that identify the most relevant subset of observations to be considered while bounding the error, (c) developing algorithms that represent policies compactly as finitestate controllers, (d) exploiting randomization and correlation mechanisms to improve the value of compact policies, (e) developing quantitative approaches to represent and reason about interaction and to manage the tradeoff between the degree of interaction and problem hardness, and (f) exploiting the locality and structure of interactions between agents to improve the efficiency of coordination algorithms. The evaluation of these techniques has already shown that they can improve the scalability of existing algorithms by several orders of magnitude.
Useful Links

IJCAI 2009 Tutorial on DecisionTheoretic Planning for MultiAgent Systems
 AAMAS 2013 Workshop on Multiagent Sequential Decision Making Under Uncertainty
 AAMAS 2012 Workshop on Multiagent Sequential Decision Making Under Uncertainty
 AAMAS 2011 Workshop on MultiAgent Sequential Decision Making in Uncertain Domains
 AAMAS 2010 Workshop on MultiAgent Sequential Decision Making in Uncertain Domains

AAMAS 2009 Workshop on MultiAgent Sequential Decision Making

AAMAS 2008 Workshop on MultiAgent Sequential Decision Making

AAMAS 2007 Workshop on MultiAgent Sequential Decision Making
Related Publications
 Qualitative Planning under Partial Observability in MultiAgent Domains.
R.I. Brafman, G. Shani, and S. Zilberstein. Proceedings of the TwentySeventh Conference on Artificial Intelligence (AAAI), Bellevue, Washington, 2013. [abs] [bib] [pdf]
 MonteCarlo Expectation Maximization for Decentralized POMDPs.
F. Wu, S. Zilberstein, and N.R. Jennings. Proceedings of the TwentyThird International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013. [abs] [bib] [pdf]
 Parameter Learning for Latent Network Diffusion.
X. Wu, A. Kumar, D. Sheldon, and S. Zilberstein. Proceedings of the TwentyThird International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013. [abs] [bib] [pdf]
 Automated Generation of Interaction Graphs for ValueFactored DecPOMDPs.
W. Yeoh, A. Kumar, and S. Zilberstein. Proceedings of the TwentyThird International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013. [abs] [bib] [pdf]
 Multiagent Planning, Control, and Execution.
E. Durfee and S. Zilberstein. In G. Weiss (Ed.), Multiagent Systems, Second Edition, 485546, MIT Press, 2013. [abs] [bib] [pdf]
 MessagePassing Algorithms for MAP Estimation Using DC Programming.
A. Kumar, S. Zilberstein, and M. Toussaint. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 656664, 2012. [abs] [bib] [pdf]
 Online Planning for MultiAgent Systems with Bounded
Communication.
F. Wu, S. Zilberstein, and X. Chen. Artificial Intelligence, 175(2):487511, 2011. [abs] [bib] [pdf]
 Decentralized Monitoring of Anytime Decision
Making.
A. Carlin and S. Zilberstein. Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 157164, Taipei, Taiwan, 2011. [abs] [bib] [pdf]
 Message Passing Algorithms for Large Structured
Decentralized POMDPs.
A. Kumar and S. Zilberstein. Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 10871088, Taipei, Taiwan, 2011. [abs] [bib] [pdf]
 MessagePassing Algorithms for Quadratic Programming
Formulations of MAP Estimation.
A. Kumar and S. Zilberstein. Proceedings of the TwentySeventh Conference on Uncertainty in Artificial Intelligence (UAI), 428435, Barcelona, Spain, 2011. [abs] [bib] [pdf]
 Scalable Multiagent Planning Using Probabilistic
Inference.
A. Kumar, S. Zilberstein, and M. Toussaint. Proceedings of the TwentySecond International Joint Conference on Artificial Intelligence (IJCAI), 21402146, Barcelona, Spain, 2011. [abs] [bib] [pdf]
 Online Planning for Ad Hoc Autonomous Agent
Teams.
F. Wu, S. Zilberstein, and X. Chen. Proceedings of the TwentySecond International Joint Conference on Artificial Intelligence (IJCAI), 439445, Barcelona, Spain, 2011. [abs] [bib] [pdf]
 On MessagePassing MAP Estimation in Graphical Models and
DCOPs.
A. Kumar, W. Yeoh, and S. Zilberstein. Proceedings of the Thirteenth International Workshop on Distributed Constraint Reasoning (DCR), 5770, Barcelona, Spain, 2011. [abs] [bib] [pdf]
 Bounded Rationality in Multiagent Systems Using
Decentralized Metareasoning.
A. Carlin and S. Zilberstein. In T. Guy, M. Karny, and D. Wolpert (Eds.), Decision Making with Imperfect Decision Makers, 128, Springer, 2011. [abs] [bib] [pdf]
 PointBased Backup for Decentralized POMDPs: Complexity and
New Algorithms.
A. Kumar and S. Zilberstein. Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 13151322, Toronto, Canada, 2010. [abs] [bib] [pdf]
 PointBased Policy Generation for Decentralized
POMDPs.
F. Wu, S. Zilberstein, and X. Chen. Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 13071314, Toronto, Canada, 2010. [abs] [bib] [pdf]
 Anytime Planning for Decentralized POMDPs
using Expectation Maximization.
A. Kumar and S. Zilberstein. Proceedings of the TwentySixth Confererence on Uncertainty in Artificial Intelligence (UAI), Catalina Island, California, 2010. [abs] [bib] [pdf]
 Rollout Sampling Policy Iteration for Decentralized
POMDPs.
F. Wu, S. Zilberstein, and X. Chen. Proceedings of the TwentySixth Confererence on Uncertainty in Artificial Intelligence (UAI), Catalina Island, California, 2010. [abs] [bib] [pdf]
 FiniteState Controllers Based on Mealy Machines for
Centralized and Decentralized POMDPs.
C. Amato, B. Bonet, and S. Zilberstein. Proceedings of the TwentyFourth Conference on Artificial Intellgence (AAAI), Atlanta, Georgia, 2010. [abs] [bib] [pdf]
 TrialBased Dynamic Programming for MultiAgent
Planning.
F. Wu, S. Zilberstein, and X. Chen. Proceedings of the TwentyFourth Conference on Artificial Intellgence (AAAI), Atlanta, Georgia, 2010. [abs] [bib] [pdf]
 Policy Iteration for Decentralized Control of Markov Decision Processes.
D.S. Bernstein, C. Amato, E.A. Hansen, and S. Zilberstein. Journal of Artificial Intelligence Research, 34:89132, 2009. [abs] [bib] [pdf]
 A Bilinear Programming Approach for Multiagent Planning.
M. Petrik and S. Zilberstein. Journal of Artificial Intelligence Research, 35:235274, 2009. [abs] [bib] [pdf]
 Analyzing Myopic Approaches for MultiAgent Communications.
R. Becker, A. Carlin, V. Lesser, and S. Zilberstein. Computational Intelligence, 25(1):3150, 2009. [abs] [bib] [pdf]
 Optimizing FixedSize Stochastic Controllers
for POMDPs and Decentralized POMDPs.
C. Amato, D.S. Bernstein, and S. Zilberstein. Autonomous Agents and MultiAgent Systems, 2009. [abs] [bib] [pdf]
 Achieving Goals in Decentralized POMDPs.
C. Amato and S. Zilberstein. Proceedings of the Eighth International Conference on Autonomous Systems and Multiagent Systems (AAMAS), 593600, Budapest, Hungary, 2009. [abs] [bib] [pdf]
 ConstraintBased Dynamic Programming for Decentralized POMDPs with Structured Interactions.
A. Kumar and S. Zilberstein. Proceedings of the Eighth International Conference on Autonomous Systems and Multiagent Systems (AAMAS), 561568, Budapest, Hungary, 2009. [abs] [bib] [pdf]
 Value of Communication In Decentralized POMDPs.
A. Carlin and S. Zilberstein. AAMAS Workshop on MultiAgent Sequential Decision Making in Uncertain Domains (MSDM), Budapest, Hungary, 2009. [abs] [bib] [pdf]
 Dynamic Programming Approximations for Partially Observable Stochastic Games.
A. Kumar and S. Zilberstein. Proceedings of the TwentySecond International FLAIRS Conference, Sanibel Island, Florida, 2009. [abs] [bib] [pdf]
 EventDetecting MultiAgent MDPs: Complexity and ConstantFactor Approximation.
A. Kumar and S. Zilberstein. Proceedings of the TwentyFirst International Joint Conference on Artificial Intelligence (IJCAI), 201207, Pasadena, California, 2009. [abs] [bib] [pdf]
 Incremental Policy Generation for FiniteHorizon DECPOMDPs.
C. Amato, J.S. Dibangoye, and S. Zilberstein. Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), Thessaloniki, Greece, 2009. [abs] [bib] [pdf]
 MultiAgent Online Planning with Communication.
F. Wu, S. Zilberstein, and X. Chen. Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), Thessaloniki, Greece, 2009. [abs] [bib] [pdf]
 Myopic and NonMyopic Communication Under Partial Observability.
A. Carlin and S. Zilberstein. Proceedings of Intelligent Agent Technology (IAT), Milan, Italy, 2009. [abs] [bib] [pdf]
 Complexity of Decentralized Control: Special
Cases.
M. Allen and S. Zilberstein. Proceedings of Neural Information Processing Systems (NIPS), Vancouver, British Columbia, December 2009. [abs] [bib] [pdf]
 CommunicationBased Decomposition Mechanisms for Decentralized MDPs.
C.V. Goldman and S. Zilberstein. Journal of Artificial Intelligence Research, 32:169202, 2008. [abs] [bib] [pdf]
 Formal Models and Algorithms for Decentralized Decision Making under Uncertainty.
S. Seuken and S. Zilberstein. Autonomous Agents and MultiAgent Systems, 17(2):190250, October, 2008. [abs] [bib] [pdf]
 A Successive Approximation Algorithm for Coordination Problems.
M. Petrik and S. Zilberstein. Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), Ft. Lauderdale, Florida, 2008. [abs] [bib] [pdf]
 ValueBased Observation Compression for DECPOMDPs.
A. Carlin and S. Zilberstein. Proceedings of the Seventh International Conference on Autonomous Systems and Multiagent Systems (AAMAS), 501508, Estoril, Portugal, 2008. [abs] [bib] [pdf]
 Heuristic Policy Iteration for InfiniteHorizon Decentralized POMDPs.
C. Amato and S. Zilberstein. AAMAS Workshop on MultiAgent Sequential Decision Making in Uncertain Domains (MSDM), 115, Estoril, Portugal, 2008. [abs] [bib] [pdf]
 Observation Compression in DECPOMDP Policy Trees.
A. Carlin and S. Zilberstein. AAMAS Workshop on MultiAgent Sequential Decision Making in Uncertain Domains (MSDM), 3145, Estoril, Portugal, 2008. [abs] [bib] [pdf]
 Optimizing FixedSize Stochastic Controllers for POMDPs.
C. Amato, D.S. Bernstein, and S. Zilberstein. AAAI Workshop on Advancements in POMDP Solvers, Chicago, Illinois, 2008. [abs] [bib] [pdf]
 POMDP and DECPOMDP PointBased Observation Aggregation.
A. Carlin and S. Zilberstein. AAAI Workshop on Advancements in POMDP Solvers, Chicago, Illinois, 2008. [abs] [bib] [pdf]
 Interaction Structure and Dimensionality in Decentralized Problem Solving
M. Allen, M. Petrik, and S. Zilberstein. Technical Report 0811, Computer Science Department, University of Massachusetts Amherst, 2008. [abs] [bib] [pdf]
 Learning to Communicate in a Decentralized Environment.
C.V. Goldman, M. Allen, and S. Zilberstein. Autonomous Agents and MultiAgent Systems, 15(1):4790, 2007. [abs] [bib] [pdf]
 Solving POMDPs Using Quadratically Constrained Linear Programs.
C. Amato, D.S. Bernstein, and S. Zilberstein. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 24182424, Hyderabad, India, 2007. [abs] [bib] [pdf]
 MemoryBounded Dynamic Programming for DECPOMDPs.
S. Seuken and S. Zilberstein. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 20092015, Hyderabad, India, 2007. [abs] [bib] [pdf]
 Bounded Dynamic Programming for Decetralized POMDPs.
C. Amato, A. Carlin, and S. Zilberstein. AAMAS Workshop on MultiAgent Sequential Decision Making in Uncertain Domains, Honolulu, Hawaii, May, 2007. [abs] [bib] [pdf]
 Optimizing MemoryBounded Controllers for Decentralized POMDPs.
C. Amato, D.S. Bernstein, and S. Zilberstein. Proceedings of the Twenty Third Conference on Uncertainty in Artificial Intelligence (UAI), Vancouver, British Columbia, 2007. [abs] [bib] [pdf]
 Improved MemoryBounded Dynamic Programming for Decentralized POMDPs.
S. Seuken and S. Zilberstein. Proceedings of the Twenty Third Conference on Uncertainty in Artificial Intelligence (UAI), Vancouver, British Columbia, 2007. [abs] [bib] [pdf]
 Agent Influence as a Predictor of Difficulty for Decentralized ProblemSolving.
M. Allen and S. Zilberstein. Proceedings of the TwentySecond Conference on Artificial Intelligence (AAAI), 688693, Vancouver, British Columbia, 2007. [abs] [bib] [pdf]
 Anytime Coordination Using Separable Bilinear Programs.
M. Petrik and S. Zilberstein. Proceedings of the TwentySecond Conference on Artificial Intelligence (AAAI), 750755, Vancouver, British Columbia, 2007. [abs] [bib] [pdf]