Decision-Theoretic Foundations for Multi-Agent Planning

Sponsored by:
National Science Foundation, Division of Information & Intelligent Systems
Air Force Office of Scientific Research, Systems and Software Program

Shlomo Zilberstein, PI
Research Assistants: Christopher Amato, Alan Carlin, Akshat Kumar, Marek Petrik
Collaborators: Martin Allen, Xiaoping Chen, Claudia Goldman, Feng Wu

Project Description

A fundamental challenge in artificial intelligence is how to achieve intelligent coordination of a group of decision makers in spite of stochasticity and limited information. Decision theory offers a normative framework for optimizing decisions under uncertainty. But due to computation complexity barriers, developing decision-theoretic reasoning algorithms for multi-agent systems is a serious challenge. In this project we develops a comprehensive approach to accomplish this using the framework of decentralized Markov decision processes. Target applications of this work involve a group of decision makers or agents that operate under uncertainty and rely on noisy sensors. Each agent may obtain different partial information about the global state. Such decentralized planning and coordination problems abound in practice. Examples include autonomous mobile robots, distributed management of servers or a power grid, as well as the operation of complex human organizations. In such problems, centralized planning is inadequate, because agents cannot share all their information all the time.

Despite rapid progress in recent years in decentralized decision making, existing algorithms can only solve small "toy" problems. Compared with single-agent MDPs, research of decentralized MDPs is in its infancy. Classical MDPs and POMDPs solution techniques are often inapplicable because of some fundamental differences between centralized and decentralized operation. The goal of this project is to develop well-founded coordination strategies that are widely applicable and scalable to large domains. This requires the development of approximation algorithms that alleviate the large memory requirements of current solution techniques and scale up to larger problems. It is also important to derive error bounds for these algorithms that would make it possible to decide when a satisfactory plan is available. To evaluate these algorithms, we are also developing a new set of challenging test problems and benchmarks.

The resulting solution techniques are based on several principles: (a) combining dynamic programming with heuristic search to produce scalable solution techniques, (b) developing observation compression methods that identify the most relevant subset of observations to be considered while bounding the error, (c) developing algorithms that represent policies compactly as finite-state controllers, (d) exploiting randomization and correlation mechanisms to improve the value of compact policies, (e) developing quantitative approaches to represent and reason about interaction and to manage the tradeoff between the degree of interaction and problem hardness, and (f) exploiting the locality and structure of interactions between agents to improve the efficiency of coordination algorithms. The evaluation of these techniques has already shown that they can improve the scalability of existing algorithms by several orders of magnitude.

Useful Links

Related Publications

  • Qualitative Planning under Partial Observability in Multi-Agent Domains.

    R.I. Brafman, G. Shani, and S. Zilberstein. Proceedings of the Twenty-Seventh Conference on Artificial Intelligence (AAAI), Bellevue, Washington, 2013. [abs] [bib] [pdf]

  • Monte-Carlo Expectation Maximization for Decentralized POMDPs.

    F. Wu, S. Zilberstein, and N.R. Jennings. Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013. [abs] [bib] [pdf]

  • Parameter Learning for Latent Network Diffusion.

    X. Wu, A. Kumar, D. Sheldon, and S. Zilberstein. Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013. [abs] [bib] [pdf]

  • Automated Generation of Interaction Graphs for Value-Factored Dec-POMDPs.

    W. Yeoh, A. Kumar, and S. Zilberstein. Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013. [abs] [bib] [pdf]

  • Multiagent Planning, Control, and Execution.

    E. Durfee and S. Zilberstein. In G. Weiss (Ed.), Multiagent Systems, Second Edition, 485-546, MIT Press, 2013. [abs] [bib] [pdf]

  • Message-Passing Algorithms for MAP Estimation Using DC Programming.

    A. Kumar, S. Zilberstein, and M. Toussaint. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 656-664, 2012. [abs] [bib] [pdf]

  • Online Planning for Multi-Agent Systems with Bounded Communication.

    F. Wu, S. Zilberstein, and X. Chen. Artificial Intelligence, 175(2):487-511, 2011. [abs] [bib] [pdf]

  • Decentralized Monitoring of Anytime Decision Making.

    A. Carlin and S. Zilberstein. Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 157-164, Taipei, Taiwan, 2011. [abs] [bib] [pdf]

  • Message Passing Algorithms for Large Structured Decentralized POMDPs.

    A. Kumar and S. Zilberstein. Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1087-1088, Taipei, Taiwan, 2011. [abs] [bib] [pdf]

  • Message-Passing Algorithms for Quadratic Programming Formulations of MAP Estimation.

    A. Kumar and S. Zilberstein. Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI), 428-435, Barcelona, Spain, 2011. [abs] [bib] [pdf]

  • Scalable Multiagent Planning Using Probabilistic Inference.

    A. Kumar, S. Zilberstein, and M. Toussaint. Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI), 2140-2146, Barcelona, Spain, 2011. [abs] [bib] [pdf]

  • Online Planning for Ad Hoc Autonomous Agent Teams.

    F. Wu, S. Zilberstein, and X. Chen. Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI), 439-445, Barcelona, Spain, 2011. [abs] [bib] [pdf]

  • On Message-Passing MAP Estimation in Graphical Models and DCOPs.

    A. Kumar, W. Yeoh, and S. Zilberstein. Proceedings of the Thirteenth International Workshop on Distributed Constraint Reasoning (DCR), 57-70, Barcelona, Spain, 2011. [abs] [bib] [pdf]

  • Bounded Rationality in Multiagent Systems Using Decentralized Metareasoning.

    A. Carlin and S. Zilberstein. In T. Guy, M. Karny, and D. Wolpert (Eds.), Decision Making with Imperfect Decision Makers, 1-28, Springer, 2011. [abs] [bib] [pdf]

  • Point-Based Backup for Decentralized POMDPs: Complexity and New Algorithms.

    A. Kumar and S. Zilberstein. Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1315-1322, Toronto, Canada, 2010. [abs] [bib] [pdf]

  • Point-Based Policy Generation for Decentralized POMDPs.

    F. Wu, S. Zilberstein, and X. Chen. Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1307-1314, Toronto, Canada, 2010. [abs] [bib] [pdf]

  • Anytime Planning for Decentralized POMDPs using Expectation Maximization.

    A. Kumar and S. Zilberstein. Proceedings of the Twenty-Sixth Confererence on Uncertainty in Artificial Intelligence (UAI), Catalina Island, California, 2010. [abs] [bib] [pdf]

  • Rollout Sampling Policy Iteration for Decentralized POMDPs.

    F. Wu, S. Zilberstein, and X. Chen. Proceedings of the Twenty-Sixth Confererence on Uncertainty in Artificial Intelligence (UAI), Catalina Island, California, 2010. [abs] [bib] [pdf]

  • Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs.

    C. Amato, B. Bonet, and S. Zilberstein. Proceedings of the Twenty-Fourth Conference on Artificial Intellgence (AAAI), Atlanta, Georgia, 2010. [abs] [bib] [pdf]

  • Trial-Based Dynamic Programming for Multi-Agent Planning.

    F. Wu, S. Zilberstein, and X. Chen. Proceedings of the Twenty-Fourth Conference on Artificial Intellgence (AAAI), Atlanta, Georgia, 2010. [abs] [bib] [pdf]

  • Policy Iteration for Decentralized Control of Markov Decision Processes.

    D.S. Bernstein, C. Amato, E.A. Hansen, and S. Zilberstein. Journal of Artificial Intelligence Research, 34:89-132, 2009. [abs] [bib] [pdf]

  • A Bilinear Programming Approach for Multiagent Planning.

    M. Petrik and S. Zilberstein. Journal of Artificial Intelligence Research, 35:235-274, 2009. [abs] [bib] [pdf]

  • Analyzing Myopic Approaches for Multi-Agent Communications.

    R. Becker, A. Carlin, V. Lesser, and S. Zilberstein. Computational Intelligence, 25(1):31-50, 2009. [abs] [bib] [pdf]

  • Optimizing Fixed-Size Stochastic Controllers for POMDPs and Decentralized POMDPs.

    C. Amato, D.S. Bernstein, and S. Zilberstein. Autonomous Agents and Multi-Agent Systems, 2009. [abs] [bib] [pdf]

  • Achieving Goals in Decentralized POMDPs.

    C. Amato and S. Zilberstein. Proceedings of the Eighth International Conference on Autonomous Systems and Multiagent Systems (AAMAS), 593-600, Budapest, Hungary, 2009. [abs] [bib] [pdf]

  • Constraint-Based Dynamic Programming for Decentralized POMDPs with Structured Interactions.

    A. Kumar and S. Zilberstein. Proceedings of the Eighth International Conference on Autonomous Systems and Multiagent Systems (AAMAS), 561-568, Budapest, Hungary, 2009. [abs] [bib] [pdf]

  • Value of Communication In Decentralized POMDPs.

    A. Carlin and S. Zilberstein. AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), Budapest, Hungary, 2009. [abs] [bib] [pdf]

  • Dynamic Programming Approximations for Partially Observable Stochastic Games.

    A. Kumar and S. Zilberstein. Proceedings of the Twenty-Second International FLAIRS Conference, Sanibel Island, Florida, 2009. [abs] [bib] [pdf]

  • Event-Detecting Multi-Agent MDPs: Complexity and Constant-Factor Approximation.

    A. Kumar and S. Zilberstein. Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI), 201-207, Pasadena, California, 2009. [abs] [bib] [pdf]

  • Incremental Policy Generation for Finite-Horizon DEC-POMDPs.

    C. Amato, J.S. Dibangoye, and S. Zilberstein. Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), Thessaloniki, Greece, 2009. [abs] [bib] [pdf]

  • Multi-Agent Online Planning with Communication.

    F. Wu, S. Zilberstein, and X. Chen. Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), Thessaloniki, Greece, 2009. [abs] [bib] [pdf]

  • Myopic and Non-Myopic Communication Under Partial Observability.

    A. Carlin and S. Zilberstein. Proceedings of Intelligent Agent Technology (IAT), Milan, Italy, 2009. [abs] [bib] [pdf]

  • Complexity of Decentralized Control: Special Cases.

    M. Allen and S. Zilberstein. Proceedings of Neural Information Processing Systems (NIPS), Vancouver, British Columbia, December 2009. [abs] [bib] [pdf]

  • Communication-Based Decomposition Mechanisms for Decentralized MDPs.

    C.V. Goldman and S. Zilberstein. Journal of Artificial Intelligence Research, 32:169-202, 2008. [abs] [bib] [pdf]

  • Formal Models and Algorithms for Decentralized Decision Making under Uncertainty.

    S. Seuken and S. Zilberstein. Autonomous Agents and Multi-Agent Systems, 17(2):190-250, October, 2008. [abs] [bib] [pdf]

  • A Successive Approximation Algorithm for Coordination Problems.

    M. Petrik and S. Zilberstein. Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), Ft. Lauderdale, Florida, 2008. [abs] [bib] [pdf]

  • Value-Based Observation Compression for DEC-POMDPs.

    A. Carlin and S. Zilberstein. Proceedings of the Seventh International Conference on Autonomous Systems and Multiagent Systems (AAMAS), 501-508, Estoril, Portugal, 2008. [abs] [bib] [pdf]

  • Heuristic Policy Iteration for Infinite-Horizon Decentralized POMDPs.

    C. Amato and S. Zilberstein. AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), 1-15, Estoril, Portugal, 2008. [abs] [bib] [pdf]

  • Observation Compression in DEC-POMDP Policy Trees.

    A. Carlin and S. Zilberstein. AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), 31-45, Estoril, Portugal, 2008. [abs] [bib] [pdf]

  • Optimizing Fixed-Size Stochastic Controllers for POMDPs.

    C. Amato, D.S. Bernstein, and S. Zilberstein. AAAI Workshop on Advancements in POMDP Solvers, Chicago, Illinois, 2008. [abs] [bib] [pdf]

  • POMDP and DEC-POMDP Point-Based Observation Aggregation.

    A. Carlin and S. Zilberstein. AAAI Workshop on Advancements in POMDP Solvers, Chicago, Illinois, 2008. [abs] [bib] [pdf]

  • Interaction Structure and Dimensionality in Decentralized Problem Solving

    M. Allen, M. Petrik, and S. Zilberstein. Technical Report 08-11, Computer Science Department, University of Massachusetts Amherst, 2008. [abs] [bib] [pdf]

  • Learning to Communicate in a Decentralized Environment.

    C.V. Goldman, M. Allen, and S. Zilberstein. Autonomous Agents and Multi-Agent Systems, 15(1):47-90, 2007. [abs] [bib] [pdf]

  • Solving POMDPs Using Quadratically Constrained Linear Programs.

    C. Amato, D.S. Bernstein, and S. Zilberstein. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 2418-2424, Hyderabad, India, 2007. [abs] [bib] [pdf]

  • Memory-Bounded Dynamic Programming for DEC-POMDPs.

    S. Seuken and S. Zilberstein. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 2009-2015, Hyderabad, India, 2007. [abs] [bib] [pdf]

  • Bounded Dynamic Programming for Decetralized POMDPs.

    C. Amato, A. Carlin, and S. Zilberstein. AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains, Honolulu, Hawaii, May, 2007. [abs] [bib] [pdf]

  • Optimizing Memory-Bounded Controllers for Decentralized POMDPs.

    C. Amato, D.S. Bernstein, and S. Zilberstein. Proceedings of the Twenty Third Conference on Uncertainty in Artificial Intelligence (UAI), Vancouver, British Columbia, 2007. [abs] [bib] [pdf]

  • Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs.

    S. Seuken and S. Zilberstein. Proceedings of the Twenty Third Conference on Uncertainty in Artificial Intelligence (UAI), Vancouver, British Columbia, 2007. [abs] [bib] [pdf]

  • Agent Influence as a Predictor of Difficulty for Decentralized Problem-Solving.

    M. Allen and S. Zilberstein. Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI), 688-693, Vancouver, British Columbia, 2007. [abs] [bib] [pdf]

  • Anytime Coordination Using Separable Bilinear Programs.

    M. Petrik and S. Zilberstein. Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI), 750-755, Vancouver, British Columbia, 2007. [abs] [bib] [pdf]

shlomo@cs.umass.edu
UMass Amherst