Distributed Multi-Agent Systems

Many engineering systems can be characterized as a large scale collection of interacting subsystems each having access to local information, making local decisions, having local interactions with neighbors, and seeking to optimize local objectives that may well be in conflict with other subsystems. A representative sampling includes autonomous vehicle teams, cooperative robotics, queueing systems, distributed computing, electronic commerce, wireless networks, sensor networks, traffic control, social networks, and combat systems.

The analysis and design of such control systems falls under the broader framework of "complex and distributed systems". Other names include "multi-agent control," "cooperative control," "networked control," as well as "team theory" or "swarming." Regardless of the nomenclature, the central challenge remains the same. That is to derive desirable collective behaviors through the design of individual agent control algorithms. The potential benefits of distributed decision architectures include the opportunity for real-time adaptation (or self-organization) and robustness to dynamic uncertainties such as individual component failures, non-stationary environments, and adversarial elements. These benefits come with significant challenges, such as the complexity associated with a potentially large number of interacting agents and the analytical difficulties of dealing with overlapping and partial information.

Sensor Coverage Sudoku? Weapon Target Assignment

A Non-Cooperative Approach to Cooperative Control

My research focuses on dealing with the distributed nature of decision making and information processing through a non-cooperative game-theoretic formulation. The interactions of a distributed/multi-agent control system are modeled as a non-cooperative game among agents, with the desired collective behavior being expressed as a Nash equilibrium. There are several advantages to modeling the interactions of a multi-agent control system as a non-cooperative game where agents are "self-interested". Most notably, there is an inherent robustness to agent failures and dynamic policy constraints. The challenge of modeling a multiagent system as a non-cooperative game is to design local agent objective functions, which may very well be in conflict with one another, and implementable learning dynamics such that the collective behavior of the agents is guaranteed to satisfy the desired global objective.

For illustrative purposes, consider the problem of Sudoku. While Sudoku is not typically thought of as a distributed multi-agent system, it illustrates the same fundamental design challenges inherent within any application domain.

Global Objective: Solve puzzle

Approach: Design white cells as self-interested autonomous agents with available actions {1,2...,9}. Assign each agent a local objective function, such as the number of conflicts with the agent's neighbors (see figure), that is appropriately aligned with the global objective. Employ a distributed learning algorithm that guarantees finding a solution to the puzzle.

Regardless of the specifc application domain, the general approach remains the same. For example, in the case of dynamic sensor coverage one can design autonomous self-interested sensors as opposed to centrally conrolled resources and model their interactions as a non-cooperative game.

Learning in Games: Prescriptive vs. Descriptive

The theory of learning in games has sought to understand how and why equilibria emerge in non-cooperative games. Traditionally, social sciences would develop descriptive behavioral models for players, analyze the limiting behavior, and generalize the results for large classes of games. My research focuses on understanding these player behavioral models not from a descriptive point of view, but rather from a prescriptive point of view. The goal is to use these behavioral models as a prescriptive control approach in multi-agent systems where the guaranteed limiting behavior would represent a desirable operating condition, such as a solution to the Sudoku puzzle or a desirable allocation of sensors over a mission space. The development of learning algorithms with desirable performance guarantees is of independent interest to both fields of social sciences and engineering, but the emphasis is different. Social sciences primarily worry about rationality of models, whereas my focus is on the performance of these algorithms in large scale distributed systems.

A learning algorithm specifies precisely how players process past observations to make future decisions. In large scale systems, the sheer quantity of information makes centralized information gathering and processing computationally prohibitive. Furthermore, most of the distributed learning algorithms that guarantee convergence to an equilibrium are prohibitive in their informational and computational requirements. For example, consider the problem of congestion in a large scale network where the "drivers" or "packets" are designed as self-interested agents. In settings with a large number of agents scattered over a large network, the informational and observational requirement of the learning algorithms must be small enough to ensure that one can effectively implement such an algorithm. An example of an implementable learning algorithm is

My research has focused on deriving distributed learning algorithms with desirable convergence properties, such as minimizing the total congestion over the network. The emphasis is on the development of simple algorithms with minimal computational requirements to accommodate easy implementation in a wide variety of engineered systems.

The Cost of Localization in Large Scale Coordination

In any multi-agent system, transitioning from a centralized control scheme to a distributed control scheme where agents make decisions independently comes with several advantages as aforementioned. However, there is also a cost associated with localizing decisions. Most notably, localizing decisions may enduce new equilibria which may not be desirable with respect to the global objective. My research focuses on bounding this efficiency loss. In particular, I have focused on deriving worst case efficiency bounds, known as the price of anarchy and price of stability, for general multi-agents systems. I have analyzed the Cost of Localization for several application domains including dynamic sensor coverage, network coding, and load balancing in queueing systems.