Robby the Robot
Copyright (C) 2009 Adam J. DiCarlo
Robby is a robot who tries to pick up as many littered soda cans as possible in a 10x10 (100-cell) office, which can
have up to 1 can per cell. More about Robby is at
http://adamdicarlo.com/genetic-algorithm-robby-the-robot .
Please file an issue on GitHub if you find compile errors with a certain software combination or any other bug.
Settings in which many people and many computer systems work together, despite being distributed both geographically and in time, are increasingly the norm. Examples of group activities in which computer systems participate, whether as autonomous agents or as proxies for individual people, include online auctions, hospital care-delivery systems, emergency response systems, military systems, and systems administration groups. These increasingly prevalent, heterogeneous group activities of computer systems and people, whether competitive, cooperative, or collaborative, frequently require decision-making on the part of autonomous-agent systems or the support of decision-making by people.
A wide variety of issues arise in the decision making that accompanies task-oriented group activities. How do groups decide who should perform which tasks in service of which goals? How are coalitions formed? How do agents coordinate their behavior? How do agents select an appropriate recipe for achieving a goal from among the set that might be applicable? How are tasks allocated among agents capable of performing them? How do agents reconcile their current intentions to perform actions in service of group goals with new opportunities that arise? How do agents avoid conflicts in their plans for achieving goals? How does altruistic or cooperative behavior arise in individual behaviors? What incentives can be provided to encourage cooperation, helpfulness, or other group-benefiting behaviors when agents are designed independently or serve different organizations?
Professors Barbara Grosz and Sarit Kraus developed the game Colored Trails (CT) as a testbed for investigating the decision-making that arises in task settings, where the key interactions are among goals (individual and group), tasks required to accomplish those goals, and resources needed to perform the tasks. CT allows the modeling of all of these phenomena and exploration of their causes and ramifications. It provides the basis for a testbed that supports investigations of human decision-making and comparisons among computational strategies. Such a testbed enables human decision-making to be studied both in groups comprising only people and in heterogeneous groups of people and computer systems, and computational strategies to be studied both in settings where computational agents interact only with other such agents and in heterogeneous settings. CT is parameterized in ways that allow for increasing complexity along a number of different dimensions that influence the performance of different approaches to decision-making. It allows for specification of different reward structures, enabling examination of such trade-offs as the importance of the performance of others or the group as a whole to the outcome of an individual and the cost-benefit trade-offs of collaboration-supporting actions. The game parameters may be set to vary environmental features such as task complexity, availability of and access to task-related information, and dependencies between agents. Although several testbeds and competitions have been developed to test strategies for automated agents operating in multi-agent settings, CT is the first testbed to be designed to investigate decision-making in heterogeneous groups of people and computer systems. It is thus novel in addressing the need to understand how computer agents should behave when they are participants in group activities that include people.
Structure of the game
Colored Trails (CT) is played by two or more players on a rectangular board of colored squares. The rules are simple: Each player is given a starting position, a goal position on the board, and a set of chips in colors taken from the same palette as the squares. Players may advance toward their goals by moving to an adjacent board square. Such a move is allowed only if the player has a chip of the same color as the square, and the player must turn in the chip to carry out the move. Players may negotiate with their peers to exchange chips. Communication is controlled; players use a fixed but expressive messaging protocol.
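The movement rule can be illustrated with a minimal Python sketch. This is not code from the CT project: the function names, the list-of-lists board, and the chip representation are all hypothetical, and the sketch assumes "adjacent" means sharing an edge, which the description does not specify.

```python
from collections import Counter


def is_adjacent(a, b):
    """True if squares a and b share an edge (assumed 4-neighbour adjacency)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def try_move(board, position, chips, target):
    """Return (new_position, remaining_chips) if the move is legal, else None.

    board is a list of rows of color names; chips is a Counter of color -> count.
    The caller's chips are left unmodified.
    """
    rows, cols = len(board), len(board[0])
    r, c = target
    if not (0 <= r < rows and 0 <= c < cols):
        return None  # target is off the board
    if not is_adjacent(position, target):
        return None  # only moves to an adjacent square are allowed
    color = board[r][c]
    if chips[color] < 1:
        return None  # the player holds no chip of the square's color
    remaining = chips.copy()
    remaining[color] -= 1  # the chip is turned in to carry out the move
    return target, remaining


# Hypothetical 2x3 board and chip set, for illustration only.
board = [
    ["red", "blue", "green"],
    ["blue", "red", "blue"],
]
chips = Counter({"blue": 1, "red": 1})
```

With this board, a player at (0, 0) can move to the blue square at (0, 1) by spending the blue chip, but cannot continue to the green square at (0, 2) without first obtaining a green chip, for instance through an exchange with another player.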
The scoring function, which determines the payoff to the individual players, is a parameter of CT game instances and may depend on a variety of factors. At its simplest, it may consist of a weighted sum of such components as whether the individual reached the goal, the final distance of the agent from the goal, and the final number of chips the agent held. The scoring function may be varied to reflect different possible social policies and utility trade-offs, establishing a context in which to investigate the effects of different decision-making mechanisms.
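As a rough illustration of the simplest case, the weighted sum might look like the Python sketch below. The function name and the default weights are assumptions chosen for the example; the description does not give actual values.

```python
def score(reached_goal, distance_to_goal, chips_left,
          w_goal=100.0, w_distance=-10.0, w_chip=5.0):
    """Weighted sum of the three components named in the text.

    The weights are hypothetical; a CT game instance would set its own.
    """
    return (w_goal * (1 if reached_goal else 0)
            + w_distance * distance_to_goal
            + w_chip * chips_left)
```

Under these assumed weights, a player who reaches the goal holding three chips scores 115, while a player who stops two squares short with one chip scores -15.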
For example, by varying the relative weights of individual and group good in the scoring function, we can make collaborative behavior more or less beneficial. Despite the simplicity of the rules, play of CT is able to model a broad range of aspects of task situations in which a group of agents perform actions; it allows for scenarios in which agents act as individuals or in teams or both. Traversing a path through the board corresponds to performing a complex task, the constituents of which are the individual tasks represented by each square. Different colors represent different tasks. The existence of multiple paths to a goal corresponds to the availability of different "recipes" or methods for achieving goals. The possession of a chip of a particular color corresponds to having the skills and resources needed for a task, and being able to deploy them at the appropriate time. Not all players get chips of all colors, much as agents have different capabilities, availability, and resources. The exchange of chips corresponds to agents producing the intended effects of actions for each other; in different domain settings an exchange could be taken to correspond to agents providing resources to each other, doing tasks for them, or enabling them in some other way to do tasks they otherwise could not do. The game environment may be set to model different knowledge conditions as well. For example, varying the amount of the board an agent can "see" corresponds to varying information about task constituents or resource requirements, whereas varying the information players have about each other's chips corresponds to varying information agents have about the capabilities of others. Various requirements on player objectives, goal squares, and paths correspond to different types of group activities and collaborative tasks.
To distinguish cooperative settings from those in which agents act independently, the scoring function may have a significant factor in which an agent's reward depends on others' performance.
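One way to picture such a factor is to blend each player's own score with the average score of the other players. The Python sketch below is an assumption made for illustration, not the scoring rule used by CT; the function name and the `group_weight` parameter are hypothetical.

```python
def blended_scores(individual_scores, group_weight):
    """Mix each player's own score with the mean score of the other players.

    group_weight = 0 models fully independent agents; values closer to 1
    make a player's reward depend more heavily on others' performance.
    """
    if not 0.0 <= group_weight <= 1.0:
        raise ValueError("group_weight must be between 0 and 1")
    n = len(individual_scores)
    if n < 2:
        raise ValueError("at least two players are required")
    total = sum(individual_scores)
    result = []
    for own in individual_scores:
        others_mean = (total - own) / (n - 1)
        result.append((1 - group_weight) * own + group_weight * others_mean)
    return result
```

For two players with individual scores 100 and 0, a `group_weight` of 0 leaves the scores unchanged, while 0.5 gives both players 50, so helping the other player becomes as valuable as helping oneself.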