The project aims to investigate the problems and challenges of implementing adversarial behavior in dynamic environments from a game-theoretic perspective, and to study the trade-offs between long-term (deliberative) and short-term (reactive) multi-agent planning in adversarial scenarios. The empirical evaluation of the project aimed at validation in multi-agent simulations of urban warfare scenarios involving heterogeneous teams composed of UAVs, VTOLs and UGVs.
Tactical AgentFly phases I, II and III aimed at investigating techniques for coordinated area information collection, from basic ones implementing surveillance and tracking of a single target by a single UAV to advanced ones enabling tracking of multiple targets with a relatively small number of UAVs. The third phase of the project series, ongoing at this time, aims at porting the previously developed techniques to real hardware UAVs and VTOLs and is geared towards a real-world demonstrator in a mixed simulation.
Tactical AgentScout phases I and II. The first phase aimed at integrating aerial information collection in ISTAR-type missions with different types of ground assets; the main focus of the project was on multi-agent planning and task-allocation problems.
The second phase of the AgentScout project focused on four main objectives:
In patrolling scenarios a defender (also termed a patroller) needs to protect multiple targets against an attacker. There are numerous examples of real-world scenarios that correspond to the patrolling problem, such as protecting a perimeter, covering an area, or protecting high-value targets. The common assumption in all of these examples is that the attacker can choose to attack any target at any time, and that the attack requires a specific completion time (i.e., a number of steps) during which the attacker can be discovered by the defender.
Game theory is a suitable framework for solving such problems, as the solutions it provides are optimal strategies for the defender given the opponent's information, capabilities and intentions. Game-theoretic models have been successfully used to solve specific variants of patrolling games in previous works, and there are several successful applications of similar game-theoretic models in real-world scenarios. We follow the previous work and extend existing game-theoretic patrolling models to allow the high-value targets to change their positions over time. Besides the theoretical work, we integrated this method within the Tactical AgentScout 2 project and evaluated it experimentally on a scenario in which an aerial patroller protects two convoys moving through an adversarial area. The experimental results showed that the game-theoretic model outperforms the baseline non-adversarial approach. Moreover, the game-theoretic model performed better even though not all of its assumptions were satisfied in the experimental settings.
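As a toy illustration of the kind of randomized defender strategy such models produce, the sketch below solves a hypothetical two-target zero-sum patrolling game by fictitious play. This is a standard equilibrium-finding technique, not the project's actual solver, and the game and its payoff numbers are invented for illustration.

```python
# Hypothetical 2-target patrolling game as a zero-sum matrix game.
# U[d][a] = probability the defender detects an attack on target a
# while patrolling target d (illustrative numbers only).
U = [[0.9, 0.2],
     [0.3, 0.8]]

def fictitious_play(U, iters=20000):
    """Approximate an equilibrium: each round, both players best-respond
    to the opponent's empirical frequency of past plays."""
    nd, na = len(U), len(U[0])
    d_counts = [0] * nd   # defender's play counts per patrol choice
    a_counts = [0] * na   # attacker's play counts per target
    d_counts[0] += 1
    a_counts[0] += 1
    for _ in range(iters):
        # Defender maximizes expected detection vs. attacker's mixture.
        d = max(range(nd),
                key=lambda i: sum(U[i][j] * a_counts[j] for j in range(na)))
        # Attacker minimizes expected detection vs. defender's mixture.
        a = min(range(na),
                key=lambda j: sum(U[i][j] * d_counts[i] for i in range(nd)))
        d_counts[d] += 1
        a_counts[a] += 1
    total = iters + 1
    return [c / total for c in d_counts], [c / total for c in a_counts]

d_strat, a_strat = fictitious_play(U)
```

For this particular matrix the equilibrium defender strategy mixes the two patrol choices roughly 5/12 to 7/12, so the defender deliberately randomizes rather than always guarding the "more important" target; this unpredictability is exactly what the baseline non-adversarial approach lacks.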
For smart targets, we developed algorithms for controlling a target that actively avoids detection, as well as algorithms for tracking and capture by a team of pursuers. These scenarios are known as variants of pursuit-evasion games. The objective of the team of pursuers in these games can be either to capture the evader, or to maximize the time during which the evader is observed by the pursuers. The algorithms we provide for creating behaviors in these games are based on general algorithms for playing (imperfect-information) games. For games with full observability of the agents, we base the algorithms on Monte-Carlo Tree Search (MCTS) with action selection based on the Upper Confidence Bound, which has recently been very successful in a variety of domains. In partially observable games, we base our approaches on the paranoid version of information-set search, which we use both with the more standard minimax algorithm and with MCTS. We tackled scalability issues by adding domain-specific knowledge to the game-playing algorithms: we designed domain-specific "heavy" simulation methods for MCTS and used a procedural-knowledge heuristic to prune the search space.
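A minimal sketch of MCTS with UCB1 action selection (UCT) is shown below on a toy subtraction game, which stands in for the much larger pursuit-evasion games discussed above; the game, node layout and function names are our own illustrative choices, not the project's implementation.

```python
import math
import random

# Toy game: players alternately remove 1-3 stones; whoever takes the
# last stone wins. Losing positions are multiples of 4.

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits, self.wins = 0, 0.0  # wins from the view of the mover INTO this node

def ucb1(n, c=1.4):
    # UCB1: exploitation term plus exploration bonus
    return n.wins / n.visits + c * math.sqrt(math.log(n.parent.visits) / n.visits)

def playout_wins(stones):
    """True if the player to move wins under uniformly random play."""
    mover = True
    while stones > 0:
        stones -= random.choice([m for m in (1, 2, 3) if m <= stones])
        if stones == 0:
            return mover
        mover = not mover
    return False

def uct_best_move(stones, iters=4000):
    root = Node(stones)
    for _ in range(iters):
        node = root
        # 1. Selection: descend via UCB1 while fully expanded
        while not node.untried and node.children:
            node = max(node.children, key=ucb1)
        # 2. Expansion: add one untried child
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout; result is 1.0 if the player
        #    who moved into `node` ends up winning
        result = 0.0 if playout_wins(node.stones) else 1.0
        # 4. Backpropagation: flip the result at each level
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

random.seed(1)
best = uct_best_move(5)  # the only winning move from 5 stones is to take 1
```

The "heavy" simulations mentioned in the text would replace the uniformly random `playout_wins` with a domain-informed playout policy, trading simulation speed for more informative value estimates.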
Plan repairing is a process of partial adaptation of a plan during its execution according to new conditions in the environment. The plan need not be altered as a whole; only the local part that is inconsistent with the new conditions has to be changed. In contrast, re-planning is a process of restarting planning during the execution of a plan. The new planning process starts from the current state under the current context; re-planning does not reuse any parts of the old (inconsistent) plan.
The research challenge in the area of plan repairing and re-planning included finding answers to questions such as: "For what problem types do we need plan repairing, and for what types re-planning?" The hypothesis supporting the plan-repairing approach required local and relatively rare unpredictable effects. The re-planning approach does not depend on the amount or impact of the unpredictable effects, as the planning process always has to be run from the current state to the goal state. On the other hand, re-planning requires larger amounts of information to be shared among the planners (agents) in order to reconstruct the global state and the context of the other agents.
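The local nature of plan repair can be sketched in a minimal STRIPS-style setting (hypothetical domain and helper names, not the project's algorithm): simulate the plan until some action's preconditions break in the new state, then search for a short bridge plan that restores them, keeping the remaining suffix of the plan intact.

```python
from collections import deque

def applicable(state, action):
    return action["pre"] <= state

def apply_action(state, action):
    return (state - action["del"]) | action["add"]

def bfs_bridge(start, goal, actions, limit=6):
    """Breadth-first search for a short plan from `start` to any state
    satisfying `goal` (the broken action's preconditions)."""
    frontier = deque([(frozenset(start), [])])
    seen = {frozenset(start)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        if len(plan) == limit:
            continue
        for a in actions:
            if applicable(state, a):
                nxt = frozenset(apply_action(set(state), a))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a]))
    return None

def repair(state, plan, actions):
    """Simulate `plan`; on the first inapplicable action, splice in a
    bridge plan instead of re-planning from scratch."""
    s = set(state)
    for i, a in enumerate(plan):
        if applicable(s, a):
            s = apply_action(s, a)
        else:
            bridge = bfs_bridge(s, a["pre"], actions)
            if bridge is None:
                return None  # repair failed: fall back to full re-planning
            return plan[:i] + bridge + plan[i:]
    return plan

# Illustrative move domain: mv(x, y) moves the agent from x to y.
def mv(x, y):
    return {"name": f"mv-{x}-{y}", "pre": {f"at-{x}"},
            "add": {f"at-{y}"}, "del": {f"at-{x}"}}

actions = [mv("A", "B"), mv("B", "C"), mv("D", "A")]
plan = [mv("A", "B"), mv("B", "C")]
# The agent was unexpectedly displaced to D; repair splices in mv-D-A
# and keeps the original suffix intact.
repaired = repair({"at-D"}, plan, actions)
```

Note how the repair touches only the broken prefix: the suffix of the old plan is reused verbatim, which is exactly what distinguishes repair from re-planning in the discussion above.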
For an experimental validation and evaluation of the outlined hypotheses we designed a testing scenario in the context of the Tactical AgentScout 2 project. The results of the synthetic experiments supported the premise that, from the perspective of algorithmic complexity, plan-repairing algorithms are much more suitable for highly dynamic environments, sacrificing only a small increase in plan length. Additionally, the results showed that the plan-repairing technique can also be used in cooperation with other problem-solving approaches for additional adaptation or fixing of more efficiently generated plans. The complexity of the problem is thus transformed into the complexity of the plan-repairing problem, which, using known planning approaches, can provide better insights into the problem and well-known heuristics. Finally, we demonstrated the multi-agent plan-repairing technique in the form of distributed continual planning of a robotic team in a highly dynamic and unpredictable environment.
In the coordination and teamwork research track, we studied methods that allow us to specify the joint mission of a number of agents. In particular, we focused on how to specify the desired sequence of actions an agent needs to perform in order to effectively pursue the joint goals of the team.
For this purpose we examined commitment machines, a framework for monitoring communication protocols. In this framework, each synchronization point in the protocol can be defined as a commitment one agent makes to another agent. A protocol cannot proceed to later stages until all the relevant commitments are discharged. Such a mechanism turns out to be suitable also for the specification of desired interactions between two or more agents. Distributed commitment machines are a more realistic flavor of commitment machines that allows the mechanism to be executed in a distributed way, i.e., by each involved agent.
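The gating idea can be sketched as follows (a hypothetical minimal API of our own devising, not the Jazzyk-based implementation used in the project): each commitment records a debtor and a creditor, only the debtor may discharge it, and a protocol stage is enabled only once all of its required commitments are discharged.

```python
class CommitmentMachine:
    """Toy commitment store gating protocol progression."""

    def __init__(self):
        self.commitments = {}

    def create(self, cid, debtor, creditor):
        # debtor commits to creditor; initially not discharged
        self.commitments[cid] = {"debtor": debtor, "creditor": creditor,
                                 "discharged": False}

    def discharge(self, cid, agent):
        c = self.commitments[cid]
        if c["debtor"] != agent:
            raise ValueError(f"{agent} is not the debtor of {cid}")
        c["discharged"] = True

    def can_proceed(self, required):
        # the protocol may advance only once every required
        # commitment has been discharged
        return all(self.commitments[cid]["discharged"] for cid in required)

# Usage: a leader may issue the next move order only after both
# scouts have reported their positions (illustrative agent names).
cm = CommitmentMachine()
cm.create("report-1", debtor="scout1", creditor="leader")
cm.create("report-2", debtor="scout2", creditor="leader")
cm.discharge("report-1", "scout1")
blocked = cm.can_proceed({"report-1", "report-2"})  # one report missing
cm.discharge("report-2", "scout2")
ready = cm.can_proceed({"report-1", "report-2"})
```

A distributed commitment machine would replicate such a store at each agent and synchronize discharge events by messaging, so that every participant can evaluate `can_proceed` locally.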
In order to evaluate the method, we implemented a simple example scenario in which a team of troops moves in formation through an urban area. The modelled team consists of one leader with four subordinate troops. The team moves in a coordinated fashion so that its members jointly cover the surrounding area. The mechanism was implemented in the Jazzyk agent programming language, which gave us the advantage of being able to combine different knowledge representation techniques in our program. Using the technologies mentioned above, we were able to build agents exhibiting robust coordinated behavior, specified in the form of a short declarative program.
Peter Novak (project coordinator, contact person), Antonin Komenda, Jiri Vokrinek, Michal Jakob, Eduard Semsch, Dusan Pavlicek, Lukas Chrpa, Branislav Bosansky, Michal Cap, Viliam Lisy and Michal Pechoucek (principal investigator)
The work presented is supported by US ARMY, CERDEC projects W911NF-08-1-0521 1312AM0, W911NF-08-1-0521, BAA 8020902.A/W15P7T-05-R-P209, W911NF-10-1-0112, and by the Czech Ministry of Education, Youth and Sports under Research Programme no. MSM6840770038: Decision Making and Control for Manufacturing III.