Robots deciding their next move need help prioritizing
The team of researchers chose Capture the Flag because it’s played with two teams, each with multiple teammates, where the opposing team is also making decisions.
“Robots can learn how to react in an environment like a competitive game by using a kind of trial and error process, called reinforcement learning. They learn what actions to take in a given situation by playing the game,” said Huy Tran, a researcher in the Department of Aerospace Engineering at UIUC. “The challenge is to figure out how to create agents that can also adapt to unexpected situations.”
Tran said his team realized that the robots needed help in prioritizing tasks.
“Given the overall task of capturing the flag, there are actually sub tasks to accomplish along the way, which we model in a hierarchical structure. What we wanted to explore was whether or not this type of hierarchy would help with the ability to adapt.”
With hierarchical deep reinforcement learning, Tran said, the overall task is split into sub tasks, such as capturing the flag or tagging a member of the opposing team to eliminate them, so the model can handle more complex problems.
“By breaking the task into sub tasks, we can improve adaptation. We trained a high-level decision maker who assigns a sub task for each agent to focus on,” Tran said. The hierarchical structure also makes updates to the model simpler, he said: only the hierarchical controller would need to be updated, rather than each of the agents.
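To make the structure concrete, here is a minimal Python sketch of a high-level controller that assigns one sub task to each agent and is the only component updated from the game outcome. It is an illustration, not the researchers' implementation: the study used deep reinforcement learning, while this toy keeps a simple table of scores. The names `Agent`, `HighLevelController`, `assign`, and `update`, along with the `epsilon` and `lr` parameters, are hypothetical. Only the two sub tasks come from the article.

```python
import random
from dataclasses import dataclass

# The two sub tasks named in the article.
SUBTASKS = ("capture_flag", "tag_opponent")


@dataclass
class Agent:
    """Hypothetical low-level agent; its policy stays fixed in this sketch."""
    name: str
    subtask: str = "capture_flag"

    def act(self, observation):
        # Placeholder: a real agent would run a learned policy
        # conditioned on its current sub task and the observation.
        return f"{self.name}:{self.subtask}"


class HighLevelController:
    """Hypothetical controller keeping one score per (agent, sub task)."""

    def __init__(self, agents, epsilon=0.1, lr=0.5, seed=0):
        self.values = {(a.name, s): 0.0 for a in agents for s in SUBTASKS}
        self.epsilon = epsilon  # chance of trying a random sub task
        self.lr = lr            # step size for score updates
        self.rng = random.Random(seed)

    def assign(self, agents):
        """Give each agent the best-scoring sub task, exploring sometimes."""
        for agent in agents:
            if self.rng.random() < self.epsilon:
                agent.subtask = self.rng.choice(SUBTASKS)
            else:
                agent.subtask = max(
                    SUBTASKS, key=lambda s: self.values[(agent.name, s)]
                )

    def update(self, agents, reward):
        """Move the score of each agent's current sub task toward the reward."""
        for agent in agents:
            key = (agent.name, agent.subtask)
            self.values[key] += self.lr * (reward - self.values[key])


if __name__ == "__main__":
    team = [Agent("a1"), Agent("a2")]
    controller = HighLevelController(team, epsilon=0.0)
    controller.assign(team)
    controller.update(team, reward=-1.0)  # pretend the round was lost
    controller.assign(team)
    print([agent.act(None) for agent in team])
```

Adapting to a new situation in this sketch touches only the controller's table; the agents themselves are never retrained, which mirrors the update path Tran describes.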
“This approach has the potential to solve interesting and challenging problems, but there are a lot of issues that we still need to address before we can deploy these systems in real-world situations. For example, we learned that this framework can help with adaptation,” Tran said, “but we recognize that in this study we decided what the sub tasks should be based on our own intuition of how the game works. That is not ideal because it has our own biases. What we're doing now is looking at new techniques to allow agents to figure out what those sub goals should be on their own.”
The study, “Evaluating Adaptation Performance of Hierarchical Deep Reinforcement Learning,” was written by Neale Van Stralen, Seung Hyun Kim, Huy T. Tran, and CSL's Girish Chowdhary. The research was funded by the Defense Advanced Research Projects Agency and was presented at the 2020 IEEE International Conference on Robotics and Automation (ICRA). A short video illustrates the work and shows the hierarchical controller in action.