Robots designed for household tasks may struggle to adapt when moved from the factory where they were trained to a user’s kitchen, failing at simple chores like cleaning the sink or taking out the trash, largely because the deployment environment differs from the one they were trained in.
To mitigate these challenges, engineers often strive to match the simulated training environment as closely as possible to the real-world deployment setting. However, researchers from MIT and other institutions have found that training an artificial intelligence (AI) agent in an entirely different environment can sometimes yield better results.
Their findings suggest that training an AI agent in a less noisy, less chaotic environment can leave it performing better than an agent trained under the same unpredictable conditions used for evaluation. The researchers have termed this surprising dynamic the “indoor training effect.”
Serena Bono, a research assistant at the MIT Media Lab and lead author of a study on this phenomenon, likens the effect to tennis training: “If a player practices in a controlled, indoor setting devoid of distractions, they may master various strokes more efficiently. When they transition to a more challenging outdoor court where wind is a factor, their prior, clearer training can lead to better overall performance than if they had started on the windy court.”
To investigate this phenomenon, the researchers focused on training AI agents to play modified versions of Atari games infused with varying levels of unpredictability. They were pleasantly surprised to observe that the indoor training effect consistently emerged across different games and their variants.
The team hopes their discoveries pave the way for enhanced training methodologies for AI agents.
“This opens up new avenues for consideration. Rather than just replicating the conditions of real-world settings during training, we might construct simulations that allow AI agents to learn more effectively,” explains co-author Spandan Madan, a Harvard University graduate student.
Joining Bono and Madan in this research are Ishaan Grover (MIT), Mao Yasueda (Yale), Cynthia Breazeal (MIT Media Lab), Hanspeter Pfister (Harvard), and Gabriel Kreiman (Harvard Medical School). Their findings are slated for presentation at the Association for the Advancement of Artificial Intelligence Conference.
Challenges in Training
The researchers aimed to unravel the reasons behind the disappointing performance of reinforcement learning agents when they transition to environments that deviate from their training situations.
Reinforcement learning involves a trial-and-error process where agents explore their training environment to discern actions that maximize rewards.
To further their study, the team introduced controlled amounts of noise into an element of the reinforcement learning framework known as the transition function, which defines the probability that the agent moves from one state to another given the action it chooses. For instance, in Pac-Man, the transition function would define the probability that the ghosts move up, down, left, or right on the game board.
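The paper itself is not accompanied by code here, but the idea of perturbing a transition function is easy to sketch. Below is a minimal, hypothetical Python example (names such as add_transition_noise are illustrative, not from the study) that mixes a tabular transition distribution with uniform noise, one common way to inject this kind of stochasticity:

```python
import random

def add_transition_noise(transition, noise):
    """Mix a tabular transition function P(s' | s, a) with uniform noise.

    `transition` maps (state, action) -> {next_state: probability}.
    `noise` is the fraction of probability mass redistributed uniformly
    over the listed next states. (Illustrative only; the study's actual
    noise injection may differ.)
    """
    noisy = {}
    for (state, action), dist in transition.items():
        n = len(dist)
        noisy[(state, action)] = {
            s_next: (1 - noise) * p + noise / n for s_next, p in dist.items()
        }
    return noisy

def sample_next_state(dist, rng=random):
    """Draw a next state from a {state: probability} distribution."""
    states, probs = zip(*dist.items())
    return rng.choices(states, weights=probs, k=1)[0]

# Toy example: a "ghost" in one corridor cell usually moves right.
transition = {("cell_2", "noop"): {"cell_1": 0.1, "cell_3": 0.9}}
noisy_transition = add_transition_noise(transition, noise=0.3)
print(noisy_transition[("cell_2", "noop")])  # {'cell_1': 0.22, 'cell_3': 0.78}
```

With more noise, the ghost’s behavior becomes less tied to its original tendencies, which is the kind of unpredictability the researchers dialed up and down in their experiments.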
Traditionally, the AI’s training and testing stages utilize the same transition function. The researchers manipulated this function by introducing noise, leading to poorer performance by the agent in Pac-Man.
Nonetheless, when they trained the agent using a noise-free version of the game and later assessed its capabilities in a noise-augmented version, it outperformed an agent trained exclusively in the noise-infused environment.
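As a rough illustration of that train/test protocol (not the study’s actual setup, which used Atari games and deep reinforcement learning agents), here is a self-contained sketch: two tabular Q-learning agents, one trained with a noise-free transition function and one with a noisy one, are both evaluated in the noisy environment. The ChainEnv toy task and all hyperparameters are invented for illustration, and such a simple task will not necessarily reproduce the indoor training effect; it only shows the shape of the comparison:

```python
import random

class ChainEnv:
    """Tiny chain MDP: the agent must walk right to reach a goal cell.
    With probability `noise`, the chosen move is replaced by a random one.
    (A toy stand-in for the study's Atari environments.)"""

    def __init__(self, length=6, noise=0.0, seed=0):
        self.length, self.noise = length, noise
        self.rng = random.Random(seed)

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):  # 0 = left, 1 = right
        move = action if self.rng.random() > self.noise else self.rng.choice([0, 1])
        self.state = max(0, min(self.length - 1, self.state + (1 if move == 1 else -1)))
        done = self.state == self.length - 1
        return self.state, (1.0 if done else 0.0), done

def train_q(env, episodes=2000, alpha=0.2, gamma=0.95, eps=0.1):
    """Tabular Q-learning; returns a Q-table keyed by (state, action)."""
    q = {}
    for _ in range(episodes):
        s, done, steps = env.reset(), False, 0
        while not done and steps < 50:
            if random.random() < eps:
                a = random.choice((0, 1))
            else:
                a = max((0, 1), key=lambda act: q.get((s, act), 0.0))
            s2, r, done = env.step(a)
            target = r + gamma * max(q.get((s2, act), 0.0) for act in (0, 1))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))
            s, steps = s2, steps + 1
    return q

def evaluate(q, env, episodes=200):
    """Average return of the greedy policy derived from a Q-table."""
    total = 0.0
    for _ in range(episodes):
        s, done, steps = env.reset(), False, 0
        while not done and steps < 50:
            a = max((0, 1), key=lambda act: q.get((s, act), 0.0))
            s, r, done = env.step(a)
            total += r
            steps += 1
    return total / episodes

# Both agents are evaluated in the *noisy* environment,
# but only one of them was trained with that noise.
q_clean = train_q(ChainEnv(noise=0.0))
q_noisy = train_q(ChainEnv(noise=0.3))
print("trained noise-free:", evaluate(q_clean, ChainEnv(noise=0.3, seed=1)))
print("trained with noise:", evaluate(q_noisy, ChainEnv(noise=0.3, seed=1)))
```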
“Common wisdom asserts that accurately replicating the deployment conditions during training maximizes effectiveness. We rigorously challenged this notion because we found it hard to accept,” Madan remarks.
Injecting varying levels of noise into the transition function let the researchers test many scenarios, but it did not produce realistic games: the more noise they added, the more erratically the ghosts behaved, which is not how Pac-Man is actually played.
To examine the indoor training effect in more standard Pac-Man games, the researchers instead tweaked the underlying probabilities so that the ghosts behaved more predictably, moving up and down more often than left and right. Agents trained in noise-free conditions still consistently outperformed their counterparts in these more authentic gaming scenarios.
“Our results indicated that the effect wasn’t merely a byproduct of how we chose to add noise; it appears to be an intrinsic characteristic of the reinforcement learning problem itself, which was unexpected,” Bono states.
Investigating Exploration Patterns
As the team delved into their findings, they noticed correlations involving how the AI agents explore the training space.
In cases where both agents explored similar areas, the one trained in the noise-free environment performed better, possibly because it could learn the rules of the game without the interference of noise. Conversely, when their exploration patterns diverged significantly, the agent trained amid noise tended to excel, likely because it had to learn patterns it could not pick up in the noise-free setting.
“If I only learn to play tennis with my forehand in the noise-free environment, but then in the noisy one I also have to play with my backhand, I won’t play as well in the noise-free environment,” Bono notes.
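One simple, hypothetical way to quantify whether two agents “explored similar areas” is to log the set of states each agent visits during training and compare the two sets, for example with Jaccard similarity. The study may measure exploration overlap differently; this sketch is only an assumption for illustration:

```python
def exploration_overlap(visited_a, visited_b):
    """Jaccard similarity between two sets of visited states.

    1.0 means the agents explored exactly the same states; 0.0 means no
    overlap. (One plausible way to operationalize "similar exploration";
    not necessarily the measure used in the study.)
    """
    union = visited_a | visited_b
    return len(visited_a & visited_b) / len(union) if union else 1.0

# Hypothetical usage: states (here, grid cells) visited during training
# by the noise-free-trained agent and by the noise-trained agent.
visited_clean = {(1, 2), (1, 3), (2, 3), (3, 3)}
visited_noisy = {(1, 2), (2, 2), (2, 3), (4, 1)}
print(exploration_overlap(visited_clean, visited_noisy))  # 0.333...
```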
Looking ahead, the researchers aim to probe the indoor training effect’s relevance in more complicated reinforcement learning scenarios and its application in fields such as computer vision and natural language processing. They also seek to create training environments that leverage the indoor training effect to potentially enhance AI agent performance in unpredictable environments.
Source
www.sciencedaily.com