Figure AI Inc. recently showcased the capabilities of its Helix visual-language-action (VLA) model through a demonstration involving humanoid robots conducting a common household chore: putting away groceries. The robots were given a single prompt and, through visual assessment, they identified each grocery item and collaboratively organized them in the kitchen.
Two major observations emerged from the demonstration. Firstly, the robots operated independently but demonstrated a sense of teamwork when the situation required one robot to pass items to the other. This shows an ability to adapt tasks based on real-time needs.
Secondly, the robots did not communicate verbally, yet there were distinct moments of interaction that resembled a ‘telepathic’ understanding, as they seemed to pause and observe each other. Figure explained that the AI system orchestrates the overall task, dividing it into manageable subtasks while controlling each robot seamlessly.
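To make that orchestration concrete, here is a minimal sketch of how a single controller might decompose one prompt into per-robot subtasks. It is illustrative only, written with assumed names and a hard-coded plan; Figure has not published this interface.

```python
# Hypothetical sketch: one controller splits a single prompt into
# per-robot subtasks. Not Figure's published code.
from dataclasses import dataclass

@dataclass
class Subtask:
    robot_id: int
    instruction: str  # natural-language instruction for one robot

def plan(prompt: str) -> list[Subtask]:
    """Decompose a high-level prompt into ordered per-robot subtasks.
    A real system would plan with a learned model; this is hard-coded."""
    return [
        Subtask(0, "pick up the bag of cookies and hand it to robot 1"),
        Subtask(1, "take the bag of cookies and place it in the drawer"),
    ]

def dispatch(prompt: str) -> None:
    # Each robot runs the same policy, conditioned on its own instruction.
    for task in plan(prompt):
        print(f"robot {task.robot_id} <- {task.instruction!r}")

dispatch("Put away the groceries.")
```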
This demonstration marked a notable instance of humanoid robots working together collaboratively.
Figure demonstrated the capability of the Figure humanoid robots to work collaboratively in handling grocery items. | Credit: The Robot Report
To complete their assigned task, the robots also performed actions like closing drawers, shutting the refrigerator door, and placing items strategically on the counter. These actions, although intuitive for humans, were not included in the initial instructions, illustrating the robustness of the robots’ training.
In a separate blog post, the company elaborated on the architecture of the Helix system used in the demonstration. Central to Helix is its VLA model, an approach that various humanoid robot manufacturers are positioning as an essential technology.
Scaling curves for different approaches to acquiring new robot skills. In conventional heuristic manipulation, skills grow with Ph.D.s who manually script them. In conventional robot imitation learning, skills scale with data collected. With Helix, new skills can be specified on the fly with language. | Credit: Figure AI
At the 2023 RoboBusiness event in Santa Clara, Calif., The Robot Report observed an earlier demonstration of LLM-based guidance for robots. In a keynote, Pras Velagapudi, the chief technology officer at Agility Robotics, impressed attendees with a video of the Digit humanoid successfully decluttering a room upon receiving the command, “Clean up the room.”
The Figure Helix demonstration builds on this momentum with a more advanced implementation. Figure trained the VLA model on a dataset of approximately 500 hours of diverse teleoperated behaviors.
To establish natural-language conditioning for training, Figure used an auto-labeling system to generate hindsight instructions. The system processed segmented video from the robots’ onboard cameras and prompted a vision-language model with the question, “What instruction would you have given the robot to achieve the action demonstrated in this video?”
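As a rough illustration of that labeling loop, the sketch below pairs each teleoperated clip with a model-generated hindsight instruction. The StubVLM class and its describe method are hypothetical stand-ins, not Figure’s actual tooling.

```python
# Minimal sketch of hindsight auto-labeling; the VLM interface is a
# hypothetical stand-in, not Figure's actual pipeline.
PROMPT = ("What instruction would you have given the robot "
          "to achieve the action demonstrated in this video?")

class StubVLM:
    def describe(self, frames: list, prompt: str) -> str:
        # A real vision-language model would caption the clip;
        # this stub returns a fixed answer for demonstration.
        return "Pick up the apple and place it in the fridge."

def auto_label(segments: list, vlm: StubVLM) -> list:
    """Pair each onboard-camera video segment with a hindsight instruction."""
    dataset = []
    for frames in segments:
        instruction = vlm.describe(frames, PROMPT)
        dataset.append((frames, instruction))  # (behavior clip, language label)
    return dataset

pairs = auto_label(segments=[["frame_0", "frame_1"]], vlm=StubVLM())
print(pairs[0][1])
```

The resulting (video, instruction) pairs become the language-conditioned training data described above.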
Figure displays Helix functionality
The Helix VLA model signifies a notable step forward in robotics and artificial intelligence, with several key features distinguishing it from previous models:
1. Comprehensive upper-body control
Dexterity: The Helix model allows for extensive control over a humanoid’s upper body, including the torso, head, wrists, and individual fingers, achieving 35 degrees of freedom. This allows for a more refined and nuanced manipulation of various objects compared to earlier iterations.
Human-like motions: With the capability to maneuver the entire upper body, Helix can execute tasks in a more biologically inspired way, such as coordinating hand and head movements for better visual alignment and optimizing torso positioning for improved reach.
2. Collaborative multi-robot interaction
Cooperative actions: The demonstration highlighted two robots effectively working in unison, facilitating more complex tasks like organizing groceries or assembling items.
Zero-shot generalization: By engaging in tasks with unfamiliar objects, the robots showcased their ability to adapt and generalize their learning to new scenarios.
3. Versatile object handling
General object recognition: Helix allows humanoids to identify and interact with an array of household items without extensive specific training.
Natural language processing: The robots comprehended and acted on ordinary language commands, illustrating their capacity to respond to broad instructions without detailed prior guidance.
4. Integrated neural network
Unified model for tasks: Transitioning from separate models for different behaviors, Helix uses a single neural network architecture for all functionalities, streamlining its operation (see the sketch after this list).
Elimination of task-specific adjustments: With its adaptability, Helix can perform various tasks without tailored fine-tuning for each specific activity, enhancing user-friendliness across diverse environments.
5. Preparedness for commercial application
Onboard computing: The Helix system operates on embedded GPUs within the Figure 02 humanoid, ensuring low power consumption, which is essential for practical applications in homes and workplaces.
Minimized latency: The onboard processing capability allows the robot to react swiftly to commands, enhancing its real-time interaction capabilities.
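To make the “single network, conditioned on language” idea from the list above concrete, here is a minimal PyTorch sketch of a policy that maps camera features, an instruction embedding, and the robot’s joint state to 35 continuous action targets. This is not Helix’s published architecture; every module and dimension below is an illustrative assumption.

```python
# Illustrative language-conditioned policy: one set of weights serves
# every instruction. Dimensions and modules are assumptions, not
# Helix's actual architecture.
import torch
import torch.nn as nn

class LanguageConditionedPolicy(nn.Module):
    def __init__(self, vision_dim=512, text_dim=512, state_dim=35, action_dim=35):
        super().__init__()
        # Fuse camera features, instruction embedding, and proprioception.
        self.fuse = nn.Sequential(
            nn.Linear(vision_dim + text_dim + state_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, action_dim),  # one target per upper-body DoF
        )

    def forward(self, image_feat, text_feat, robot_state):
        x = torch.cat([image_feat, text_feat, robot_state], dim=-1)
        return self.fuse(x)  # continuous targets for 35 degrees of freedom

policy = LanguageConditionedPolicy()
action = policy(torch.randn(1, 512), torch.randn(1, 512), torch.randn(1, 35))
print(action.shape)  # torch.Size([1, 35])
```

Because the same weights serve every instruction, a new behavior arrives as a new language input rather than a per-task fine-tune, which is the property items 3 and 4 above describe.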
Advancements in production trials
In late 2024, Figure announced that its robots had progressed to commercial trials and that it had already supplied Figure 02 systems to a paying client.
Recognized for its innovation, Figure AI received a 2024 RBR50 award. The Sunnyvale, California-based company has made significant strides since its emergence from stealth in January 2023, rapidly developing and testing its humanoid robots in practical settings.
Recently, Figure announced plans to certify its robot’s battery and safety control systems to adhere to industrial safety standards while also indicating intentions to produce 100,000 humanoid robots in the next four years. Reportedly, the company is also in discussions to raise $1.5 billion in funding.
Explore humanoid technology at the Robotics Summit
Humanoid robots will take center stage at the upcoming Robotics Summit & Expo, scheduled for April 30 to May 1 in Boston, produced by WTWH Media, which also oversees The Robot Report. The opening keynote on Day 2 will be delivered by Aaron Saunders, CTO of Boston Dynamics, who will discuss the redesigned Atlas robot and the future of humanoid robotics.
The first day will feature a panel discussion examining current challenges and developments in humanoid technology, featuring insights from industry leaders including Velagapudi, Aaron Prather from ASTM International, and Al Makke from Schaeffler. The discussion will cover technical hurdles, safety standards, and real-world experience from early humanoid deployments.
Bringing together over 5,000 developers, the Robotics Summit & Expo will be an opportunity to explore the latest advancements in robotics technology, engineering practices, and market trends.
The event will showcase more than 200 exhibitors, feature over 70 speakers on various stages, and provide 10+ hours of dedicated networking opportunities, including a Women in Robotics Breakfast and a career fair. The event will also include the RBR50 Pavilion and Awards Dinner, celebrating the 2024 RBR50 Robotics Innovation Award recipients.
Source: www.therobotreport.com