2024-06-05
1X Technologies' AI breakthrough in voice-controlled humanoids
In the quiet suburbs of Sunnyvale, California, a revolution is unfolding—not with a bang, but with a simple voice command. At the headquarters of 1X Technologies, a humanoid robot responds to spoken instructions, seamlessly chaining together a series of complex tasks. This isn't just a tech demo; it's a glimpse into a future where our robotic helpers understand and obey us as naturally as a trusted assistant.
The breakthrough, unveiled today in a video, shows 1X Technologies' latest progress in applying artificial intelligence and teleoperation to train its robots. The key innovation? Controlling sequences of skills via voice, a step toward the long-dreamed-of era of helpful household humanoids.
"This update showcases progress we've made toward longer autonomous behaviors," explains Eric Jang, vice president of AI at 1X Technologies. "We've previously shown that our robots were able to pick up and manipulate simple objects, but to have useful home robots, you have to chain tasks together smoothly."
The challenge is more complex than it might seem. In a structured lab setting, a robot can be positioned perfectly to interact with objects. But in a real home? "The robot doesn't always position itself right next to a table," Jang points out. "So we need to be able to tell it to adjust its position and then manipulate the object."
This necessity has led to an intriguing discovery. As 1X builds out its repertoire of robot skills, it is uncovering a host of fundamental actions, such as getting closer or backing up, that humans can direct using natural language. It's a significant step toward making robots that can understand and respond to our everyday commands.
Traditionally, programming a robot for thousands of potential tasks has been a daunting challenge. 1X initially aimed to create a single, all-encompassing neural network, but reality forced a rethink. "Before, we thought of a single model for thousands of tasks, but it's hard to train for so many skills simultaneously," Jang admits.
Their solution? A modular approach. "We've added a few hundred individual capabilities. Our library of skills is mapped to simple language descriptions." This method allows for faster development and easier scaling. Meanwhile, they're using "shadow mode" evaluations to compare predictions against baselines, steadily progressing toward that ultimate goal: a unified model for all tasks.
The company has already made strides with generic navigation and manipulation policies. "We can give the robot a goal—'Please go to this part of the room'—and the same neural network can navigate to all parts of the room," Jang says. "Tidying up a room involves four primitives: going anywhere in the room, adjusting for position, picking something up, and putting it down."
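The four primitives Jang lists can be pictured as a small skill library keyed by plain-language descriptions. The sketch below is only an illustration of that idea and not 1X's software: `RobotState`, `SKILLS`, `run_sequence`, the distances, and the reach threshold are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

REACH_CM = 10  # hypothetical: farthest distance at which the gripper can reach


@dataclass
class RobotState:
    location: str = "dock"
    distance_cm: int = 0              # distance to the nearest work surface
    holding: Optional[str] = None
    log: List[str] = field(default_factory=list)


def go_to(state: RobotState, target: str) -> None:
    # Navigation brings the robot near the target, not right next to it.
    state.location = target
    state.distance_cm = 40
    state.log.append(f"go to {target}")


def adjust_position(state: RobotState, direction: str) -> None:
    # "closer" moves 20 cm toward the surface; anything else backs up 20 cm.
    step = -20 if direction == "closer" else 20
    state.distance_cm = max(0, state.distance_cm + step)
    state.log.append(f"adjust position {direction}")


def pick_up(state: RobotState, item: str) -> None:
    if state.distance_cm > REACH_CM:
        raise RuntimeError("too far away: adjust position first")
    if state.holding is not None:
        raise RuntimeError("already holding something")
    state.holding = item
    state.log.append(f"pick up {item}")


def put_down(state: RobotState, item: str) -> None:
    if state.distance_cm > REACH_CM:
        raise RuntimeError("too far away: adjust position first")
    if state.holding != item:
        raise RuntimeError(f"not holding {item}")
    state.holding = None
    state.log.append(f"put down {item}")


# The library: simple language descriptions mapped to skills.
SKILLS: Dict[str, Callable[[RobotState, str], None]] = {
    "go to": go_to,
    "adjust position": adjust_position,
    "pick up": pick_up,
    "put down": put_down,
}


def run_sequence(state: RobotState, commands: List[Tuple[str, str]]) -> RobotState:
    # Chain skills in the order they were spoken.
    for description, argument in commands:
        SKILLS[description](state, argument)
    return state


# One tidying task expressed as a chain of the four primitives.
TIDY_CUP = [
    ("go to", "table"),
    ("adjust position", "closer"),
    ("adjust position", "closer"),
    ("pick up", "cup"),
    ("go to", "shelf"),
    ("adjust position", "closer"),
    ("adjust position", "closer"),
    ("put down", "cup"),
]

if __name__ == "__main__":
    final = run_sequence(RobotState(), TIDY_CUP)
    print("\n".join(final.log))
```

In this toy version, skipping the "adjust position" steps makes `pick_up` fail, which mirrors Jang's point that a robot in a real home does not always arrive right next to the table.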
But the real magic happens when humans enter the loop. 1X has built a system that allows people to guide robots using natural language, particularly when mistakes occur. "We've built a way for humans to instruct the robots on tasks so that if they make a mistake, the human can dictate what the command should be," Jang explains.
This human-in-the-loop approach isn't just a failsafe; it's a learning tool. By treating natural-language commands as a new type of action, 1X translates low-level instructions into higher-level actions. This paves the way for robots that can work autonomously for extended periods, handling complex tasks like cleaning that involve interacting with various tools and appliances.
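One way to picture this loop: a high-level policy proposes the next language command, a human may overrule it, and each override is stored as a training example. The sketch below assumes exactly that and nothing more; `Correction`, `step`, and `naive_policy` are hypothetical names, and 1X has not published how its system records corrections.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Correction:
    observation: str   # what the robot saw
    proposed: str      # command the policy wanted to run
    executed: str      # command the human dictated instead


def step(
    observation: str,
    policy: Callable[[str], str],
    dataset: List[Correction],
    human_override: Optional[str] = None,
) -> str:
    """Choose the next language command, letting a human overrule the policy."""
    proposed = policy(observation)
    executed = human_override if human_override is not None else proposed
    if executed != proposed:
        # Only disagreements are logged; they are the informative examples.
        dataset.append(Correction(observation, proposed, executed))
    return executed


def naive_policy(observation: str) -> str:
    # Hypothetical stand-in for a learned model: always tries to grasp.
    return "pick up"


if __name__ == "__main__":
    corrections: List[Correction] = []
    print(step("cup within reach", naive_policy, corrections))
    print(step("cup out of reach", naive_policy, corrections, "get closer"))
    print(corrections)
```

Treating the spoken command as the action means the logged corrections have the same form as the policy's own output, so they could be used directly as supervised training pairs.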
Even more revolutionary is 1X's "farm-to-table" training method. The same people who gather data by teleoperating the robots are the ones who train them. "I'm super proud of the work they do," Jang beams. "We've closed the loop, and the teleoperators train everything themselves." This approach shows that even users without computer science backgrounds can train robots, removing a significant bottleneck to scaling.
The implications are profound. If operators can train low-level skills, they can also teach higher-level ones. "It's now very clear to us that we can transition away from predicting robot actions at low levels to building agents that can operate at longer horizons," Jang says. He even envisions a future where robots work with advanced vision-language models like Gemini Pro Vision or GPT-4o, enabling truly sophisticated behaviors.
Moreover, by allowing users to set high-level goals for multiple robots, 1X is opening the door to efficient fleet management. Imagine a home where several robots work in concert, each understanding its role in a larger task, all coordinated by your voice commands.
For those wondering when such humanoid robots will be ready for our homes, Jang's answer is surprising: "A lot of people think that general-purpose home or humanoid robots are far away, but they're probably a lot closer than one thinks."
This isn't mere optimism. 1X has been methodically preparing for domestic deployment. Over the past year, the company has pivoted from purely commercial uses to more diverse settings. Jang asserts that by designing its own actuators, 1X has made its NEO robot inherently safe around humans, a prerequisite for household use. The hardware's ability to compensate also gives the AI room for error, a crucial feature in unpredictable home environments.
Still, Jang acknowledges the road ahead. "The onus is on us to get away from making videos to making something that people can see in person without hiding actual performance details," he says. He also cautions against superficial assessments: "Not everything with a torso and four limbs is a humanoid... Not all robots are created equal."
The financial aspect is also critical. "There's a sweet spot between overspeccing costs and underspeccing costs," Jang explains. "Many of the top humanoid companies are making different choices, and there's a spectrum between millimeter-level precision on fingers and calibration with cameras to, on the other end, 3D-printed robots. It's a healthy competition."
As we stand on the brink of this AI-driven future, where robots understand our words and anticipate our needs, one thing becomes clear. The race to bring humanoid robots into our homes isn't just about technology—it's about understanding human behavior, language, and the subtleties of our daily lives. In that race, 1X Technologies, with its breakthrough in voice-controlled, task-chaining robots, has just taken a significant lead.