2024-04-03

Teaching robots to tidy up based on user preferences using large language models

Different people tend to have unique needs and preferences—particularly when it comes to cleaning or tidying up. Home robots, especially robots designed to help humans with house chores, should ideally be able to complete tasks in ways that account for these individual preferences.

Researchers at Princeton University and Stanford University recently set out to personalize the assistance offered by home robots using large language models (LLMs), a class of artificial intelligence models that have become increasingly popular since the release of ChatGPT. Their approach, presented in a paper pre-published on arXiv, was initially tested on TidyBot, a mobile robot engineered to tidy up indoor environments.

"For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios," Jimmy Wu, Rika Antonova and their colleagues wrote in their paper. "In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away."

The approach proposed by the researchers leverages the widely documented summarization capabilities of LLMs like ChatGPT. These models can summarize information or produce generalized guidelines after being prompted with only a handful of examples, with no additional training required.

As part of their study, Wu, Antonova and their colleagues used an LLM to create "summaries" of a user's tidying preferences, based on a few inputs offered by that user. For instance, a user might provide textual input such as "Red-colored clothes go in the drawer while white ones go in the closet," and the model will formulate generalized preferences that can then guide a robot's actions.
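To make the idea concrete, the snippet below is a minimal sketch of this few-shot summarization step. It is not the team's actual code: the prompt wording, the model name and the `summarize_preferences` helper are all illustrative assumptions, and the OpenAI client stands in for whichever LLM backend one prefers.

```python
# A minimal sketch of few-shot preference summarization, assuming an
# OpenAI-style chat API. Prompt wording, model name and helper function
# are illustrative assumptions, not the researchers' actual code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_preferences(examples: list[str]) -> str:
    """Compress a few concrete placements into short, general tidying rules."""
    prompt = (
        "A user put away objects as follows:\n"
        + "\n".join(f"- {e}" for e in examples)
        + "\nSummarize the user's tidying preferences as short general rules."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

rules = summarize_preferences([
    "red shirt -> drawer",
    "white shirt -> closet",
    "red scarf -> drawer",
])
print(rules)  # e.g. "Red clothes go in the drawer; white clothes go in the closet."
```

The key point is that the summary, not the raw examples, is what the robot carries forward into future interactions.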

"A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background," Wu, Antonova and their colleagues explained in their paper.

"For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions."

To evaluate their approach, the researchers ran a series of tests, assessing both the generalized preferences it produced when fed data from text-based datasets and how it affected a real robot's ability to tidy up in personalized ways. They specifically applied it to TidyBot, a mobile robot they developed that tidies rooms by picking up objects from the floor and putting each one away in an appropriate place.
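At run time, the summarized rules can then steer where each detected object goes. The sketch below illustrates one plausible tidy-up pass under that scheme; the perception and manipulation stubs, the receptacle list, the example rules and the `choose_receptacle` helper are all assumptions for illustration, not the TidyBot implementation.

```python
# A minimal sketch of one tidy-up pass driven by summarized rules. Perception
# and manipulation are stubbed out; receptacles, rules and helper names are
# illustrative assumptions, not the TidyBot implementation.
from dataclasses import dataclass
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RECEPTACLES = ["drawer", "closet", "recycling bin"]  # assumed known locations
RULES = ("Red clothes go in the drawer; white clothes go in the closet; "
         "cans and bottles go in the recycling bin.")

@dataclass
class DetectedObject:
    name: str
    distance_m: float

def detect_objects() -> list[DetectedObject]:
    """Stand-in for the robot's perception stack."""
    return [DetectedObject("red sock", 0.8), DetectedObject("soda can", 1.5)]

def choose_receptacle(obj_name: str) -> str:
    """Ask the LLM to map a possibly unseen object onto one known receptacle."""
    prompt = (
        f"User preferences: {RULES}\n"
        f"Receptacles: {', '.join(RECEPTACLES)}\n"
        f"Which receptacle should '{obj_name}' go in? Answer with one name."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    answer = resp.choices[0].message.content.strip().lower()
    # Fall back to the first receptacle if the reply doesn't match exactly.
    return next((r for r in RECEPTACLES if r in answer), RECEPTACLES[0])

# Handle the closest object first, as the real system reportedly does.
for obj in sorted(detect_objects(), key=lambda o: o.distance_m):
    target = choose_receptacle(obj.name)
    print(f"pick up '{obj.name}' and place it in the {target}")  # stand-in for motion
```

Because the LLM receives both the summarized rules and a closed list of receptacles, even objects that never appeared in the preference examples, such as the soda can here, can be routed to a sensible place.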

"This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset," Wu, Antonova and their colleagues wrote. "We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios."

The recent work by this team of researchers highlights the potential of LLMs not only as tools to assist users with written tasks or answer questions, but also to enhance the abilities of robotic systems. In the future, it could inspire other teams to start testing the potential of these models for robotics applications.

The researchers' proposed LLM-based approach and the TidyBot robot they developed could soon also contribute to the creation of increasingly advanced home robots that complete chores and tidy up environments in ways aligned with their users' preferences. Further studies could also refine the method and improve its performance, for instance by allowing it to operate in highly cluttered environments.

"Our implementation of the real-world system contains simplifications such as the use of hand-written manipulation primitives, use of top-down grasps, and assumption of known receptacle locations," the researchers wrote.

"These limitations could be addressed by incorporating more advanced primitives into our system and expanding the capabilities of the perception system. Additionally, since the mobile robots cannot drive over objects, the system would not work well in excessive clutter. It would be interesting to incorporate more advanced high-level planning, so that instead of always picking up the closest object, the robot could reason about whether it needs to first clear itself a path to move through the clutter."
