Fresh juice

2023-11-01

Boston Dynamics turns Spot into a tour guide with ChatGPT

Robotics pioneer Boston Dynamics has unveiled a new proof-of-concept application for its Spot quadruped robot: an interactive tour guide powered by emergent artificial intelligence (AI) capabilities. The project demonstrates how large foundation models like ChatGPT can be integrated with physical robots to enable new behaviors and interactivity.

Boston Dynamics spent this past summer experimenting with foundation models, the massive AI systems trained on huge datasets that can perform a variety of tasks. The team was particularly inspired by advances in natural language processing models and visual question answering systems.

To test these AI capabilities in the real world, the company developed a demo tour guide Spot robot. Equipped with a speaker and microphone, Spot uses an image captioning model called BLIP-2 to describe objects it sees through its cameras. It then leverages a conversational chatbot model to elaborate on these observations, answer questions, make small talk, and guide the tour group.
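The flow described above, where a caption from the camera feeds a conversational model, can be sketched as a small prompt-assembly step. This is a minimal illustration and not Boston Dynamics' code: the `Observation` class, the `build_prompt` function, and the prompt wording are all hypothetical, and the caption is assumed to arrive as plain text from a captioning model such as BLIP-2.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What the robot currently knows about its surroundings (hypothetical type)."""
    location: str  # assumed label for the current tour stop
    caption: str   # text produced by an image captioning model such as BLIP-2


def build_prompt(persona: str, obs: Observation, visitor_said: str) -> str:
    """Assemble a loose 'script' that a chat model could improvise from."""
    lines = [
        f"You are a robot tour guide. Persona: {persona}.",
        f"Current location: {obs.location}.",
        f"Camera sees: {obs.caption}.",
    ]
    # Only mention the visitor when something was actually heard.
    if visitor_said:
        lines.append(f'A visitor said: "{visitor_said}"')
    lines.append("Reply in one or two spoken sentences.")
    return "\n".join(lines)
```

The resulting string would then be sent to whichever chat model is in use; that call is left out here because the article does not describe it.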

According to Boston Dynamics, this showcases how large language models can act as "improv actors": given a loose script, they can fill in coherent dialog spontaneously, bringing nuance and entertainment. The goal was not factual accuracy but rather to create an engaging, interactive experience.

The project required adding audio hardware to Spot. The team 3D-printed a custom vibration-resistant mount to attach a speaker powered by Spot's onboard USB port. Microphone input is transcribed with OpenAI's Whisper speech recognition model, and the chatbot's text responses are converted to audio with text-to-speech software, enabling two-way conversation.
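One turn of that listen, think, speak cycle might be structured as below. This is a sketch under stated assumptions: `conversation_turn` is a hypothetical helper, and the three callables stand in for a speech recognizer (Whisper in the article), a chat model, and a text-to-speech engine, none of whose real APIs are shown here.

```python
from typing import Callable


def conversation_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],      # stand-in for speech recognition
    generate_reply: Callable[[str], str],    # stand-in for the chat model
    speak: Callable[[str], None],            # stand-in for text-to-speech playback
) -> str:
    """Run one turn: audio in, spoken reply out. Returns the reply text."""
    heard = transcribe(audio).strip()
    if not heard:
        # Nothing intelligible was captured, so stay silent this turn.
        return ""
    reply = generate_reply(heard)
    speak(reply)
    return reply
```

Passing the three stages in as functions keeps the loop independent of any particular vendor, so each stage can be swapped or stubbed out for testing.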

To make Spot's motions more lively and responsive during chat, the team utilized Spot's object tracking to turn its arm toward the current speaker. Simple gripper motions were also synchronized with speech to mimic a puppet mouth. Small touches like googly eyes further enhanced Spot's expressive range.
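The article does not say how the gripper was synchronized with speech. One plausible approach, shown purely as an assumption, is to drive the gripper opening from the loudness of the outgoing audio: louder frames open the "mouth" wider. The function name `gripper_openings` and its parameters are hypothetical, and sending the resulting values to the robot is not shown.

```python
import math
from typing import List, Sequence


def gripper_openings(
    samples: Sequence[float], frame_size: int = 4, max_open: float = 1.0
) -> List[float]:
    """Map each audio frame's loudness to a gripper open fraction in [0, max_open]."""
    # Split the signal into consecutive frames (the last one may be shorter).
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    # Root-mean-square amplitude is a simple loudness estimate per frame.
    rms = [math.sqrt(sum(s * s for s in f) / len(f)) for f in frames]
    peak = max(rms, default=0.0)
    if peak == 0.0:
        # Silence (or no audio): keep the gripper closed.
        return [0.0 for _ in rms]
    # Normalize so the loudest frame opens the gripper fully.
    return [max_open * r / peak for r in rms]
```

For example, `gripper_openings([0, 0, 0, 0, 1, 1, 1, 1])` returns `[0.0, 1.0]`: closed during the silent frame, fully open during the loud one.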

In testing, the team was impressed by the coherent and entertaining dialog the AI-powered Spot produced on the fly, even when given absurd personalities. However, it did sometimes make up facts, for example claiming that Boston Dynamics' logistics robot Stretch was designed for yoga.

Moving forward, Boston Dynamics plans to further explore combining physical robotics with large AI models. The company believes embodiment in the real world is key to grounding generative AI, while such models provide robots with flexible knowledge and reasoning. This demonstration highlights the exciting potential of AI and robotics to enable richer human-machine interaction.
