This robot can tidy a room without any help

A new system helps robots navigate homes they’ve never seen before with a little help from open-source AI models.

Robots are good at certain tasks. They’re great at picking up and moving objects, for example, and they’re even getting better at cooking.

But while robots may easily complete tasks like these in a laboratory, getting them to work in an unfamiliar environment where there’s little data available is a real challenge.

Now, a new system called OK-Robot could enable robots to pick up and move objects in settings they haven’t encountered before. The approach might help close the gap between rapidly improving AI models and actual robot capabilities, because it doesn’t require any additional costly, complex training.

To develop the system, researchers from New York University and Meta tested Stretch, a commercially available robot made by Hello Robot that consists of a wheeled unit, a tall pole, and a retractable arm, in a total of 10 rooms in five homes. 

While in a room with the robot, a researcher would scan their surroundings using Record3D, an iPhone app that uses the phone’s lidar system to take a 3D video to share with the robot. 

The OK-Robot system then ran an open-source AI object detection model over the video’s frames. This, in combination with other open-source models, helped the robot identify objects in that room like a toy dragon, a tube of toothpaste, and a pack of playing cards, as well as locations around the room including a chair, a table, and a trash can.

The team then instructed the robot to pick up a specific item and move it to a new location. The robot’s pincer arm did this successfully in 58.5% of cases; the success rate rose to 82% in rooms that were less cluttered. (Their research has not yet been peer reviewed.)

The recent AI boom has led to enormous leaps in language and computer vision capabilities, allowing robotics researchers access to open-source AI models and tools that didn’t exist even three years ago, says Matthias Minderer, a senior computer vision research scientist at Google DeepMind, who was not involved in the project.

“I would say it’s quite unusual to be completely reliant on off-the-shelf models, and that it’s quite impressive to make them work,” he says.

“We’ve seen a revolution in machine learning that has made it possible to create models that work not just in laboratories, but in the open world,” he adds. “Seeing that this actually works in a real physical environment is very useful information.”

Because the researchers’ system used models that weren’t fine-tuned to this particular project, when the robot couldn’t find the object it was instructed to look for, it simply stopped in its tracks instead of trying to work out a solution. That significant limitation is one reason the robot was more likely to succeed in tidier environments: fewer objects meant fewer chances for confusion, and a clearer space for navigation.
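For readers who want a feel for the flow described above, here is a minimal Python sketch of the idea: detections gathered from the scanned video frames are merged into a simple map of labeled positions, and a pick-and-drop request either produces a short plan or stops when a name is missing. It is an illustration under stated assumptions, not the researchers’ code. The names `Detection`, `build_object_map`, and `plan_pick_and_drop` are hypothetical, the detections are hand-written stand-ins for the output of an object detection model, and exact label matching stands in for whatever matching the real system uses.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Detection:
    """One detected object in one video frame (hypothetical structure)."""
    label: str
    position: tuple[float, float, float]  # assumed x, y, z in meters


def build_object_map(frames: list[list[Detection]]) -> dict[str, tuple[float, ...]]:
    """Average every sighting of a label into a single estimated position."""
    sightings: dict[str, list[tuple[float, float, float]]] = {}
    for frame in frames:
        for det in frame:
            sightings.setdefault(det.label, []).append(det.position)
    return {
        label: tuple(mean(axis) for axis in zip(*positions))
        for label, positions in sightings.items()
    }


def plan_pick_and_drop(
    object_map: dict[str, tuple[float, ...]], item: str, destination: str
) -> Optional[list[tuple]]:
    """Return a four-step plan, or None when either name is unknown.

    Returning None mirrors the stop-in-its-tracks behavior the article
    describes; there is no fallback search in this sketch.
    """
    if item not in object_map or destination not in object_map:
        return None
    return [
        ("navigate", object_map[item]),
        ("pick", item),
        ("navigate", object_map[destination]),
        ("drop", item),
    ]


# Hand-written example detections from two frames (illustrative values only).
frames = [
    [Detection("toy dragon", (1.0, 2.0, 0.0)), Detection("chair", (4.0, 0.0, 0.0))],
    [Detection("toy dragon", (3.0, 2.0, 0.0))],
]
object_map = build_object_map(frames)
plan = plan_pick_and_drop(object_map, "toy dragon", "chair")
missing = plan_pick_and_drop(object_map, "toothpaste", "chair")
```

In this toy version, a cluttered room would simply mean more labels and more chances for a wrong or missing match, which is one way to picture why tidier rooms were easier.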

Using ready-made open-source models was both a blessing and a curse, says Lerrel Pinto, an assistant professor of computer science at New York University, who co-led the project. 

“On the positive side, you don’t have to give the robot any additional training data in the environment, it just works,” he says. “On the con side, it can only pick an object up and drop it somewhere else. You can’t ask it to open a drawer, because it only knows how to do those two things.” 

Combining OK-Robot with voice recognition models could allow researchers to deliver instructions simply by speaking to the robot, making it easier for them to experiment with readily available datasets, says Mahi Shafiullah, a PhD student at New York University who co-led the research.

“There is a very pervasive feeling in the [robotics] community that homes are difficult, robots are difficult, and combining homes and robots is just completely impossible,” he says. “I think once people start believing home robots are possible, a lot more work will start happening in this space.”
