Robots learn to perform chores by watching YouTube

Learning has been a holy grail in robotics for decades. If these systems are going to thrive in unpredictable environments, they’ll need to do more than just respond to programming; they’ll need to adapt and learn. What’s become clear the more I read and speak with experts is that true robotic learning will require a combination of many solutions.

Video is an intriguing solution that’s been the centerpiece of a lot of recent work in the space. Roughly this time last year, we highlighted WHIRL (in-the-Wild Human Imitating Robot Learning), a CMU-developed algorithm designed to train robotic systems by watching a recording of a human executing a task.

This week, CMU Robotics Institute assistant professor Deepak Pathak is showcasing VRB (Vision-Robotics Bridge), an evolution of WHIRL. As with its predecessor, the system uses video of a human to demonstrate the task, but the update no longer requires the human to perform it in a setting identical to the one in which the robot will operate.

“We were able to take robots around campus and do all sorts of tasks,” PhD student Shikhar Bahl notes in a statement. “Robots can use this model to curiously explore the world around them. Instead of just flailing its arms, a robot can be more direct with how it interacts.”

The robot is watching for a few key pieces of information, including contact points and trajectory. The team uses opening a drawer as an example. The contact point is the handle and the trajectory is the direction in which it opens. “After watching several videos of humans opening drawers,” CMU notes, “the robot can determine how to open any drawer.”
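To make the two pieces of information concrete, here is a minimal sketch of how a contact point and a trajectory direction might be represented and combined across several videos. It is an illustration only, not VRB's actual method: the `Affordance` class, the `aggregate` function, the pixel coordinates, and the simple averaging are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class Affordance:
    # Hypothetical representation, not taken from the VRB codebase.
    contact: tuple[float, float]    # (x, y) image point where the hand touches the object
    direction: tuple[float, float]  # direction of motion after contact


def aggregate(observations: list[Affordance]) -> Affordance:
    """Average contact points and motion directions across several videos."""
    cx = mean(o.contact[0] for o in observations)
    cy = mean(o.contact[1] for o in observations)
    dx = mean(o.direction[0] for o in observations)
    dy = mean(o.direction[1] for o in observations)
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("observed directions cancel out; no consistent trajectory")
    # Normalize so the result is a unit vector.
    return Affordance((cx, cy), (dx / norm, dy / norm))
```

In the drawer example, each video would contribute one observation: the handle's location as the contact point and the pull direction as the trajectory. A real system would have to predict these from images of unfamiliar drawers rather than average hand-labeled values.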

Obviously not all drawers behave the same way. Humans have gotten pretty good at opening drawers, but that doesn’t mean the occasional weirdly built cabinet won’t give us some trouble. One of the key tricks to improving outcomes is making larger datasets for training. CMU is relying on videos from databases like Epic Kitchens and Ego4D, the latter of which has “nearly 4,000 hours of egocentric videos of daily activities from across the world.”

Bahl notes that there’s a massive archive of potential training data waiting to be watched. “We are using these datasets in a new and different way,” the researcher says. “This work could enable robots to learn from the vast amount of internet and YouTube videos available.”


Rinsu Ann Easo
