Giving robots a better feel for object manipulation


A new learning system developed by MIT researchers improves robots’ abilities to mold materials into target shapes and make predictions about interacting with solid objects and liquids. The system, known as a learning-based particle simulator, could give industrial robots a more refined touch – and it may have fun applications in personal robotics, such as modeling clay shapes or rolling sticky rice for sushi.

In robotic planning, physical simulators are models that capture how different materials respond to force. Robots are “trained” using the models to predict the outcomes of their interactions with objects, such as pushing a solid box or poking deformable clay. But traditional learning-based simulators mainly focus on rigid objects and are unable to handle fluids or softer objects. Some more accurate physics-based simulators can handle diverse materials, but they rely heavily on approximation techniques that introduce errors when robots interact with objects in the real world.

In a paper being presented at the International Conference on Learning Representations in May, the researchers describe a new model that learns to capture how small portions of different materials – “particles” – interact when they’re poked and prodded. The model directly learns from data in cases where the underlying physics of the movements are uncertain or unknown. Robots can then use the model as a guide to predict how liquids, as well as rigid and deformable materials, will react to the force of their touch. As the robot handles the objects, the model also helps to further refine the robot’s control.

In experiments, a robotic hand with two fingers, called “RiceGrip,” accurately shaped a deformable foam into a desired configuration – such as a “T” shape – that serves as a proxy for sushi rice. In short, the researchers’ model serves as a type of “intuitive physics” brain that robots can leverage to reconstruct three-dimensional objects somewhat similarly to how humans do.

“Humans have an intuitive physics model in our heads, where we can imagine how an object will behave if we push or squeeze it. Based on this intuitive model, humans can accomplish amazing manipulation tasks that are far beyond the reach of current robots,” says first author Yunzhu Li, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “We want to build this type of intuitive model for robots to enable them to do what humans can do.”

“When children are 5 months old, they already have different expectations for solids and liquids,” adds co-author Jiajun Wu, a CSAIL graduate student. “That’s something we know at an early age, so maybe that’s something we should try to model for robots.”

Joining Li and Wu on the paper are: Russ Tedrake, a CSAIL researcher and a professor in the Department of Electrical Engineering and Computer Science (EECS); Joshua Tenenbaum, a professor in the Department of Brain and Cognitive Sciences and a member of CSAIL and the Center for Brains, Minds, and Machines (CBMM); and Antonio Torralba, a professor in EECS and director of the MIT-IBM Watson AI Lab.

A new “particle simulator” developed by MIT improves robots’ abilities to mold materials into simulated target shapes and interact with solid objects and liquids. This could give robots a refined touch for industrial applications or for personal robotics. | Credit: MIT

Dynamic graphs

A key innovation behind the model, called “dynamic particle interaction networks” (DPI-Nets), was creating dynamic interaction graphs, which consist of thousands of nodes and edges that can capture complex behaviors of so-called particles. In the graphs, each node represents a particle. Neighboring nodes are connected with each other using directed edges, which represent the interaction passing from one particle to the other. In the simulator, particles are hundreds of small spheres combined to make up some liquid or a deformable object.
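The graph-building idea can be illustrated with a minimal sketch: treat each particle as a node and add a pair of directed edges between any two particles that sit within some interaction radius. The function name and radius rule here are illustrative assumptions, not taken from the paper.

```python
import itertools
import math

def build_interaction_graph(positions, radius):
    """Connect every pair of particles closer than `radius` with two
    directed edges (one each way), mimicking the dynamic interaction
    graphs described above. Illustrative sketch, not the paper's code."""
    edges = []
    for i, j in itertools.combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) < radius:
            edges.append((i, j))  # interaction passing from particle i to j
            edges.append((j, i))  # and from particle j back to i
    return edges

# Four particles in a line, spaced 1.0 apart; radius 1.5 links only neighbors.
particles = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(build_interaction_graph(particles, radius=1.5))
```

Because the graph is rebuilt as particles move, edges appear and disappear over time, which is what makes the graphs “dynamic.”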

The graphs are constructed as the basis for a machine-learning system called a graph neural network. In training, the model over time learns how particles in different materials react and reshape. It does so by implicitly calculating various properties for each particle — such as its mass and elasticity — to predict if and where the particle will move in the graph when perturbed.

The model then leverages a “propagation” technique, which instantaneously spreads a signal throughout the graph. The researchers customized the technique for each type of material – rigid, deformable, and liquid – to shoot a signal that predicts particles’ positions at certain incremental time steps. At each step, it moves and reconnects particles, if needed.
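A toy stand-in for this propagation: starting from a perturbed particle, a signal spreads one hop along the directed edges per round, so more rounds reach particles farther from the poke. This is a simplification of the paper’s learned message passing; the function and its arguments are hypothetical.

```python
def propagate(edges, source, steps):
    """Spread a signal from particle `source` along directed edges for
    `steps` rounds -- a toy version of multi-step propagation, in which
    each round carries the perturbation one hop farther through the graph."""
    reached = {source}
    for _ in range(steps):
        reached |= {j for (i, j) in edges if i in reached}
    return sorted(reached)

chain = [(0, 1), (1, 2), (2, 3)]  # a chain of four connected particles
print(propagate(chain, source=0, steps=2))  # two hops reach particles 0, 1, 2
```

In the real model the “signal” is a learned feature vector rather than a set membership, but the hop-by-hop structure is the same.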

For example, if a solid box is pushed, perturbed particles will be moved forward. Because all particles inside the box are rigidly connected, every other particle in the object undergoes the same calculated translation and rotation. Particle connections remain intact and the box moves as a single unit. But if an area of deformable foam is indented, the effect is different: perturbed particles move forward a lot, surrounding particles move forward only slightly, and particles farther away won’t move at all. With liquids sloshing around in a cup, particles may jump completely from one end of the graph to the other. The graph must learn to predict where and how much all affected particles move, which is computationally complex.
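The contrast between rigid and deformable responses can be sketched with toy 1-D displacement rules: a rigid body translates every particle by the same amount, while in a deformable material the effect decays with distance from the poke. The falloff rule below is an assumption for illustration, not the learned dynamics.

```python
def displace(positions, push_index, push, material):
    """Toy 1-D displacement rules for two material classes.
    "rigid": every particle moves together as one unit.
    "deformable": the displacement decays with distance from the poke.
    Illustrative assumptions only, not the paper's learned model."""
    out = []
    for p in positions:
        if material == "rigid":
            factor = 1.0                      # whole body shifts uniformly
        else:  # "deformable"
            d = abs(positions[push_index] - p)
            factor = max(0.0, 1.0 - 0.5 * d)  # effect fades with distance
        out.append(p + factor * push)
    return out

row = [0.0, 1.0, 2.0, 3.0]
print(displace(row, 0, 1.0, "rigid"))       # the box moves as a single unit
print(displace(row, 0, 1.0, "deformable"))  # the indentation fades farther away
```

Liquids are harder still, since particles can detach and move independently, which is why the model has to learn rather than hard-code these rules.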

Shaping and adapting

In their paper, the researchers demonstrate the model by tasking the two-fingered RiceGrip robot with clamping target shapes out of deformable foam. The robot first uses a depth-sensing camera and object-recognition techniques to identify the foam. The researchers randomly select particles inside the perceived shape to initialize the position of the particles. Then, the model adds edges between particles and reconstructs the foam into a dynamic graph customized for deformable materials.

Because of the learned simulations, the robot already has a good idea of how each touch, given a certain amount of force, will affect each of the particles in the graph. As the robot starts indenting the foam, it iteratively matches the real-world position of the particles to the targeted position of the particles. Whenever the particles don’t align, it sends an error signal to the model. That signal tweaks the model to better match the real-world physics of the material.
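The feedback loop described above can be sketched as a simple iterative correction: each press nudges the material toward the target profile, and the remaining mismatch serves as the error signal driving the next step. The function, gain value, and depth profile below are hypothetical; the paper’s planner uses the learned simulator rather than this direct rule.

```python
def shape_control(current, target, gain=0.5, iters=20):
    """Feedback-loop sketch: repeatedly press the material toward `target`.
    The per-particle mismatch is the error signal for the next correction.
    A hypothetical illustration, not the paper's control method."""
    for _ in range(iters):
        errors = [t - c for c, t in zip(current, target)]   # error signal
        current = [c + gain * e for c, e in zip(current, errors)]
    return current

foam = [0.0, 0.0, 0.0, 0.0]        # flat foam to start
t_shape = [1.0, 0.5, 0.5, 1.0]     # hypothetical target depth profile
result = shape_control(foam, t_shape)
print(max(abs(r - t) for r, t in zip(result, t_shape)) < 1e-3)
```

With a gain below 1, each iteration shrinks the error geometrically, so the shape converges to the target after a modest number of presses.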

Next, the researchers aim to improve the model to help robots better predict interactions with partially observable scenarios, such as knowing how a pile of boxes will move when pushed, even if only the boxes at the surface are visible and most of the other boxes are hidden.

The researchers are also exploring ways to combine the model with an end-to-end perception module by operating directly on images. This will be a joint project with the group of Dan Yamins, who recently completed his postdoc at MIT and is now an assistant professor at Stanford University. “You’re dealing with these cases all the time where there’s only partial information,” Wu says. “We’re extending our model to learn the dynamics of all particles, while only seeing a small portion.”

Editor’s Note: This article was republished with permission from MIT News.

Snake-inspired robot uses kirigami for swifter slithering

Bad news for ophiophobes: Researchers at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) have developed a new and improved snake-inspired soft robot that is faster and more precise than its predecessor.

The robot is made using kirigami — a Japanese paper craft that relies on cuts to change the properties of a material. As the robot stretches, the kirigami surface “pops up” into a 3-D-textured surface, which grips the ground just like snake skin.

The first-generation robot used a flat kirigami sheet, which transformed uniformly when stretched. The new robot has a programmable shell, so the kirigami cuts can pop up as desired, improving the robot’s speed and accuracy.

The research was published in the Proceedings of the National Academy of Sciences.

“This is a first example of a kirigami structure with non-uniform pop-up deformations,” said Ahmad Rafsanjani, a postdoctoral fellow at SEAS and first author of the paper. “In flat kirigami, the pop-up is continuous, meaning everything pops at once. But in the kirigami shell, pop-up is discontinuous. This kind of control of the shape transformation could be used to design responsive surfaces and smart skins with on-demand changes in their texture and morphology.”

The new research combined two properties of the material — the size of the cuts and the curvature of the sheet. By controlling these features, the researchers were able to program a dynamic propagation of pop-ups from one end to the other, or to control localized pop-ups.

This programmable kirigami metamaterial enables responsive surfaces and smart skins. Source: Harvard SEAS

In previous research, a flat kirigami sheet was wrapped around an elastomer actuator. In this research, the kirigami surface is rolled into a cylinder, with an actuator applying force at the two ends. If the cuts are a consistent size, the deformation propagates from one end of the cylinder to the other. However, if the sizes of the cuts are chosen carefully, the skin can be programmed to deform in a desired sequence.
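The sequencing idea can be sketched with a toy threshold model: assume a cell with a larger cut buckles at a lower strain, so sweeping the strain upward pops the cells in a programmable order. The inverse size-to-threshold relation is an assumption for illustration, not the mechanics worked out in the paper.

```python
def pop_sequence(cut_sizes, strains):
    """Toy rule: cell i pops once the applied strain exceeds a threshold
    assumed inversely proportional to its cut size. Returns the order in
    which cells pop as the strain is swept upward. Illustrative only."""
    thresholds = [1.0 / s for s in cut_sizes]
    popped = []
    for strain in strains:
        for i, t in enumerate(thresholds):
            if strain >= t and i not in popped:
                popped.append(i)
    return popped

# Three cells with different cut sizes; strain increases step by step.
print(pop_sequence([2.0, 4.0, 1.0], strains=[0.3, 0.6, 1.2]))  # -> [1, 0, 2]
```

Choosing the cut sizes thus chooses the pop-up order, which is what lets the skin deform in a desired sequence rather than all at once.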

“By borrowing ideas from phase-transforming materials and applying them to kirigami-inspired architected materials, we demonstrated that both popped and unpopped phases can coexist at the same time on the cylinder,” said Katia Bertoldi, the William and Ami Kuan Danoff Professor of Applied Mechanics at SEAS and senior author of the paper. “By simply combining cuts and curvature, we can program remarkably different behavior.”

Next, the researchers aim to develop an inverse design model for more complex deformations.

“The idea is, if you know how you’d like the skin to transform, you can just cut, roll, and go,” said Lishuai Jin, a graduate student at SEAS and co-author of the article.

This research was supported in part by the National Science Foundation. It was co-authored by Bolei Deng.

Editor’s note: This article was republished from the Harvard John A. Paulson School of Engineering and Applied Sciences.

Understand.ai accelerates image annotation for self-driving cars

Using processed images, algorithms learn to recognize the real environment for autonomous driving. Source: understand.ai

Autonomous cars must perceive their environment accurately to move safely. The corresponding algorithms are trained using a large number of image and video recordings. Individual image elements, such as a tree, a pedestrian, or a road sign, must be labeled for the algorithm to recognize them. Understand.ai is working to improve and accelerate this labeling.

Understand.ai was founded in 2017 by computer scientist Philip Kessler, who studied at the Karlsruhe Institute of Technology (KIT), and Marc Mengler.

“An algorithm learns by examples, and the more examples exist, the better it learns,” stated Kessler. For this reason, the automotive industry needs a lot of video and image data to train machine learning for autonomous driving. So far, most of the objects in these images have been labeled manually by human staffers.

“Big companies, such as Tesla, employ thousands of workers in Nigeria or India for this purpose,” Kessler explained. “The process is troublesome and time-consuming.”

Accelerating training at understand.ai

“We at understand.ai use artificial intelligence to make labeling up to 10 times quicker and more precise,” he added. Although image processing is highly automated, final quality control is done by humans. Kessler noted that the “combination of technology and human care is particularly important for safety-critical activities, such as autonomous driving.”

The labels, also called annotations, in the image and video files have to match the real environment with pixel-level accuracy. The better the quality of the processed image data, the better the algorithm that is trained on it.
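One common way to score how well a machine-generated label agrees with a human-corrected one is intersection-over-union (IoU) of the two bounding boxes. IoU is a standard metric in object detection; whether understand.ai uses exactly this check is an assumption, and the boxes below are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes, each given as
    (x1, y1, x2, y2). Returns 1.0 for identical boxes, 0.0 for disjoint ones."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

auto_label = (10, 10, 50, 50)   # hypothetical machine-proposed box
human_label = (12, 10, 50, 52)  # hypothetical human correction
print(round(iou(auto_label, human_label), 3))
```

A quality-control pass could flag any automatic label whose IoU against the human review falls below a chosen threshold for re-annotation.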

“As training images cannot be supplied for all situations, such as accidents, we now also offer simulations based on real data,” Kessler said.

Although understand.ai focuses on autonomous driving, it also plans to process image data for training algorithms to detect tumors or to evaluate aerial photos in the future. Leading car manufacturers and suppliers in Germany and the U.S. are among the startup’s clients.

The startup’s main office is in Karlsruhe, Germany, and some of its more than 50 employees work at offices in Berlin and San Francisco. Last year, understand.ai received $2.8 million (U.S.) in funding from a group of private investors.

Building interest in startups and partnerships

In 2012, Kessler started to study informatics at KIT, where he became interested in AI and autonomous driving while developing an autonomous model car in the KITCar student group. Kessler said his one-year tenure at Mercedes Research in Silicon Valley, where he focused on machine learning and data analysis, was “highly motivating” for establishing his own business.

“Nowhere else can you learn more in such a short period of time than in a startup,” said Kessler, who is 26 years old. “Recently, the interest of big companies in cooperating with startups has increased considerably.”

He said he thinks that Germany sleepwalked through the first wave of AI, in which it was used mainly in entertainment devices and consumer products.

“In the second wave, in which artificial intelligence is applied in industry and technology, Germany will be able to use its potential,” Kessler claimed.