Researchers back Tesla’s non-LiDAR approach to self-driving cars

If you haven’t heard, Tesla CEO Elon Musk is not a LiDAR fan. Most companies working on autonomous vehicles – including Ford, GM Cruise, Uber and Waymo – think LiDAR is an essential part of the sensor suite. But not Tesla. Its vehicles don’t have LiDAR and instead rely on radar, GPS, maps, cameras and other sensors.

“LiDAR is a fool’s errand,” Musk said at Tesla’s recent Autonomy Day. “Anyone relying on LiDAR is doomed. Doomed! [They are] expensive sensors that are unnecessary. It’s like having a whole bunch of expensive appendices. Like, one appendix is bad, well now you have a whole bunch of them, it’s ridiculous, you’ll see.”

“LiDAR is lame,” Musk added. “They’re gonna dump LiDAR, mark my words. That’s my prediction.”

While not as anti-LiDAR as Musk, it appears researchers at Cornell University agree with his LiDAR-less approach. Using two inexpensive cameras on either side of a vehicle’s windshield, Cornell researchers have discovered they can detect objects with nearly LiDAR’s accuracy and at a fraction of the cost.

The researchers found that analyzing the captured images from a bird’s-eye view, rather than the more traditional frontal view, more than tripled their accuracy, making stereo cameras a viable, low-cost alternative to LiDAR.

Tesla’s Senior Director of AI, Andrej Karpathy, outlined a nearly identical strategy during Autonomy Day.

“The common belief is that you couldn’t make self-driving cars without LiDARs,” said Kilian Weinberger, associate professor of computer science at Cornell and senior author of the paper Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving. “We’ve shown, at least in principle, that it’s possible.”

LiDAR uses lasers to create a 3D point map of its surroundings, measuring objects’ distances by timing reflected light. Stereo cameras rely on two perspectives to establish depth, but critics say their object-detection accuracy is too low. The Cornell researchers, however, found that the data they captured from stereo cameras was nearly as precise as LiDAR’s. The gap in accuracy emerged when the stereo cameras’ data was analyzed, they say.
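To make the stereo geometry concrete: once the horizontal offset (disparity) of a feature between the left and right images is known, depth follows from the pinhole relation depth = focal length × baseline / disparity. The sketch below is a minimal NumPy illustration of that relation; the function name and the 721-pixel focal length and 0.54 m baseline in the example are illustrative values (roughly the KITTI benchmark rig), not figures taken from the article.

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Convert a stereo disparity map to per-pixel depth.

    Standard pinhole stereo geometry: depth = f * B / d, where f is the
    focal length in pixels, B the baseline (distance between the two
    cameras) in metres, and d the disparity in pixels. Pixels with zero
    (invalid) disparity are mapped to infinity.
    """
    disparity = np.asarray(disparity, dtype=np.float32)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Toy example: a 2x3 disparity map, with a 721-pixel focal length and a
# 0.54 m baseline (roughly the KITTI stereo rig).
disp = np.array([[30.0, 15.0, 0.0],
                 [60.0, 10.0, 5.0]])
print(disparity_to_depth(disp, focal_length_px=721.0, baseline_m=0.54))
```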

“When you have camera images, it’s so, so, so tempting to look at the frontal view, because that’s what the camera sees,” Weinberger says. “But there also lies the problem, because if you see objects from the front then the way they’re processed actually deforms them, and you blur objects into the background and deform their shapes.”

Cornell researchers compare the AVOD object detector using LiDAR, pseudo-LiDAR, and frontal-view (stereo) inputs. Ground-truth boxes are in red, predicted boxes in green; the observer in the pseudo-LiDAR plots (bottom row) is on the far left, looking to the right. The frontal-view approach (right) misjudges the depths of even nearby objects and misses far-away objects entirely.

For most self-driving cars, the data captured by cameras or sensors is analyzed using convolutional neural networks (CNNs). The Cornell researchers say CNNs are very good at identifying objects in standard color photographs, but they can distort the 3D information if it’s represented from the front. Again, when Cornell researchers switched the representation from a frontal perspective to a bird’s-eye view, the accuracy more than tripled.
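To see what that change of representation involves, the pseudo-LiDAR idea is to back-project every pixel of the estimated depth map into a 3D point in the camera frame and then view that point cloud from above rather than from the front. The following sketch, assuming a pinhole camera with known intrinsics (fx, fy, cx, cy), shows both steps in simplified form; the function names, grid extents, and 0.1 m cell size are illustrative choices, not the authors’ implementation.

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project a per-pixel depth map into a 3D point cloud.

    Each pixel (u, v) with depth z becomes the camera-frame point
    (x, y, z) = ((u - cx) * z / fx, (v - cy) * z / fy, z) -- the same
    kind of point cloud a LiDAR would produce, hence "pseudo-LiDAR".
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[np.isfinite(points[:, 2])]  # drop invalid depths

def birds_eye_grid(points, x_range=(-40.0, 40.0), z_range=(0.0, 80.0), cell=0.1):
    """Flatten the point cloud into a top-down occupancy grid.

    x is lateral offset and z is distance ahead of the camera; height (y)
    is simply dropped, which is what turns the frontal representation
    into a bird's-eye view.
    """
    n_rows = int(round((z_range[1] - z_range[0]) / cell))
    n_cols = int(round((x_range[1] - x_range[0]) / cell))
    rows = np.floor((points[:, 2] - z_range[0]) / cell).astype(int)
    cols = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    grid = np.zeros((n_rows, n_cols), dtype=np.uint8)
    keep = (rows >= 0) & (rows < n_rows) & (cols >= 0) & (cols < n_cols)
    grid[rows[keep], cols[keep]] = 1
    return grid
```

A grid like this can then be fed to a standard detector in place of a frontal image, which is the kind of representation swap the researchers credit for the jump in accuracy.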

“There is a tendency in current practice to feed the data as-is to complex machine learning algorithms under the assumption that these algorithms can always extract the relevant information,” said co-author Bharath Hariharan, assistant professor of computer science. “Our results suggest that this is not necessarily true, and that we should give some thought to how the data is represented.”

“The self-driving car industry has been reluctant to move away from LiDAR, even with the high costs, given its excellent range accuracy – which is essential for safety around the car,” said Mark Campbell, the John A. Mellowes ’60 Professor and S.C. Thomas Sze Director of the Sibley School of Mechanical and Aerospace Engineering and a co-author of the paper. “The dramatic improvement of range detection and accuracy, with the bird’s-eye representation of camera data, has the potential to revolutionize the industry.”

