To understand and navigate its environment, the perception module of an unmanned system needs to gather a large amount of information about the surroundings, including the position, speed, and likely behavior of obstacles, the drivable area, traffic rules, and so on. Unmanned vehicles usually obtain this information by fusing data from multiple sensors such as lidar, cameras, and millimeter-wave radar. This article takes a brief look at how lidar and cameras are applied in autonomous vehicle perception.
LiDAR (Light Detection and Ranging) is a device that uses laser light for detection and ranging. It can emit millions of light pulses per second, and its rotating internal structure allows it to build a 3D map of the surrounding environment in real time.
Generally speaking, a lidar rotates and scans the surrounding environment at about 10 Hz. Each scan produces a three-dimensional map composed of dense points, where every point carries (x, y, z) coordinates. This map is called a point cloud. The figure below shows a point cloud captured with a Velodyne VLP-32C lidar:
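As a minimal sketch of what a scan looks like in code, a point cloud can be held as an N×3 array of Cartesian coordinates, from which the range to each return follows directly (the points below are made-up values for illustration):

```python
import numpy as np

# A toy "scan": each row is one lidar return with (x, y, z) in metres.
points = np.array([
    [ 5.0,  0.0, -1.6],   # road surface ahead of the sensor
    [ 0.0, 12.0, -1.5],   # road surface off to the side
    [-3.0, -4.0,  0.4],   # a return above the sensor plane
])

# Range (Euclidean distance) from the sensor origin to each return.
ranges = np.linalg.norm(points, axis=1)
```

A real VLP-32c scan contains tens of thousands of such rows per rotation, but the representation is the same.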
Thanks to its reliability, lidar is still the most important sensor in unmanned systems. In practice, however, lidar is not perfect: the point cloud is often too sparse, and some returns may be lost entirely; certain object surfaces are difficult to discern from lidar data; and lidar cannot be used in conditions such as heavy rain.
To make sense of point cloud information, we generally apply two steps to the point cloud data: segmentation and classification. Segmentation clusters the discrete points of the point cloud into several coherent objects, and classification determines which category each object belongs to (such as pedestrian, vehicle, or obstacle). Segmentation algorithms fall into the following categories:
・ Edge-based methods, such as gradient filtering, etc.;
・ Region-based methods, which cluster neighboring points using regional features. These methods select a number of seed points from the point cloud and then grow clusters outward from each seed, grouping adjacent points according to specified criteria such as Euclidean distance or surface normals;
・ Parametric methods, which use pre-defined models to fit point clouds. Common methods include Random Sample Consensus (RANSAC) and Hough Transform (HT);
・ Attribute-based methods, which first compute an attribute for each point and then cluster points according to those attributes;
・ Machine-learning-based methods.
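As an illustration of the parametric category, the sketch below fits a ground plane to a synthetic cloud with RANSAC, separating ground points from an elevated obstacle cluster. The scene, the iteration count, and the inlier threshold are all illustrative assumptions, not values from the article:

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.15, seed=None):
    """Fit a plane with RANSAC; return a boolean inlier mask."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Sample 3 distinct points and form a candidate plane through them.
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate (near-collinear) sample
            continue
        normal /= norm
        # Perpendicular distance of every point to the candidate plane.
        dist = np.abs((points - p0) @ normal)
        mask = dist < threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

# Synthetic scene: a flat ground plane plus an elevated "obstacle" cluster.
rng = np.random.default_rng(0)
ground = np.column_stack([rng.uniform(-10, 10, 500),
                          rng.uniform(-10, 10, 500),
                          rng.normal(0.0, 0.02, 500)])
obstacle = rng.normal([2.0, 3.0, 1.0], 0.2, size=(50, 3))
cloud = np.vstack([ground, obstacle])

inliers = ransac_plane(cloud, seed=0)   # True where the point lies on the plane
```

Removing the plane inliers leaves the obstacle points behind, which can then be clustered and passed on to classification.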
After segmenting the point cloud into targets, the segmented targets need to be correctly classified. This step typically uses machine-learning classifiers such as the Support Vector Machine (SVM), applied to features extracted from each cluster. In recent years, with the development of deep learning, the industry has begun to use specially designed Convolutional Neural Networks (CNNs) to classify three-dimensional point cloud clusters directly.
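Before the SVM step, each cluster must be summarized as a feature vector. The features below (bounding-box extent and point count) are illustrative choices only; real pipelines feed richer descriptors to the classifier:

```python
import numpy as np

def cluster_features(cluster):
    """Compute simple geometric features for one segmented cluster.

    `cluster` is an (N, 3) array of points. Bounding-box extent and
    point count are stand-in features for illustration; production
    systems would add e.g. intensity statistics or shape moments.
    """
    extent = cluster.max(axis=0) - cluster.min(axis=0)   # (dx, dy, dz)
    return np.array([extent[0], extent[1], extent[2], float(len(cluster))])

# A tall, narrow cluster -- pedestrian-like proportions.
pedestrian = np.array([[0.0, 0.00, 0.0],
                       [0.2, 0.10, 1.7],
                       [0.1, 0.05, 0.9]])
features = cluster_features(pedestrian)
```

Vectors like this, one per cluster, would then be the training and inference inputs for an SVM (or any other classifier).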
However, whether with the feature-extraction-plus-SVM approach or the raw-point-cloud-plus-CNN approach, the low resolution of the lidar point cloud means that classification based on point clouds alone is unreliable for targets with few reflected points (such as pedestrians). In practice, therefore, we often fuse lidar and camera sensors: the camera's high resolution is used to classify targets, while the lidar's reliability is used to detect and range obstacles, combining the strengths of both to complete environmental perception.
In driverless systems, we usually use image vision to detect the road and the targets on it. Road detection includes lane detection and drivable area detection; on-road target detection covers vehicle detection, pedestrian detection, traffic sign detection, and the detection and classification of all other traffic participants.
Lane line detection involves two tasks: first, identifying the lane lines, including computing the curvature of curved lines; and second, determining the vehicle's offset relative to the lane lines (that is, where the vehicle sits within the lane). One approach extracts lane features, including edge features (usually gradients, e.g. via the Sobel operator) and the color of the lane lines, then fits a polynomial to the pixels believed to belong to a lane line. From the polynomial and the known mounting position of the camera on the vehicle, the curvature of the lane ahead and the vehicle's deviation from the lane can then be computed.
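The polynomial-fitting step above can be sketched as follows. The pixel coordinates are synthetic stand-ins for the output of edge/color thresholding, and the second-order model `x = a*y² + b*y + c` is the common choice for lane fitting in image space:

```python
import numpy as np

# Hypothetical lane-line pixels in image coordinates (y grows down the
# image, x across it); in a real pipeline these come from thresholding.
y = np.array([700, 650, 600, 550, 500, 450], dtype=float)
x = 0.0005 * (y - 700) ** 2 + 0.3 * (y - 700) + 640   # gently curving line

# Fit the second-order lane model x = a*y^2 + b*y + c.
a, b, c = np.polyfit(y, x, 2)

# Radius of curvature at the bottom of the image (nearest the vehicle),
# from the standard formula R = (1 + (2a*y + b)^2)^(3/2) / |2a|.
y_eval = 700.0
radius = (1 + (2 * a * y_eval + b) ** 2) ** 1.5 / abs(2 * a)
```

In practice the radius would still need converting from pixels to metres, and the vehicle's lane offset follows from comparing `c` (the fitted line's position) with the camera's known mounting point.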
One current approach to detecting the drivable area is to segment the scene directly with a deep neural network, that is, to train a pixel-by-pixel classification network that separates the drivable area from the rest of the image.
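The final step of such pixel-wise segmentation reduces to an argmax over per-class scores at every pixel. The tiny score map below is a fabricated stand-in for a real network's output:

```python
import numpy as np

# Hypothetical per-pixel class scores from a segmentation network:
# shape (H, W, num_classes); class 0 = "not drivable", class 1 = "drivable".
scores = np.zeros((4, 4, 2))
scores[2:, :, 1] = 5.0   # the network is confident the lower half is road
scores[:2, :, 0] = 5.0   # ...and that the upper half is not

# Per-pixel argmax yields the drivable-area mask.
drivable_mask = scores.argmax(axis=-1) == 1
```

The resulting boolean mask marks the image region the planner may treat as drivable.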
The detection and classification of traffic participants currently relies mainly on deep learning models, which fall into two broad categories:
・ Region-proposal-based deep learning detection algorithms, represented by R-CNN (R-CNN, SPP-Net, Fast R-CNN, Faster R-CNN, etc.);
・Regression-based deep learning target detection algorithms represented by YOLO (YOLO, SSD, etc.).
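Both families of detectors emit overlapping candidate boxes that are pruned with non-maximum suppression (NMS). A minimal NumPy sketch, with made-up boxes and scores for illustration:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection-over-union of box i with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou < iou_thresh]      # drop boxes overlapping box i
    return keep

# Two near-duplicate detections of one vehicle, plus a distant pedestrian.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

Here the second box overlaps the first too strongly (IoU ≈ 0.68) and is suppressed, leaving one box per object.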