Pacifica with Dataspeed DBW

As the internal procurement process nears completion, my Ph.D. student, Aws, and I paid a visit to Dataspeed. After productive discussions regarding the purchase (remember that the original quote is more than 1.5 years old), Dr. Micho showed us a Pacifica equipped with the company’s Drive-by-Wire systems.

We are even more excited now after the cool demonstrations, which showed several advanced new features. Thank you, Dataspeed, for taking the time and effort to host us.

OmO: One-minute Only, all you need.

OmO is an incremental end-to-end learning method for lateral control in autonomous driving.

End-to-end driving. (a) A human driver drives a vehicle as we collect driving data. (b) The driving data, including front camera images with synchronized control signals, are saved in storage. The collected data must contain all the features that can be expected in the testing phase of the neural network. (c) At the training station, a neural network is trained with the collected data to associate input with output. (d) The trained neural network is deployed to the AI chauffeur, who drives the vehicle using inferred steering angles, throttle, and brakes.
High-level system overview. (a) A human driver drives a vehicle on a simulated track. (b) The front camera images and driving control data are collected. The collected data can be minimal, just enough to train a neural network to drive a short distance that includes a few straight and curved roads. (c) Neural network training system. The camera input data is fed to the neural network training module, which learns to infer the steering angle associated with each input. (d) The trained neural network is deployed to the AI chauffeur. The AI chauffeur drives the vehicle as data is being collected. The initial driving performance cannot be expected to be good, but the AI chauffeur will be able to drive a little farther than in the previous step. After the data flow II from the human driver is disconnected, the loop (c) $\to$ (d) $\to$ (b’) $\to$ (c) continues, and in every iteration the old neural network in the AI chauffeur is replaced with a slightly smarter one. This incremental training and driving will eventually make the AI chauffeur’s neural network better and better at driving.

Abstract

Developing an autonomous driving system requires high-quality data. However, due to the high cost of human labor, collecting driving data is often too expensive. Starting with minimal initial data from a human driver, this research proposes a novel strategy that uses an incremental approach to data collection and neural network training. The proposed method, One Minute Only (OmO), is an end-to-end behavior cloning methodology that uses a convolutional neural network to develop a lateral controller for a vehicle. OmO begins by collecting a minimal amount of a human driver’s driving data, which includes steering angles, throttles, velocities, and geographical locations. The human driving data is used to train a convolutional neural network. The trained neural network is then deployed to the vehicle’s driving controller, the brain of an Artificial Intelligence (AI) chauffeur. A human driver is no longer in the loop at this stage, and the inexperienced AI chauffeur drives the vehicle on a simulated track to collect further data. The collected driving data is fed into a convolutional neural network training module, which helps the AI chauffeur develop a new and, hopefully, stronger neural controller. The two steps of data collection and neural network training with the gathered data alternate until the neural network learns to properly correlate an image input with a steering angle. Extensive experiments have validated the proposed method. We anticipate that this incremental driving data collection and neural network controller training will considerably reduce human effort and time and accelerate the development of autonomous car technology. The experimental results, as well as data and other material, are available online.

Introduction

Data collection requires enormous human effort. In this paper, we propose a novel strategy in which a human driver drives only as much as is necessary, and an Artificial Intelligence (AI) chauffeur powered by a deep convolutional neural network drives on behalf of the human driver when more driving data is required. Figure 1 shows the high-level system overview. To begin, a human driver drives a virtual car and captures only the bare minimum of data needed to train a neural network to drive a short distance by replicating the human driver’s behavior. From this point on, the human driver is no longer in the loop. The trained neural network is then used by a driving controller, which we call an AI chauffeur, to drive a simulated vehicle. We record driving data for a little longer as the AI chauffeur drives the vehicle. After a new neural network is trained with the newly obtained data, it replaces the AI chauffeur’s previous network in the hope of improving the driving performance next time. This process is repeated until we have a neural network that predicts steering angles reliably enough to control the lateral motion of the simulated vehicle.
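The alternation between data collection and training can be sketched as a small loop. This is only an illustration under assumptions: the function name omo_loop and the callables train, drive_and_record, and drives_whole_track are hypothetical placeholders, not part of OSCAR, and the sketch assumes that each round trains only on the most recently collected data.

```python
from typing import Any, Callable, Sequence, Tuple


def omo_loop(
    human_data: Sequence[Any],
    train: Callable[[Sequence[Any]], Any],
    drive_and_record: Callable[[Any], Sequence[Any]],
    drives_whole_track: Callable[[Any], bool],
    max_rounds: int = 10,
) -> Tuple[Any, int]:
    """Alternate training and AI-driven data collection (hypothetical sketch).

    Returns the final model and the number of training rounds performed.
    """
    # Round 1: train on the minimal human driving data.
    model = train(human_data)
    rounds = 1
    # The human driver is out of the loop from here on.
    while not drives_whole_track(model) and rounds < max_rounds:
        # The current AI chauffeur drives while new data is recorded.
        data = drive_and_record(model)
        # A new network is trained and replaces the old one.
        model = train(data)
        rounds += 1
    return model, rounds


# Toy stand-ins: the "model" is just the number of samples it was trained on,
# and each drive yields one more sample than the previous round.
toy_model, toy_rounds = omo_loop(
    human_data=[0],
    train=lambda data: len(data),
    drive_and_record=lambda model: list(range(model + 1)),
    drives_whole_track=lambda model: model >= 5,
)
```

The max_rounds guard is a design choice of this sketch so that the loop terminates even if the stopping condition is never met.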

Method

To implement and test our approach, we have been developing an open-source platform, OSCAR (Open-Source robotic Car Architecture for Research and education). https://github.com/jrkwon/oscar

The hierarchical and modular structure of OSCAR. All ROS-related modules are located under the catkin_ws folder. The system settings for the BIMINet CNN, the AI chauffeur, and data collection are configured through the files under the config folder. The CNN named BIMINet is implemented with tools for training, testing, data visualization, neural network performance testing, and data analysis.

Vehicle Design

The chassis of the car is based on the Ford Fusion model. Three cameras are mounted on the front windshield. In this paper, we used only the front camera. Ouster’s 64-channel 3D LiDAR [11] is mounted at the top of the windshield. We did not employ the LiDAR sensor for this paper. The simulated car is built as a plugin within the fusion ROS package.

Tracks Design

We designed two different tracks in the Gazebo world format (Track A and Track B). Track A was used to collect data for neural network training. The original track design came from the Dataspeed ADAS kit Gazebo/ROS simulator, which includes a track for a lane-keeping demonstration. The track is built from modular road segment models that include straight segments of 50, 100, and 200 meters and curved segments with radii of 50 and 100 meters. We embellished the track with roadside objects such as gas stations, residences, and major architectural complexes. This variation along the roadsides is important to make the lane-keeping task more practical and realistic. In this work, all data collection for training was conducted on Track A.

Track A
Track B

Data Collection

A simulated vehicle in the OSCAR platform publishes its current velocity and position on a ROS topic named /base_pose_ground_truth. The current steering angle and throttle position of the vehicle are published on a ROS topic named /fusion. The front camera is used to collect data in this paper. The topic name of the camera image is /fusion/front_camera/image_raw. The image message must be converted before it can be saved as an image file. This can be done with cv_bridge.
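As a rough illustration of the image conversion step, the sketch below subscribes to the front camera topic and saves each frame with cv_bridge and OpenCV. It is not the OSCAR data collection code: the node name, the output folder, and the file naming scheme in image_filename are assumptions made for this example, and the "bgr8" encoding is assumed to suit the simulated camera.

```python
import os


def image_filename(secs: int, nsecs: int) -> str:
    """Build a sortable file name from a ROS time stamp (hypothetical scheme)."""
    return f"{secs}_{nsecs:09d}.jpg"


def main(out_dir: str = "collected_images") -> None:
    # ROS and OpenCV imports are kept inside main() so that the helper above
    # can be used and tested on a machine without a ROS installation.
    import cv2
    import rospy
    from cv_bridge import CvBridge
    from sensor_msgs.msg import Image

    os.makedirs(out_dir, exist_ok=True)
    bridge = CvBridge()

    def on_image(msg):
        # Convert the sensor_msgs/Image message to an OpenCV BGR array.
        frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        name = image_filename(msg.header.stamp.secs, msg.header.stamp.nsecs)
        cv2.imwrite(os.path.join(out_dir, name), frame)

    rospy.init_node("image_saver")  # hypothetical node name
    rospy.Subscriber(
        "/fusion/front_camera/image_raw", Image, on_image, queue_size=1
    )
    rospy.spin()  # process callbacks until the node is shut down
```

Calling main() from a sourced ROS workspace while the simulator is running would start the node. Naming files by time stamp is one simple way to match each image later with the control signals recorded at the same time.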

The high-level view of the data collection system.

Training

The image below is the neural network architecture that we used.

The training process can be described as follows.

Experimental Results

We collected one minute of driving data from a human driver and used the data to train a neural network. The first trained neural network was deployed to an AI chauffeur, who is like an inexperienced driver. We had the AI chauffeur drive the vehicle to collect the second round of driving data. The data collected from the AI chauffeur, who is not yet good at driving, was used for the next round of training. We continued this data collection and training cycle until the trained neural network successfully drove the track.

(a) – (i) are one minute, one minute 42 seconds, and two through eight minutes, respectively. In all nine cases, training completed successfully. We added an early stopping callback with a criterion of three consecutive epochs without improvement. Note that the y-axis scale differs from case to case. For example, (i) looks bad compared to (a), but the scale in (a) is around ten times the saturated range in (i).
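The early stopping criterion can be illustrated without any deep learning framework. The sketch below assumes that "improvement" means the validation loss drops below the best value seen so far; the function name stopping_epoch and the loss values are made up for illustration. In Keras, a comparable setup would typically use the EarlyStopping callback with patience=3, although the exact configuration used here is an assumption.

```python
from typing import Sequence


def stopping_epoch(val_losses: Sequence[float], patience: int = 3) -> int:
    """Return the 1-based epoch at which training would stop.

    Training stops after `patience` consecutive epochs in which the loss
    does not improve on the best value so far. If that never happens,
    the last epoch is returned.
    """
    best = float("inf")
    waited = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)


# Illustrative (made-up) validation losses: the best value is reached at
# epoch 2, followed by three epochs without improvement.
example_stop = stopping_epoch([0.9, 0.5, 0.6, 0.55, 0.7, 0.4])
```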

All data collected are available at this link. (TODO: add the Google Drive link of the datasets)

Driven distances during data collection on Track B under Condition 2. (a) – (i) are one minute, one minute 42 seconds, two through eight minutes respectively. Note that the scales of (a)-(f) are different from (g)-(i). The bottom left point (0, 0) is the starting location and the initial orientation of the vehicle is to the right side.
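A driven distance can be computed from recorded vehicle positions by summing the straight-line distances between consecutive samples. The sketch below assumes the positions are available as (x, y) pairs in meters; the function name driven_distance and the sample path are hypothetical and only illustrate the calculation.

```python
import math
from typing import Sequence, Tuple


def driven_distance(positions: Sequence[Tuple[float, float]]) -> float:
    """Sum the Euclidean distances between consecutive (x, y) positions."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total


# Made-up path starting at (0, 0): one 5 m leg followed by one 6 m leg.
example_distance = driven_distance([(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
```

With densely sampled positions, this piecewise-linear sum closely approximates the length of the curved path actually driven.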

Data Visualization

Data Visualization: Ground truth vs. Prediction
Autonomous Driving Track A
Autonomous Driving Track B

Discussion

Activation Maps

Activation maps. (a) – (i) are one minute, one minute 42 seconds, and two through eight minutes, respectively. In each panel, the left side is the camera image input and the right side is the activation map overlaid on the region of interest. Blue indicates low and red indicates high activation values. The activations are taken from the last convolutional layer in the network. (a) shows low confidence overall but some activations on the edges of the road. Interestingly, with up to five minutes of data, (d)-(f), the neural networks learned that the shapes of the roads are more important than the road itself. This can be a reasonable choice for a neural network that minimizes the MSE of its predictions, given the lack of variety on both sides of the road. As more data was fed to the training, (g)-(i), the neural networks had to change strategy to minimize the MSE, because different objects on the left and/or right side of the road are not supposed to affect the steering angle predictions.

Conclusion

We present OmO, a novel method that uses an incremental approach to data collection and neural network training, starting from minimal data from a human driver. Extensive testing on two separate tracks and under two different driving conditions validated the proposed method. We also developed OSCAR, an open-source platform for autonomous driving. We anticipate that this incremental driving data collection and neural network controller training, together with the open-source platform OSCAR, will save significant human effort and time and thereby expedite the development of autonomous car technology. As future work, we plan to automate the entire process except for the initial human driving data collection.

Working during Pandemic

My office is closed, and the Robotics Lab cannot be used for research. So, I have been working in a temporary, wide-open space in the IAVS. It is actually not bad to have a large space to myself most of the time.

OSCAR Demo

OSCAR is the Open-Source Car Architecture for Research and education. This platform supports two vehicle models that are ready to be used in behavior cloning studies.

Here are some demo videos.


The first one shows the fusion driving by itself after a deep neural network was trained on a different track.


The second one is an example of a ‘fusion’ controlled remotely.


The third one shows the rover driven by a driving simulator that consists of a wheel with pedals and a gear shift.


The fourth video is a test of mavros/setpoint_velocity/cmd_vel.