Leveraging Synthetic Data for Hands-on-Wheel Detection

This is part 1 of a blog series on our recent presentation at CVPR 2022 on leveraging synthetic data for hands-on-wheel detection.

Currently, most vehicles on the road are driven by humans. Drivers are prone to distraction, which causes 15% of injury-causing accidents in the US. In the next few years, European regulations will require car manufacturers to gradually introduce new safety technologies, such as Driver Monitoring Systems (DMS), into vehicles [1]. In addition, Euro NCAP has started requiring driver monitoring features to qualify for a 5-star safety rating, raising the urgency of developing such systems.

In our paper, Hands-Up: Leveraging Synthetic Data for Hands-on-Wheel Detection, we develop a driver monitoring system that detects when a driver's hands leave the wheel. Such a system could be useful for multiple DMS tasks, such as raising an alert when the driver is distracted. As with many deep learning projects, the main challenge is data. Collecting large numbers of images of drivers in a vehicle is difficult, and while some datasets do exist, they are limited in the number of drivers, vehicle types, behaviors, and camera models they cover. In addition, tagging tens of hours of video in a consistent way is challenging.

Synthetic data provides an alternative: once the generation pipeline is developed, it can produce significant variance with little manual effort. We developed a synthetic data platform that renders highly realistic scenes of drivers in cars. The platform allows varying the camera position and type, the scene lighting, and the driver's behavior (e.g., falling asleep, looking around, drinking, texting). It includes pixel-perfect ground truth and 3D annotations, so no manual tagging is required.
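To make the idea of varying scene parameters concrete, here is a minimal sketch of domain randomization over the dimensions mentioned above. The platform's actual API is not shown in this post, so all names and parameter values here (camera positions, lighting conditions, behavior labels) are illustrative assumptions:

```python
import random
from dataclasses import dataclass

# Hypothetical scene specification -- the real platform API is not shown
# in the post; this only illustrates randomizing the listed dimensions.
@dataclass
class SceneConfig:
    camera_position: str
    lighting: str
    behavior: str
    seed: int

CAMERA_POSITIONS = ["rearview_mirror", "a_pillar", "dashboard"]
LIGHTING = ["day", "dusk", "night_ir"]
BEHAVIORS = ["hands_on_wheel", "texting", "drinking", "looking_around"]

def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one randomized scene specification to send to the renderer."""
    return SceneConfig(
        camera_position=rng.choice(CAMERA_POSITIONS),
        lighting=rng.choice(LIGHTING),
        behavior=rng.choice(BEHAVIORS),
        seed=rng.randrange(2**31),
    )

rng = random.Random(0)
dataset_spec = [sample_scene(rng) for _ in range(1000)]
```

Because every scene is rendered from a known specification, the labels (hands on or off the wheel) come for free, which is what removes the manual tagging step.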

Read the Benchmark Report Hands-Up: Leveraging Synthetic Data for Hands-on-Wheel Detection

Over the past few years, there has been major progress in simulation-based synthetic data generation. These methods use high-end graphics engines and physics-based ray tracing to represent the world in 3D and create highly realistic images. Datagen specializes in generating high-quality 3D humans, realistic 3D environments, and realistic human motion. This technology has been developed into a data generation platform, which we used for these experiments.

This work demonstrates the use of photo-realistic synthetic in-cabin data to train a Driver Monitoring System (DMS) that uses a lightweight neural network to detect whether the driver's hands are on the wheel. We show that when only a small amount of real data is available, synthetic data can be a simple way to boost performance. Moreover, we adopt a data-centric approach and show how performing error analysis and generating the missing edge cases on our platform boosts performance further. This showcases the ability of human-centric synthetic data to generalize to the real world and to help train computer vision algorithms when data from the target domain is scarce or hard to collect.
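The paper's exact architecture is not described in this post, so the following is only a hedged sketch of what a "lightweight" binary classifier for this task might look like, assuming grayscale 64x64 in-cabin crops; the input resolution, channel counts, and layer choices are all assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical lightweight network: a few strided convolutions and a
# single-logit head ("hands on wheel" vs. "hands off wheel"). This is
# NOT the paper's architecture, just an illustration of the scale.
class HandsOnWheelNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.AdaptiveAvgPool2d(1),                               # 8 -> 1
        )
        self.head = nn.Linear(32, 1)  # one logit per frame

    def forward(self, x):
        z = self.features(x).flatten(1)
        return self.head(z)

model = HandsOnWheelNet()
logits = model(torch.randn(4, 1, 64, 64))  # batch of 4 dummy frames
probs = torch.sigmoid(logits)              # per-frame probability
```

A network of this size can run in real time on in-cabin hardware, which is why a lightweight design matters for a DMS.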

Our two main contributions in this work are:

  1. We demonstrate that combining synthetic data with a very small amount of real examples boosts performance relative to using the same amount of real data alone.
  2. We show, in practice, a complete data-centric iteration, using our platform to generate a specific edge case that was missing from the training dataset.
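The two contributions above can be sketched as dataset bookkeeping. All sizes and labels below are illustrative assumptions, not the paper's actual numbers:

```python
import random

# Hypothetical dataset sizes -- purely illustrative.
real = [("real", i) for i in range(100)]             # small real set
synthetic = [("synthetic", i) for i in range(5000)]  # bulk synthetic set

# Contribution 1: train on synthetic data plus the small real set,
# rather than on the small real set alone.
train_v1 = synthetic + real

# Contribution 2 (data-centric iteration): error analysis on the trained
# model reveals a missing edge case, so targeted synthetic examples are
# generated for it and the model is retrained.
edge_case = [("synthetic_edge_case", i) for i in range(500)]
train_v2 = train_v1 + edge_case

random.Random(0).shuffle(train_v2)
```

The key point is that step 2 closes the loop: the generation platform turns an error-analysis finding directly into new training data, with no new collection or labeling effort.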

In our next blog, we will review the methods we used for this experiment.
