Acquiring high-quality face datasets to train your computer vision system is a huge headache. Gathering and cleaning this data is time-consuming, expensive, and riddled with privacy issues. For face datasets, the challenge is compounded by the need to account for countless combinations of facial structures, poses, hairstyles, and expressions, not to mention a wide range of lighting conditions (including low-light or nighttime environments) and camera angles.

Synthetic data is a new, efficient, and controllable way to generate complete face datasets for training computer vision systems. Datagen provides high-variance, photorealistic simulated data at scale to bring AI models to production faster. We focus on human-centered applications that require data on humans, environments, and human-object interactions. Our data-centric technology delivers visual data with broad domain coverage and fully controllable scene variance.

  • 50K+ unique faces generated from high-res scans
  • Diverse pool of identities with different ethnicities, ages, genders, and BMIs
  • Full control of facial expressions and head poses
  • 5,000+ hairstyles and facial hair options
  • Control over extrinsic and intrinsic camera parameters
  • Wide range of lighting and background options

Create a high-variance dataset of faces to solve tasks such as face detection, face recognition, head pose estimation, gaze detection, landmark detection, and more.

The Platform offers control over:

  • Identity – age, gender, ethnicity
  • Scene – camera location, camera intrinsic matrix, lighting, and background
  • Eye gaze direction
  • Facial expression – including mouth and eye openness
  • Accessories – glasses, sunglasses, masks, jewelry, headsets
  • Facial hair
  • Head location and rotation
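
To make the "camera intrinsic matrix" control above more concrete, here is a minimal sketch of the standard pinhole camera model: it builds a 3x3 intrinsic matrix and projects a 3D point (in camera coordinates) to pixel coordinates. This is generic textbook geometry, not Datagen's API; the function names `intrinsic_matrix` and `project` and the example focal length and principal point are assumptions chosen for illustration.

```python
def intrinsic_matrix(fx, fy, cx, cy, skew=0.0):
    """Build a 3x3 pinhole intrinsic matrix K (hypothetical helper).

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    return [
        [fx, skew, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


def project(K, point):
    """Project a 3D point (x, y, z) in camera coordinates to pixel (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = (K[0][0] * x + K[0][1] * y) / z + K[0][2]
    v = (K[1][1] * y) / z + K[1][2]
    return (u, v)


# Illustrative values only: a 1024x1024 image with a 1000 px focal length.
K = intrinsic_matrix(fx=1000.0, fy=1000.0, cx=512.0, cy=512.0)
pixel = project(K, (0.25, -0.125, 0.5))
```

Changing the focal length or principal point in `K` changes where the same 3D head point lands in the image, which is the kind of variation the scene controls are meant to cover.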

Visual Modalities

  • RGB (png)
  • Infrared (png)
  • Depth (exr)
  • Normal (exr)
  • Semantic Segmentation (exr)

Textual Modalities

  • 3D keypoints (JSON)
  • 3D dense keypoints (JSON)
  • 2D keypoints (JSON)
  • 2D dense keypoints (JSON)
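
As a rough sketch of how the JSON keypoint output might be consumed, the snippet below parses a small annotation and derives a face bounding box from its 2D keypoints. The schema is an assumption: the `keypoints_2d` field name, the `[x, y]` pixel layout, and the sample values are hypothetical and are not taken from Datagen's actual file format.

```python
import json

# Hypothetical annotation; the real JSON schema may differ.
SAMPLE = '{"keypoints_2d": [[410.0, 300.5], [612.0, 298.0], [512.0, 455.25]]}'


def bounding_box(keypoints_2d, margin=0.0):
    """Return (x_min, y_min, x_max, y_max) enclosing all 2D keypoints."""
    if not keypoints_2d:
        raise ValueError("at least one keypoint is required")
    xs = [p[0] for p in keypoints_2d]
    ys = [p[1] for p in keypoints_2d]
    return (
        min(xs) - margin,
        min(ys) - margin,
        max(xs) + margin,
        max(ys) + margin,
    )


annotation = json.loads(SAMPLE)
box = bounding_box(annotation["keypoints_2d"], margin=10.0)
```

A box derived this way can serve as a face detection label, so one set of keypoint annotations can support more than one of the tasks listed earlier.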

