The Next AI Blog

Stay informed with the latest updates on synthetic data.


Fully Simulated Training Data: How We’re Using Synthetic Expo Markers to Train a Network

In this post, we are excited to present how we trained a network to identify dry-erase markers exclusively with synthetic training data and achieved results comparable to those obtained with manually gathered data. GPU capacity and neural network size have increased over time, while the ability of teams to obtain the necessary quantities of training data has...
Read More

10 Promising Synthesis Papers from CVPR 2021

The 2021 Conference on Computer Vision and Pattern Recognition (CVPR) has just commenced. With topics running the gamut from autonomous driving to medical imaging, CVPR 2021 features an exciting lineup of state-of-the-art technology that shows tremendous potential for practical applications. Navigating the labyrinth of highly information-dense papers is not exactly a walk in the park....
Read More

Eye Datasets

The eyes are the window to the soul. Or so the saying goes. But they’re also a window to a wide range of applications in computer vision development. From eye tracking to gaze estimation, eye-focused datasets are powering computer vision applications across many verticals. Here are a few examples: Virtual and Mixed...
Read More

Data Privacy: Navigating Data Collection in an Increasingly Complex Regulatory Landscape

Data is the new oil. Some have even compared it to nuclear power. Metaphors aside, data and its uses are transforming the world we live in. Algorithms, analytics, and applications are changing every field of human life. But, just like oil, data has its downsides. One major concern about the increasing thirst for data...
Read More

A Friendly Guide to Public Indoor Environment Datasets

Understanding images of indoor home environments is a fundamental task for many applications of computer vision. One of the challenges in advancing computer vision is the availability of suitable datasets on which models can be trained. Particularly useful are public indoor datasets. Here, “indoor” refers to interior spaces such as homes, offices, and other buildings....
Read More

Datagen at the Israel Machine Vision Conference 2020: “Solving the Data Bottleneck Using Simulated Data”

We are excited to share a lecture that Nathan Cavaglione, an algorithm engineer at Datagen, gave at IMVC 2020. IMVC is a conference dedicated to the latest in machine vision. It is a prestigious platform for industry and academic leaders in the field and consistently highlights groundbreaking work by global experts. We are proud to...
Read More

Procedural Modeling: A Brief Introduction

One of the key challenges in the field of computer vision is assembling large enough datasets with enough variance. As we have explored in previous posts, the lack of high-quality labeled training data is a major bottleneck in the nascent field of computer vision. The complexity of image recognition and detection is compounded when objects...
Read More

Data Annotation: Key Concepts, Challenges, and Alternatives

In this piece, let’s explore the world of data annotation. We’ll start with the basics: defining data annotation, discussing different types of annotation and data labeling techniques, surveying industry options, and touching on some of the limits and challenges associated with this process. We’ll then explore the promise and possibilities that Simulated...
Read More

Domain Drift: The Problem and Strategies To Address It

Nothing lasts forever – be it youth, beauty, your laptop’s battery, or the accuracy of machine learning models. We’re not here to troubleshoot your battery issues or offer skincare tips, but we are qualified to explain why your models lose accuracy over time. So, let’s get to it. At the highest level, machine learning models...
Read More