Facial Recognition Algorithms and Libraries You Should Know

What Is a Facial Recognition Algorithm?

Face recognition is a technology that can identify the face of an individual whose image is stored in a dataset. Although other identification methods may be more accurate, facial recognition has been an important focus of research because it is easy to implement, convenient, and non-obtrusive.

A face recognition algorithm is a basic component of a face detection and recognition system. Face recognition algorithms typically perform the following main tasks: 

  • Detect faces in images, videos or live streams
  • Compute a mathematical model of the face image
  • Compare the model derived from a face to the models of images in a training set or database
  • Evaluate the comparison to decide whether the face belongs to the target individual
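
The last two tasks, comparison and evaluation, can be sketched as follows. This is a minimal illustration, not the method of any particular system: it assumes each face has already been turned into a numeric feature vector by some model, and the function name `identify`, the toy two-dimensional vectors, and the distance threshold are all hypothetical.

```python
import numpy as np

def identify(probe, gallery, threshold=1.0):
    """Return the name of the closest enrolled face, or None if nothing is close enough.

    probe: feature vector of the query face (assumed output of some face model).
    gallery: dict mapping a person's name to an enrolled feature vector.
    threshold: largest Euclidean distance accepted as a match (assumed value).
    """
    best_name, best_dist = None, float("inf")
    for name, enrolled in gallery.items():
        dist = float(np.linalg.norm(np.asarray(probe) - np.asarray(enrolled)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Evaluate the comparison: accept the best candidate only if it is close enough.
    return best_name if best_dist <= threshold else None

# Toy gallery with made-up vectors, for illustration only.
gallery = {"alice": np.array([0.0, 1.0]), "bob": np.array([1.0, 0.0])}
print(identify(np.array([0.1, 0.9]), gallery))  # alice
print(identify(np.array([5.0, 5.0]), gallery))  # None
```

In practice the threshold would be tuned on labeled data, since it controls the trade-off between false matches and missed matches.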

Challenges of Face Recognition

Subtle variations in lighting conditions can challenge an automated facial recognition algorithm and skew the results – even if the person has a similar pose and expression. 

Illumination can significantly change a face’s appearance. In many cases, two images of the same face under different lighting differ more than images of two different individuals under the same lighting.

Facial recognition algorithms are also sensitive to variations in angles or poses. An individual’s pose changes based on head movements and camera positions. Using a different camera angle or pose alters the overall facial appearance, creating variations that impact the success of the facial recognition system. For example, if the database only contains frontal views of a subject, the algorithm might struggle to identify a face seen at a large rotation angle, generating a flawed result or failing to recognize an identity altogether.

Another complicating factor is facial expressions, especially macro-expressions like happy, sad, angry, surprised, or afraid. More subtle changes include micro-expressions, which are involuntary, rapid facial movements. An individual’s emotional state influences their expressions (both macro and micro), potentially skewing the results of a facial recognition system. In addition, a face’s appearance can be changed by make-up and accessories such as eyeglasses or earrings, which can also make face recognition more difficult.

Resolution is also a significant factor. Low-resolution images can be difficult for face recognition algorithms to interpret. For example, closed-circuit television (CCTV) cameras often generate face images only 16×16 pixels in size – these images offer limited visual information and typically cannot be successfully analyzed by face recognition algorithms. A low-resolution image may also capture only a portion of the face, making it harder to recognize. Most face recognition algorithms require at least 50×50 pixels for effective analysis.
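
A simple way to act on this is to discard detected faces that are too small before attempting recognition. The sketch below assumes face detections are given as `(x, y, width, height)` tuples in pixels; the function name `usable_faces` is hypothetical, and the default of 50 pixels is taken from the guideline above rather than from any specific library.

```python
def usable_faces(boxes, min_side=50):
    """Keep only face boxes whose width and height are both at least min_side pixels.

    boxes: iterable of (x, y, width, height) tuples (assumed detector output format).
    """
    return [box for box in boxes if box[2] >= min_side and box[3] >= min_side]

# A 16x16 face is dropped, a 64x80 face is kept.
detections = [(0, 0, 16, 16), (10, 10, 64, 80)]
print(usable_faces(detections))  # [(10, 10, 64, 80)]
```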

Deep Learning Face Recognition: Algorithms and Libraries

FaceNet

FaceNet is an algorithm based on a deep convolutional neural network (CNN), which can be used for face recognition, verification and clustering. 

FaceNet works by mapping face images into a Euclidean space, in such a way that the distance between images corresponds to similarity (the nearer two images, the more similar they are considered to be). FaceNet is trained using images that are scaled, transformed, and cropped around the face area.
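
The idea that distance corresponds to similarity can be illustrated with a short verification sketch. It assumes the embeddings come from an already trained model (the three-dimensional vectors here are made up), and it normalizes them to unit length, as FaceNet embeddings are. The helper names and the threshold value are illustrative assumptions; a real threshold would be chosen on validation data.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length."""
    v = np.asarray(v, dtype=np.float64)
    return v / np.linalg.norm(v)

def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings after normalization."""
    diff = l2_normalize(a) - l2_normalize(b)
    return float(diff @ diff)

def same_person(a, b, threshold=1.1):
    """Treat two faces as the same identity if their embeddings are close enough."""
    return squared_distance(a, b) < threshold

# Made-up embeddings: the first pair points in nearly the same direction.
print(same_person([1.0, 0.1, 0.0], [1.0, 0.0, 0.1]))   # True
print(same_person([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]))  # False
```

Because both vectors have unit length, the squared distance always falls between 0 (identical direction) and 4 (opposite direction).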

Unlike previous approaches, FaceNet learns mappings from the images and creates embeddings directly, rather than using an additional layer for recognition or verification. A major advantage is that the resulting representation is extremely compact: each face is described using only 128 bytes of data.
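
To see how a face can fit in 128 bytes, consider a 128-dimensional embedding stored with one byte per dimension. The sketch below is only one possible quantization scheme, assumed here for illustration and not taken from the paper; the function names are hypothetical, and the embedding is random rather than produced by a real model.

```python
import numpy as np

def quantize_embedding(embedding):
    """Store each component of a unit-length embedding in one unsigned byte."""
    e = np.clip(np.asarray(embedding, dtype=np.float64), -1.0, 1.0)
    # Components of a unit vector lie in [-1, 1]; rescale them to [0, 255].
    return np.round((e + 1.0) * 127.5).astype(np.uint8)

def dequantize_embedding(packed):
    """Approximately recover the float components from the stored bytes."""
    return packed.astype(np.float64) / 127.5 - 1.0

# Random stand-in for a 128-dimensional face embedding, scaled to unit length.
rng = np.random.default_rng(0)
embedding = rng.normal(size=128)
embedding /= np.linalg.norm(embedding)

packed = quantize_embedding(embedding)
print(packed.nbytes)  # 128
```

The round trip loses a little precision per component, which is the price of the compact storage.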

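In the FaceNet paper, the researchers evaluated the model on the Labeled Faces in the Wild (LFW) and YouTube Faces DB datasets, achieving accuracy of 99.63% and 95.12% respectively, and reducing the error rate by 30% compared to the best previously published result on both datasets.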

Read the paper: Florian Schroff, Dmitry Kalenichenko, James Philbin, 2015

ArcFace

ArcFace is a loss function used to train a model to separate a predefined set of classes (identities). A backbone trained with ArcFace is then used to extract a feature space in which downstream tasks such as face verification and identification are possible. It is useful for face search and recognition applications.

ArcFace uses similarity learning, which solves classification tasks by learning distance metrics. It replaces the standard Softmax loss with an additive angular margin loss, so that the distance between face images is measured as the angle between their embeddings.

The loss function is a fraction with two parts, the numerator and the denominator. Because we are minimizing the loss, and the loss is the negative logarithm of this fraction, we want to increase the numerator and decrease the denominator:

  • In the numerator, the cosine similarity between the normalized embedding and the normalized weight of its class is calculated as the inner product of the two vectors. The closer the two vectors are to pointing in the same direction, the closer the cosine similarity is to 1; as they approach orthogonality, it falls toward 0. Thus, the smaller the angle between the two vectors, the larger the numerator and the smaller the loss.
  • In the denominator, we want to minimize the cosine similarity between the embedding and the weights of all the other classes.

Thus, we get a loss term which demands closeness to the mean of the class, and distance to all the other classes.
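
For reference, the per-sample loss described above can be written as follows. The notation comes from the ArcFace paper rather than from this article: θ_j is the angle between the normalized embedding and the normalized weight of class j, y is the true class, s is a scale factor, and m is the additive angular margin.

```latex
L = -\log \frac{e^{s \cos(\theta_{y} + m)}}
               {e^{s \cos(\theta_{y} + m)} + \sum_{j \neq y} e^{s \cos \theta_{j}}}
```

Setting m = 0 recovers a Softmax loss on scaled cosine similarities, so the margin m is what forces each embedding to sit closer to its own class weight than plain classification would require.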


In most classification tasks, the final fully connected (FC) layer computes the inner products of its weights and the extracted features, and Softmax is then applied to the output.

ArcFace creates an embedding space with sufficient distance between different classes. Each class occupies a more compact region of the space, so the classes are better separated.
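
The difference between a plain Softmax layer and the angular margin can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the reference implementation: the function name `arcface_loss` is hypothetical, the defaults `s=64.0` and `m=0.5` follow values commonly reported for ArcFace, and the toy data is made up.

```python
import numpy as np

def arcface_loss(embeddings, weights, labels, s=64.0, m=0.5):
    """Mean additive angular margin loss over a batch.

    embeddings: (N, D) features; weights: (C, D), one row per class;
    labels: (N,) integer class ids; s: scale; m: angular margin in radians.
    """
    # Normalize both sides so that inner products equal cosine similarities.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(x @ w.T, -1.0, 1.0)            # shape (N, C)
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    logits = s * cos
    # Add the margin to the angle of the true class only.
    logits[rows, labels] = s * np.cos(theta[rows, labels] + m)
    # Numerically stable log-softmax, then pick the true-class term.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, labels].mean())

# Toy batch: two samples, two classes, each sample close to its own class weight.
emb = np.array([[1.0, 0.1], [0.1, 1.0]])
w = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
plain = arcface_loss(emb, w, labels, s=4.0, m=0.0)     # Softmax on scaled cosines
margin = arcface_loss(emb, w, labels, s=4.0, m=0.5)    # with angular margin
print(margin > plain)  # True
```

With the margin enabled, the same embeddings incur a higher loss, so training has to pull them closer to their class weights to reach the same loss value. A small scale is used in the example only so that the two values are distinguishable in floating point.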

Read the paper: Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou, 2018

face.evoLVE

face.evoLVE is a popular and actively developed open source library that is primarily used for frontal face recognition. It provides all key components of face analytics, including:

  • Face alignment
  • Data processing
  • Backbones
  • Loss functions
  • Optimizations to improve performance

It provides multiple deep learning methods for face recognition, and supports multi-GPU training with PyTorch and PaddlePaddle, making it convenient to work with large-scale datasets, as well as low-shot databases with limited data. 

Another important feature of face.evoLVE is that it provides images from common face benchmark datasets, before and after alignment, making it much easier to test models developed by library users.

Get the library on GitHub

OpenFace

OpenFace is a tool for computer vision researchers building applications based on facial analysis and recognition. It provides the following face analysis capabilities:

  • Facial landmark detection: based on the paper “Convolutional Experts Constrained Local Model” (Zadeh et al., 2017)
  • Facial action unit recognition: based on the paper “Cross-Dataset Learning and Person-Specific Normalization for Automatic Action Unit Detection” (Baltrušaitis et al., 2015)
  • Eye gaze estimation: based on the paper “Rendering of Eyes for Eye-Shape Registration and Gaze Estimation” (Wood et al., 2015)
  • Head pose estimation

Importantly, the toolset is optimized for real-time performance and works with input from a standard webcam.

Get the tool on GitHub