Anti-spoofing: how do facial recognition systems resist scammers?

In this article I will try to summarize what is known about existing liveness detection methods, which are used to protect facial recognition systems against hacking.

What are we protecting against?

With the development of cloud technologies and web services, more and more transactions are moving online. At the same time, more than 50% of online retail transactions are made from mobile devices.

The growing popularity of mobile transactions is inevitably accompanied by active growth in cybercrime.

Online fraud is 81% more likely than fraud at the point of sale.

In 2017 alone, the personal data of 16.7 million Americans was stolen (Javelin Strategy and Research). Losses from account takeover fraud amounted to $5.1 billion.

In Russia, according to Group-IB, hackers stole more than a billion rubles from owners of Android smartphones in 2017, which is 136% more than a year earlier.

Traditional methods of securing remote authentication, such as security questions or SMS codes, are no longer as reliable, because fraud and social engineering techniques have improved. Here biometrics, and facial recognition in particular, is increasingly coming to the rescue.

According to Acuity Market Intelligence, by 2020 the total volume of biometric transactions, payment and non-payment, will exceed 800 million per year.

Face recognition is usually preferred because it is contactless and requires minimal interaction from the user. At the same time, it is perhaps the most vulnerable to fraud. An image of a person's face is much easier to obtain than other biometric identifiers, such as a fingerprint or an iris. Any photo of the user (taken close-up without the user's consent or found on the Internet) can be used to deceive the system. This kind of attack, in which a fraudster presents a forged identifier in place of the real user, is called spoofing.

Liveness detection methods

From time to time, reports appear on the Internet about yet another successful attempt to deceive a facial recognition system. But do developers and researchers really do nothing to improve the security of these systems? Of course they do. This is how liveness detection technologies appeared, whose task is to verify that the identifier belongs to a "live" user.

There are several classifications of liveness detection methods. First of all, they can be divided into hardware and software.

Hardware methods involve additional equipment, for example infrared cameras, thermal cameras, or 3D cameras. Because of their low sensitivity to lighting conditions and their ability to capture specific differences in images, these methods are considered the most reliable. In particular, according to the results of recent tests, the iPhone X, equipped with an infrared camera, turned out to be the only smartphone that successfully withstood attacks using a 3D face model. The disadvantages of such methods include the high cost of additional sensors and the complexity of integrating them into existing face recognition systems.

Hardware techniques are the perfect solution for mobile device manufacturers.

Unlike hardware methods, software methods do not require additional equipment (they use a standard camera), which makes them more accessible. At the same time, they are more vulnerable to spoofing, since the result of the check depends on factors such as lighting and camera resolution.

So, is it enough to buy a modern smartphone with biometrics and an infrared sensor on board, and the problem is solved? That would be the logical conclusion, if not for one "but". According to forecasts, by 2020 only 35% of authentications will be carried out via biometrics embedded in mobile devices, while biometric mobile applications will be used in 65% of cases. There is one reason: such mobile devices are much more expensive and therefore will not be widely used. This means that the focus is still shifting towards software methods that can work effectively on billions of devices with conventional cameras. Let us look at them in detail.

There are two types of software methods: active (dynamic) and passive (static).

Active methods require the user's cooperation. The system prompts the user to perform certain actions according to instructions, for example to blink, turn the head in a certain way, or smile (a challenge-response protocol). This is the source of their drawbacks. First, the need for cooperation negates an advantage of face recognition as a non-cooperative type of biometric authentication: users do not like to waste time on unnecessary "gestures". Second, if the required actions are known in advance, the protection can be circumvented by playing a video or presenting a 3D replica with simulated facial expressions and movements.
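The second drawback suggests why the requested actions should not be predictable. Below is a minimal sketch of a randomized request-for-action check. The action names, the `make_challenge` and `verify_response` functions, and the 10-second timeout are all hypothetical and chosen only for illustration; recognizing the actions in video is outside the scope of the sketch.

```python
from __future__ import annotations

import secrets

# Hypothetical action names; a real system would map them to detectors.
ACTIONS = ("blink", "smile", "turn_left", "turn_right", "nod")


def make_challenge(length: int = 3) -> list[str]:
    """Draw an unpredictable sequence of actions with no immediate repeats."""
    challenge: list[str] = []
    while len(challenge) < length:
        action = secrets.choice(ACTIONS)
        if not challenge or challenge[-1] != action:
            challenge.append(action)
    return challenge


def verify_response(challenge, observed, elapsed_s: float,
                    timeout_s: float = 10.0) -> bool:
    """Accept only the exact requested sequence, performed within the timeout."""
    return elapsed_s <= timeout_s and list(observed) == list(challenge)
```

A pre-recorded video can only pass if it happens to contain exactly the drawn sequence, which becomes less likely as the sequence gets longer, at the cost of even more user time.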

The essence of such methods is detecting movement in a sequence of input frames in order to extract dynamic features that distinguish real faces from fake ones. The analysis relies on the fact that the movement of flat 2D objects differs significantly from the movement of a real human face, which is a 3D object. Since active methods use more than one frame, they need more time to make a decision. The frequency of facial movements typically ranges from 0.2 to 0.5 Hz, so collecting data for spoofing detection takes more than 3 seconds, whereas human vision, whose abilities these methods essentially imitate, detects movement and builds a structural map of the environment much faster.
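The difference between flat and three-dimensional motion can be illustrated with a toy planarity check. This is a sketch under simplifying assumptions, not a method taken from any particular system: landmarks are assumed to come from an external tracker, projection is assumed orthographic, and the function names are hypothetical. Under these assumptions the landmarks of a flat photo move between two frames according to a single 2D affine map, while the landmarks of a real face leave a residual because of depth.

```python
import numpy as np


def affine_residual(pts_a: np.ndarray, pts_b: np.ndarray) -> float:
    """RMS error left after fitting the best 2D affine map pts_a -> pts_b."""
    design = np.hstack([pts_a, np.ones((pts_a.shape[0], 1))])  # (n, 3)
    params, *_ = np.linalg.lstsq(design, pts_b, rcond=None)    # (3, 2)
    err = design @ params - pts_b
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))


def project_after_yaw(points_3d: np.ndarray, yaw_deg: float) -> np.ndarray:
    """Rotate 3D points about the vertical axis, then drop the depth coordinate."""
    t = np.deg2rad(yaw_deg)
    rot = np.array([[np.cos(t), 0.0, np.sin(t)],
                    [0.0, 1.0, 0.0],
                    [-np.sin(t), 0.0, np.cos(t)]])
    return (points_3d @ rot.T)[:, :2]
```

For a grid of points with zero depth, `affine_residual(points[:, :2], project_after_yaw(points, 15))` is numerically zero, while giving the same grid a curved depth profile yields a clearly positive residual. Real video adds perspective, noise, and non-rigid expressions, so a practical threshold would have to be tuned on data.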

Unlike active methods, passive methods do not require user participation and rely on analyzing a single 2D image, which gives a quick response and is convenient for the user. The most widely used are methods based on the Fourier spectrum (which look for differences in the intensity of light reflected by 2D and 3D objects) and methods that extract properties of image textures. The effectiveness of these methods decreases when the direction and brightness of the lighting change. In addition, modern devices can display images in high resolution and natural color, which makes it possible to fool the system.
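As an illustration of a Fourier-spectrum feature, here is a minimal sketch of one common formulation: the share of spectral energy at high spatial frequencies, which tends to be lower for blurry re-captured images. The article does not specify which descriptor is used in practice, so the function name `high_frequency_ratio` and the `cutoff` value are assumptions made for the example.

```python
import numpy as np


def high_frequency_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Share of non-DC spectral energy above `cutoff` (1.0 = Nyquist along an axis)."""
    h, w = gray.shape
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(float)))
    energy = np.abs(spectrum) ** 2
    # Frequency grids in cycles per pixel, shifted so that zero is at the center.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2) / 0.5
    energy[h // 2, w // 2] = 0.0  # ignore the DC term (mean brightness)
    total = energy.sum()
    return float(energy[radius > cutoff].sum() / total) if total > 0 else 0.0
```

A single number like this is sensitive to lighting and to the quality of the presented image, which matches the limitations described above; in practice it would be one feature among several fed to a classifier.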

What's better?

The table briefly presents the key characteristics of the main categories of methods. I will not describe the methods in each category: there are many of them, and they differ in the algorithms used and in how those algorithms are combined.

Method category	Principle of operation	Benefits	Limitations
Methods based on movements (facial expressions), or temporal methods (dynamic, less often static)	Capturing involuntary muscle movements or actions performed on request	Good generalization ability *	- Low reliability; - slow response (> 3 sec.); - high computational complexity; - effective only against photos and 2D masks.
Texture analysis methods (static)	Search for texture features characteristic of a printed face (blur, printing defects, etc.)	- Fast response (< 1 sec.); - only one image is required; - low computational complexity; - low cost; - non-invasive.	- Low generalization ability; - vulnerable to attacks with high-resolution video.
Methods based on image quality analysis (static)	Comparing the image quality of a real face and a fake 2D image (distortion analysis, analysis of the specular reflection distribution)	- Good generalization ability; - fast response (< 1 sec.); - low computational complexity.	- Different classifiers are required for different types of spoofing attacks; - vulnerable to modern devices.
Methods based on 3D facial structure (dynamic)	Capturing differences in the properties of the optical flow generated by three-dimensional objects and two-dimensional planes (motion trajectory analysis, depth map construction)	High reliability (against both 2D and 3D attacks)	- Slow response (> 3 sec.); - sensitivity to lighting and image quality.
Multimodal methods (static and dynamic)	Combination of two or more biometric methods	- High reliability; - versatility (choice of modality).	- Slow response (> 3 sec.); - a choice of modality lets an attacker pick the one that is easiest to attack; - the complexity of combining features extracted by different methods.
Methods using inertial sensors (dynamic)	Checking that facial movements correspond to camera movement, using the built-in sensors of a mobile device (accelerometer and gyroscope)	- High reliability (against 2D attacks); - the necessary sensors are already built into smartphones.	- Slow response (> 3 sec.); - the result depends on the measurement accuracy of the sensors; - sensitivity to lighting, occlusion and facial expressions.

* The ability of the model to work effectively in cases that go beyond the training examples (for example, when the conditions of template enrollment change: lighting, noise, image quality).

Methods of different types can be combined, but because of the time needed to process the various parameters, the detection efficiency of such hybrid methods leaves much to be desired.
The picture of their use in modern face recognition systems is approximately as follows *:

* According to the analysis of systems of more than 20 vendors.

As can be seen from the graph, dynamic methods prevail, with the emphasis on requesting an action from the user. This choice is probably based on the assumption that typical attackers have limited technical skills and simple tools. In practice, the development of technologies and their growing availability lead to the emergence of more sophisticated spoofing methods.

An example of this is the report by researchers from the University of North Carolina, who managed to deceive five face recognition algorithms using textured 3D models of volunteers' heads, created on a smartphone from studio photos and photos from social networks, together with virtual reality technology to simulate movements and facial expressions. The deceived systems relied precisely on the analysis of user actions (either building a structure or simply checking for movement); at least, the vendors declared no other methods at that time.

The FaceLive method, however, which was not used in face recognition systems at that time, let the attacks through in only 50% of cases. Its liveness detection mechanism measures the similarity between changes in the direction of movement of the mobile phone, as recorded by the accelerometer, and changes in facial landmarks (nose, eyes, etc.) observed in the camera video. A live user is detected if the changes in head position in the face video are consistent with the movements of the device. The disadvantages of the method include its dependence on the accuracy of the device's inertial sensors, on the level of illumination, and on the user's facial expressions, as well as the long duration of the procedure.
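The consistency check between device movement and observed head movement can be sketched as a simple correlation. This is only an illustration of the idea, not the FaceLive algorithm itself: the function names and the 0.8 threshold are assumptions, and both signals are assumed to be already resampled to the same rate and expressed with the same sign convention.

```python
import numpy as np


def motion_consistency(device_motion, head_motion) -> float:
    """Pearson correlation between two equally sampled 1D motion signals."""
    a = np.asarray(device_motion, dtype=float)
    b = np.asarray(head_motion, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def is_live(device_motion, head_motion, threshold: float = 0.8) -> bool:
    """Treat the face as live when both motions agree closely enough."""
    return motion_consistency(device_motion, head_motion) >= threshold
```

A replayed video carries head movements that are unrelated to how the attacker moves the phone, so the correlation stays low; noisy sensors or poor landmark tracking lower it for genuine users as well, which is the accuracy dependence mentioned above.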

According to the authors of the report, blood flow analysis, light projection, and the use of an infrared camera can successfully resist attacks that use a 3D model imitating facial expressions and movements.

Blood flow analysis looks for the periodic changes in skin color caused by heart contractions. Fake images reproduce these color changes worse.
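A minimal sketch of how such a periodic color change might be looked for: take the mean green-channel value of the face region in each frame and find the strongest frequency within a plausible heart-rate band. The use of the green channel, the 0.7 to 4.0 Hz band, and the function name are assumptions for the example; face detection and region extraction are not shown.

```python
import numpy as np


def dominant_pulse_hz(green_means, fps: float, band=(0.7, 4.0)):
    """Return (peak frequency in Hz, its share of in-band power) for the signal."""
    x = np.asarray(green_means, dtype=float)
    x = x - x.mean()                       # remove the constant skin tone
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    band_power = power[in_band]
    if band_power.size == 0 or band_power.sum() == 0:
        return 0.0, 0.0
    peak = int(np.argmax(band_power))
    return float(freqs[in_band][peak]), float(band_power[peak] / band_power.sum())
```

A live face would be expected to show one clear peak that dominates the band, while a print or a mask gives a spectrum without a pronounced peak. The frequency resolution is `fps / number_of_frames`, so several seconds of video are needed for a usable estimate.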

With light projection, a light source built into the device or an external one emits flashes at random intervals. To cheat, the 3D rendering system must be able to render the projected lighting patterns on the model quickly and accurately. The requirement for additional equipment is a significant limitation.
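The idea of random flashes can be sketched as follows: generate an unpredictable on/off pattern, then check whether the brightness of the face region actually follows it. The per-frame on/off model, the function names, and the way the response is normalized are assumptions made for illustration; controlling the light source and measuring brightness are not shown.

```python
import numpy as np


def flash_schedule(n_frames: int, rng: np.random.Generator) -> np.ndarray:
    """Random on/off pattern (n_frames >= 2) with both lit and unlit frames."""
    while True:
        pattern = rng.integers(0, 2, size=n_frames).astype(bool)
        if pattern.any() and not pattern.all():
            return pattern


def flash_response(pattern, brightness) -> float:
    """Brightness gap between lit and unlit frames, in units of the overall std."""
    lit = np.asarray(pattern, dtype=bool)
    b = np.asarray(brightness, dtype=float)
    spread = b.std()
    if spread == 0:
        return 0.0
    return float((b[lit].mean() - b[~lit].mean()) / spread)
```

A real face lit by the flashes gives a large positive response, while a recording made under a different pattern gives a response near zero. The threshold separating the two would have to be chosen empirically.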

The report was published in 2016, and some algorithms have been improved since then. For example, some vendors claim that their systems can successfully resist attacks using 3D masks.

Apple and Microsoft are examples of a serious attitude towards the reliability of the technology. Face ID at one time helped draw the attention of a wide audience to face recognition, demonstrating what the future of personal data security might look like. But soon after the launch, dozens of videos (mostly fake) appeared claiming to show the technology being fooled. In 2017, Windows Hello face recognition was tricked with a printed image. Returning to the results of the Forbes tests, it can be said that the companies have done a lot of work since then, as a result of which their systems were not hacked.

I personally have not seen any real examples of facial recognition systems being hacked for the purpose of committing a crime, in contrast to, say, systems based on fingerprint scanning. That is, all hacking attempts were made either to test reliability or to discredit the technology. Of course, facial recognition systems are not as common as fingerprint scanning systems, but they are still used, including in banks, where security receives maximum attention.

Let's sum up

Developers of face recognition systems are, of course, concerned about security, and all vendors offer protection against spoofing (or at least declare it). The exception is some mobile device manufacturers, but they usually warn that face recognition can be deceived and offer it as an additional protection factor.
Traditional methods tend to be subject to limitations such as dependence on lighting conditions, response speed, interactivity, or high cost. Therefore, algorithms need to be improved in order to improve the usability of recognition systems.
Future protection mechanisms should anticipate the development of spoofing technologies and adapt quickly to new threats.
The introduction of modern algorithms will make fraud an "expensive pleasure" and therefore impractical for most attackers: the more technical tools and skills an attack requires, the more protected users can feel.
The presence of new algorithms in the graph of the use of various methods, albeit in small proportions, indicates that vendors are searching for more effective means of protection against spoofing. Companies are experimenting, often offering not one but several liveness detection methods, which inspires optimism about the future of face recognition systems.

Source: https://habr.com/ru/post/436700/
