A few weeks ago, a deepfake video call caught several politicians off guard and fooled them badly. Such incidents are not isolated: cybercriminals are using these particularly tricky attacks more and more in their day-to-day operations. According to VMware’s 2022 Global Incident Response Threat Report, 66% of incident responders reported seeing deepfake attacks in the last 12 months.
What exactly are deepfakes?
The name combines two terms: “deep learning” and “fake”. The first term refers to a type of machine learning, a branch of artificial intelligence (AI), that analyses data (in this case often audio-visual media) at multiple levels (“deep”) and extracts patterns from it.
Using these patterns, obtained for example from a video, it is then possible to create a particularly accurate fake with different content or new statements. For example, existing videos of a celebrity could be used to build a 3D face model that replicates realistic movements as well as familiar behaviours and expressions. The result is a deepfake.
The same is possible with audio data. Thanks to advances in artificial intelligence, creating such fakes has even become very easy.
How are deepfakes used?
Cyber-attacks mostly use AI-generated media to imitate the appearance or voice of a real person. According to the statistics, video forgeries slightly predominate, with the rest involving audio.
The delivery channels for deepfake attacks vary. Email once again tops the list of the most frequently used gateways, followed by mobile messaging, voice calls and social media. According to the report, attackers are also increasingly using third-party meeting apps and business collaboration tools.
How can deepfakes be identified?
The German Federal Office for Information Security (BSI) has set up an information page to raise users' awareness of the dangers of deepfake attacks and to help them recognise the methods used.
Real-time deepfake videos in particular usually contain so-called artefacts, i.e. artificial changes or aberrations. These can be, for example, blurred transitions at the edge of the inserted face or blurred contours. Limited facial expressions, limited head movement or inconsistent lighting can also indicate a deepfake.
In voice transmissions, a metallic sound, monotone pronunciation, or oddities in the rendering of other languages or accents may be noticeable. Unusual sounds or delayed responses may also indicate that speech technologies are being used to create a deepfake. However, the absence of these example features does not automatically mean that the material is genuine. Technologies and attack methods continue to develop and will make deepfakes even harder to detect in the future.
Lessons learned and outlook
Although deepfakes are by no means flawless, they repeatedly appear in the context of social engineering attacks. The purpose is partly to defraud, but also to manipulate people into taking ill-considered actions. If the victim resets a password or clicks on a link, the attacker has usually achieved their goal. Awareness and user training are essential to avert this danger. We expect deepfake attacks to become even more significant in the future.