It may not be obvious how the smart face-identification feature on your phone could become the seed of a futuristic AI arms race. Enter “Deepfakes”, the new digital menace on the block that has caused a big stir for this very reason. “Deepfake” is a portmanteau of “deep learning” and “fake”. It refers to a class of neural-network-based algorithms that manipulate as well as generate hyper-realistic images, voices and text passages of anything from inanimate objects and animals to humans.
With rapid advances in artificial intelligence, coupled with the widespread creation of synthetic media through open-source tools, the accuracy of these “Deepfakes” has become alarmingly high. So much so that conventional synthetic-content detection has failed repeatedly, and we have reached something of a full-on deepfake panic mode.
Although these models have found use in a wide variety of applications, from enhancing accessibility through text-to-speech to assisting medical imaging, the community remains largely uncertain about the ways the technology might be abused. Given the recent media coverage of emerging fake videos, this calls for immediate and careful attention from specialists in creating awareness. Most of those videos have been humorous in intent, but in 2018 we got an educative yet spine-chilling sneak peek at what could become a new-age propaganda weapon, when Jordan Peele, posing as Barack Obama, demonstrated how easy it is to deepfake almost anything and gave this speech.
So, where did it all begin?
Strictly speaking, in terms of scientific development the core technology is about five years old, and much of the surrounding research comes from the house of DeepMind, a London-based subsidiary of Alphabet Inc. specializing in all things AI. AlphaFold, a 2018 product from the same company, proved more accurate than its competitors at predicting the three-dimensional structure of proteins from their amino-acid sequences, potentially paving the way to treatments for diseases such as Parkinson’s and Alzheimer’s.
The above is one example of the stepping stones toward what is known as the “holy grail” of deep learning: “General A.I”. DeepMind’s acquisition by Google in 2014 let it bank on Google’s vast stores of raw data and computational power, the essential fodder for an efficient AI model.
The nitty-gritty of deep learning
To understand the problem better, it is essential to delve into how the technology evolved into what it is today. Historically, in the pursuit of Artificial General Intelligence (AGI), there have been two schools of thought among researchers about how to achieve it. On one hand, in the 80s and 90s there was symbolic AI, which entailed hand-writing all the rules a program needs for a system to think like a human. On the other hand, there were those who tried to replicate the brain’s physical structure in network form.
Both approaches soon fell out of favour, as neither captured the broad methodology by which the brain processes information. In layman’s terms, there was a need to emulate the software of the brain rather than its hardware.
Working along those lines, researchers led by Ian Goodfellow (then at the Université de Montréal) invented the Generative Adversarial Network (GAN) in 2014, the most successful generative model as far as the creation of deepfake images goes. It comprises two competing programs acting as “adversaries” to each other: a generator that produces images with the goal of deceiving the discriminator into believing they are real, while the discriminator is rewarded for spotting fakes.
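The adversarial game described above can be sketched in a few lines of numpy. This is a deliberately toy, hypothetical setup, not any production GAN: the “data” are one-dimensional samples from a Gaussian, the generator is a simple affine map, and the discriminator is a logistic regressor, with gradients written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: samples from a Gaussian centred at 4.0 (a toy stand-in for images).
def real_batch(n):
    return rng.normal(4.0, 0.5, size=n)

# Discriminator: logistic regression giving P(sample is real).
def discriminate(x, a, c):
    return 1.0 / (1.0 + np.exp(-(a * x + c)))

w, b = 1.0, 0.0   # generator parameters: fake = w * noise + b
a, c = 0.1, 0.0   # discriminator parameters
lr, n = 0.05, 64

for step in range(2000):
    z = rng.normal(size=n)
    fake = w * z + b
    real = real_batch(n)

    # Discriminator update: rewarded for spotting fakes.
    d_real = discriminate(real, a, c)
    d_fake = discriminate(fake, a, c)
    # Gradient of -[log D(real) + log(1 - D(fake))] w.r.t. a, c
    grad_a = -np.mean((1 - d_real) * real) + np.mean(d_fake * fake)
    grad_c = -np.mean(1 - d_real) + np.mean(d_fake)
    a -= lr * grad_a
    c -= lr * grad_c

    # Generator update: rewarded for fooling the discriminator.
    d_fake = discriminate(w * z + b, a, c)
    # Gradient of -log D(fake) w.r.t. w, b (non-saturating generator loss)
    grad_w = -np.mean((1 - d_fake) * a * z)
    grad_b = -np.mean((1 - d_fake) * a)
    w -= lr * grad_w
    b -= lr * grad_b

# After training, the generator's output distribution drifts toward the real one.
print(round(float(np.mean(w * rng.normal(size=1000) + b)), 1))
```

The same tug-of-war, scaled up to convolutional networks and millions of images, is what produces photorealistic deepfake faces.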
So, what all media has been touched by this technological masterpiece?
In short, everything!
Deepfakes of Text
OpenAI, a company co-founded by Elon Musk to come up with possible counters to an AI apocalypse (should it arrive), has come up with a revolutionary program that can write anything from fiction to news articles on any topic in a shockingly believable and contextually precise manner. It is so convincing that the company initially decided not to release its full model, lest it be abused as well.
Trained on a dataset of roughly eight million web pages linked from Reddit, this new model, called GPT-2, has proved far more general-purpose than any previous model.
It is a no-brainer that, in the wrong hands, these models could generate endless streams of negative reviews, spam and highly realistic false news articles that are hard to authenticate.
Deepfakes of Images
The advent of image synthesis began, oddly enough, with image detection. Algorithms first figured out how to decipher images, and that is when phones started getting unlocked by your face.
In early 2019, deepfakes of images entered a whole new realm when a team of researchers released a paper introducing BigGAN, essentially a GAN trained at large scale on huge databases of millions of diverse images. The new wave of deepfake images owes its existence to this particular piece of technology.
Deepfakes of Videos
DeepMind has also remained at the forefront of synthetic video generation. With efficient modelling techniques and a training dataset compiled entirely from YouTube videos of human actions, its researchers created what they call the Dual Video Discriminator GAN (DVD-GAN). Besides DeepMind, other research facilities have come up with deep learning video models of their own.
Does all this imply that we are simply at the mercy of so-called deepfake content creators?
… Fortunately, the picture is not so grim yet.
Several new and promising methods for correctly identifying deepfakes have been emerging rapidly, such as procedures that add “digital fingerprints” to videos for authentication.
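To make the fingerprinting idea concrete, here is a minimal, hypothetical sketch using nothing but Python’s standard hashing library. Real authentication schemes are more sophisticated (and often embed the fingerprint in the media itself), but the principle is the same: chain the frames through a cryptographic hash so that altering any single frame changes the final fingerprint.

```python
import hashlib

def fingerprint(frames):
    """Chain-hash a sequence of frame byte strings. Each digest folds in
    the previous one, so tampering with any frame (or reordering them)
    changes the final fingerprint."""
    digest = b""
    for frame in frames:
        digest = hashlib.sha256(digest + frame).digest()
    return digest.hex()

# Toy "videos": lists of frame payloads.
original = [b"frame-0", b"frame-1", b"frame-2"]
tampered = [b"frame-0", b"frame-X", b"frame-2"]

print(fingerprint(original) == fingerprint(original))  # True
print(fingerprint(original) == fingerprint(tampered))  # False
```

A publisher would compute the fingerprint at recording time and distribute it out-of-band; anyone can then re-hash a clip and check whether it still matches.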
Increasing cases of fraud, manipulation of video evidence and fabrication of data aimed at maligning authorities have spurred initiatives to establish special legislation. Lawmakers in California, concerned about the possible risks of deepfakes in the 2020 elections, passed a law in October 2019 banning the distribution of “materially deceptive audio or visual media” within 60 days of an election.
Creating awareness seems to be one of the most significant methods of tackling the rampant spread of deepfakes. If more people know that hyper-realistic synthetic media exist and that cases of fraud based on them happen, they will be more skeptical next time and conscious enough to look for anything that seems even a bit off. This would evidently reduce the number of people getting deceived.
The first step is knowing what signs to look for. Researchers at Cornell University, after closely studying deepfake video footage through their own neural network, identified that subjects blink far less in a fake video. They further suggest looking for blurring marks or discolouration at the edges of faces. As far as deepfake images go, experts have also pointed out glaring inconsistencies in synthetic human images: the fakes tend to carry discrepancies such as an earring in only one ear, or abrupt changes in the background behind the subject. But as deepfakes constantly improve, it will only get harder to spot one by human senses alone.
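The blink cue above lends itself to a simple heuristic. The sketch below is an illustrative toy, not the researchers’ actual pipeline: it assumes some upstream face tracker has already produced a per-frame eye-aspect-ratio (EAR) series, where a blink shows up as a brief dip, and simply counts those dips.

```python
def count_blinks(ear_series, threshold=0.2):
    """Count blinks in a series of eye-aspect-ratio (EAR) values.
    A blink is a contiguous run of frames where EAR dips below threshold."""
    blinks, in_blink = 0, False
    for ear in ear_series:
        if ear < threshold and not in_blink:
            blinks += 1          # entering a new dip: one blink
            in_blink = True
        elif ear >= threshold:
            in_blink = False     # eyes open again
    return blinks

# Synthetic EAR traces: open eyes sit around 0.3, blink dips around 0.1.
real_clip = [0.3] * 10 + [0.1] * 3 + [0.3] * 10 + [0.1] * 3 + [0.3] * 10
fake_clip = [0.3] * 36   # deepfake subjects often barely blink

print(count_blinks(real_clip), count_blinks(fake_clip))  # 2 0
```

An abnormally low blink count over a clip of known length and frame rate would then flag the footage for closer inspection.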
This is where researchers are working to devise AI to fight AI. Since the technology for creating deepfakes evolves quickly and becomes ever more accessible to miscreants, building equally powerful detection models is crucial for identifying the truth. Well-made audio and visual deepfakes are hard to catch.
A lack of training data has been a significant obstacle for researchers trying to build effective deepfake detection systems. This is where our best bet in fighting deepfakes turns out to be the major online platforms. Google, Facebook and Twitter have each assembled large databases of fake video content; Twitter is drafting a deepfake policy, and Facebook has begun developing technological solutions to detect deepfakes.
Google, on the other hand, worked with paid actors to record videos and, using several publicly available deepfake generation tools, created thousands of deepfakes, later releasing them to the research community as a training database for building deepfake detection methods.
A less data-hungry detection method has also been proposed, by Dr. Siwei Lyu, Professor of Computer Science at the University at Albany. Current deepfake algorithms generate images at limited resolutions, which are then warped to match the original video, making it possible to identify deepfaked videos by measuring the warping of faces.
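The intuition behind this approach can be illustrated with a toy numpy sketch (a hypothetical simplification, not Dr. Lyu’s actual method): a face patch that was synthesized at low resolution and then scaled up to fit the frame carries less fine detail than a genuinely captured patch, and a crude high-frequency-energy measure can tell the two apart.

```python
import numpy as np

rng = np.random.default_rng(1)

def upscale2x(img):
    # Nearest-neighbour upscaling: a stand-in for the warp that pastes a
    # low-resolution synthesized face onto the higher-resolution frame.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def high_freq_energy(img):
    # Mean squared difference between neighbouring pixels: a crude proxy
    # for the fine detail that a warped low-resolution patch lacks.
    dx = np.diff(img, axis=1)
    dy = np.diff(img, axis=0)
    return float(np.mean(dx ** 2) + np.mean(dy ** 2))

pristine = rng.random((64, 64))             # detailed "face" patch
warped = upscale2x(rng.random((32, 32)))    # low-res patch warped up to size

print(high_freq_energy(pristine) > high_freq_energy(warped))  # True
```

A detector built on this idea needs no deepfake training corpus at all; it only exploits a resolution artifact that the generation pipeline itself introduces.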
So, are the counters all in place?
Far from that…
In all regards, deepfakes are essentially an evolution of a prevalent threat. Industry experts worry about how long it will be before the technology runs on any laptop, letting a cyber-criminal sitting anywhere in the world produce a convincing fake in, say, an hour. It may not be too far-fetched to say that the groundwork for a coordinated deployment of AI fakes against an individual or corporation has been laid, even though we haven’t experienced one yet.
Moreover, there are several obvious legal and ethical conundrums still to be dealt with before any detection infrastructure can be set up. Is it possible to use deepfakes for viably good reasons? How do we differentiate, for example, between satire or parody and content intended to deceive for nefarious rather than artistic or educational ends?
Even though California’s ‘Anti-Deepfake Bill’ has passed and may go on to become a landmark piece of law, one can’t help but ponder: who is responsible for proving that audio or video has been manipulated?
It is fair to say there is still a brief period of suspended action in which we have a chance to define how we, as consumers and media, will deal with deepfakes. An essential first step is to become literate and rightfully aware of this Pandora’s box that is deepfakes.