Imagine a perfect world where your smart alarm clock calculates the time you should wake up based on weather conditions and traffic congestion. When you wake up, your smart closet has already prepared clothes appropriate for your meeting, based on what you wore in similar situations. Your smart kitchen has breakfast ready, taking into account your doctor’s low-carb diet recommendation. And when you travel to your destination, all you need to do is start your smart car.
Sadly, the real world isn’t perfect. Machine learning (ML), which powers most of these smart devices, can be fooled, too. And when it is, bad things can happen.
What is Adversarial Machine Learning?
Adversarial machine learning is a technique that aims to fool ML models into making mistakes. It can thus be considered a form of attack in which an attacker adds carefully crafted “adversarial noise” to a model’s input, causing it to misbehave.
Researchers studying this field call these crafted inputs “adversarial examples.” Although the noise in them is deliberately constructed rather than naturally occurring, its effect on an ML model is practically the same as the real thing.
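The core idea can be sketched with a toy example. Here, a hypothetical linear classifier (the weights and inputs below are made up purely for illustration) flips its prediction when a small, deliberately chosen perturbation is added to the input:

```python
import numpy as np

# A toy linear "model" with hypothetical weights, not a real trained network:
# it assigns class 1 whenever w . x + b > 0.
w = np.array([1.0, -1.0])
b = 0.0

def predict(x):
    return int(w @ x + b > 0)

# A clean input that the model places in class 1.
x_clean = np.array([0.3, 0.1])

# "Adversarial noise": a small step in the direction that most lowers the
# model's score, i.e. against the sign of the weights.
epsilon = 0.25
x_adv = x_clean - epsilon * np.sign(w)

print(predict(x_clean), predict(x_adv))  # the small perturbation flips the label
```

The perturbation is tiny relative to the input, yet it is aimed exactly where the model is most sensitive, which is what makes adversarial noise so much more effective than random noise.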
How does Adversarial Machine Learning Work?
Christian Szegedy, a research scientist, and his colleagues first described adversarial examples. They illustrated how an image classification system could be fooled into thinking that a bus is an ostrich by adding carefully chosen noise.
In follow-up work, Szegedy and his colleagues, notably Ian Goodfellow, demonstrated how a deep neural network identified a panda as a gibbon after a tiny, carefully computed perturbation, imperceptible to the human eye, was added to the image.
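The panda-to-gibbon result used what is now called the fast gradient sign method (FGSM): take one small step along the sign of the loss gradient with respect to the input. A minimal sketch, using a hypothetical logistic-regression “classifier” with made-up weights standing in for a real deep network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical fixed weights standing in for a trained classifier
# (class 1 = "panda", class 0 = "not panda" in this toy setup).
w = np.array([0.5, -1.0, 2.0, 0.8])
x = np.array([1.0, -0.5, 0.5, 0.2])   # stand-in for image pixels
y = 1.0                                # true label: "panda"

p_clean = sigmoid(w @ x)               # model is confident it sees a panda

# FGSM: the gradient of the cross-entropy loss w.r.t. the INPUT x
# works out to (p - y) * w for this logistic model.
grad_x = (p_clean - y) * w

# Take one epsilon-sized step along the sign of that gradient.
epsilon = 0.6
x_adv = x + epsilon * np.sign(grad_x)
p_adv = sigmoid(w @ x_adv)             # confidence drops below 0.5: label flips
```

Note that epsilon is exaggerated here so the flip is visible in four dimensions; with real images, which have many thousands of pixels, a far smaller per-pixel step suffices, which is why the perturbation can be invisible to humans.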
The video below shows an image that Google’s classifier normally identifies as a turtle. After adversarial noise was injected, the turtle was identified as a rifle.
How Dangerous is Adversarial Machine Learning?
The examples we have tackled so far are somewhat funny: a bus identified as an ostrich, a panda mistaken for a gibbon, and a turtle tagged as a rifle. Imagine a photo of you that Facebook mistakenly identifies as your mom. That would be a fun and endearing addition to dinner conversations.
Adversarial machine learning can also be used in the context of natural language processing (NLP). Adversarial examples have caused word vector models to make incorrect predictions when completing sentences. At best, that results in funny sentences; at worst, in poorly written documents.
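In the NLP setting, an analogous attack swaps a word for a near-synonym that the model happens to score very differently. A toy sketch, assuming a hypothetical bag-of-words sentiment model whose per-word scores are made up for illustration:

```python
# Hypothetical per-word sentiment scores (invented for this toy model).
scores = {"great": 2.0, "good": 1.5, "okay": -0.2, "bad": -2.0}

def classify(sentence):
    # Bag-of-words "model": sum the scores of known words.
    total = sum(scores.get(word, 0.0) for word in sentence.lower().split())
    return "positive" if total > 0 else "negative"

clean = classify("the movie was good")   # positive
# Adversarial substitution: to a human, "okay" reads much like "good",
# but the model's score for it happens to be slightly negative.
attacked = classify("the movie was okay")  # negative
```

Real attacks on NLP systems search automatically for such substitutions, but the principle is the same: a change that barely alters the meaning for a human reader pushes the model across a decision boundary.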
But put in a different context, adversarial noise can be dangerous. For instance, researchers tricked Tesla’s self-driving system into interpreting small stickers on the road as lane markings. As a result, the vehicle could swerve into the lane of approaching traffic.
An actual experiment by McAfee technicians also fooled a 2016 Tesla Model S and Model X, causing the cars to exceed the speed limit. They did it by simply putting a strip of electrical tape across a speed limit sign. Instead of 35 miles per hour, the cars’ camera system read the sign as 85 miles per hour.
Tesla no longer manufactures the 2016 Model S and Model X, and it does not use the same vision technology in newer models.
Applying adversarial noise to medical imaging detection systems could result in misdiagnosis as well. In one experiment, researchers found that adversarial examples caused images of skin lesions to be misclassified, leading to incorrect diagnoses.
AI is still in its infancy and, by extension, so are ML, neural networks, and other AI applications. Adversarial examples show that there is a lot more to learn, explore, and improve.
The discovery of adversarial examples in ML has also paved the way for robustness testing. So, in a way, adversarial machine learning has pushed AI researchers to add another layer of testing to their quality assurance processes, making AI systems more secure.