
Mental Models of Adversarial Machine Learning

Presented at:

A paper we co-authored, which studies practitioners’ perception of vulnerabilities in AI systems, was accepted at the USENIX Symposium on Usable Privacy and Security.



The Challenge:


Although machine learning is widely used in practice, little is known about practitioners’ understanding of potential security challenges. Tampering with a few input features often suffices to change a classifier’s output to a class chosen by the adversary. Analogously, slightly altering the training data enables the attacker to decrease the performance of the classifier.


A different change to the training data allows the attacker to enforce a particular output class whenever a specified stimulus is present. Most state-of-the-art attacks and mitigations are locked in an ongoing arms race. Although previous work shows that practitioners are concerned about adversarial machine learning (AML) and that failures already occur, very little is known about ML security in practice.
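As a rough illustration of the first attack described above (tampering with features to force a chosen output class), the following Python sketch perturbs the input to a toy linear classifier. It is not taken from the paper; the weights, bias, input, step size, and the helper names predict and evade are all hypothetical and chosen only to make the effect visible.

```python
# Toy linear classifier: class 1 if the weighted sum is positive, else class 0.
# All numbers below are made up for illustration.

def predict(weights, bias, x):
    """Return the predicted class (0 or 1) for feature vector x."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def evade(weights, x, target, step):
    """Shift every feature by at most `step` toward the adversary's target class.

    For a linear model, moving each feature in the direction of the sign of
    its weight raises the score; moving against it lowers the score.
    """
    direction = 1 if target == 1 else -1
    return [xi + direction * step * sign(w) for w, xi in zip(weights, x)]


weights = [2.0, -1.0, 0.5]   # hypothetical trained weights
bias = -0.5                  # hypothetical bias
x = [0.2, 0.4, 0.1]          # benign input, classified as class 0

x_adv = evade(weights, x, target=1, step=0.2)

print("original class:   ", predict(weights, bias, x))
print("adversarial class:", predict(weights, bias, x_adv))
```

In this toy setting each feature moves by roughly 0.2 at most, yet the predicted class flips from 0 to 1. Attacks on real models are considerably more involved, so this should be read as a sketch of the idea only.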


QuantPi’s Contribution:


Together with Kathrin Grosse and Battista Biggio (University of Cagliari) as well as Michael Backes and Katharina Krombholz (CISPA Helmholtz Center for Information Security), QuantPi's Lukas Bieringer conducted a qualitative study focusing on developers’ mental models of the machine learning pipeline and its potentially vulnerable components. After giving an informal overview of concrete attacks discussed in prior research on adversarial machine learning (e.g. poisoning, model stealing) and recalling prior applications of the mental model concept in information security, the paper reports in detail on semi-structured interviews and drawing tasks conducted with ML practitioners.


The study reveals two facets of practitioners’ mental models of machine learning security. Firstly, practitioners often confuse machine learning security with threats and defenses that are not directly related to machine learning. Secondly, in contrast to most academic research, the participants perceive the security of machine learning not as a property of individual models alone, but in the context of entire workflows that consist of multiple components. Together with the additional findings of the study, these two facets provide a foundation for substantiating mental models of machine learning security. They also have implications for integrating adversarial machine learning into corporate workflows, reducing practitioners’ reported uncertainty, and designing appropriate regulatory frameworks for machine learning security.



More Information:
