The Role of Data Geometry in Adversarial Machine Learning

Abstract

In this talk, I establish the critical role that data geometry plays in adversarial machine learning, a rapidly developing field concerned with the impact of malicious actors on machine learning systems. I will first provide a brief overview of adversarial machine learning before delving deeper into evasion attacks, the main threat I focus on. The first part of the talk will demonstrate how black-box attacks, made query-efficient through the use of data geometry, expose security vulnerabilities in deployed machine learning systems. Having established the threat, I will examine the use of linear transformations of data as a defense against evasion attacks. I will show that reducing the dimension of data with Principal Component Analysis and subsequently training classifiers on the reduced data is an effective defense, even against adaptive white-box attacks with full knowledge of the defense. The third part of the talk will step away from the attack-defense arms race to establish fundamental bounds on learning using optimal transport. These bounds critically rely on the geometry of the data, and I will show instantiations of these bounds for the special cases of Gaussian and empirical data distributions. Finally, I will conclude by discussing current and future research directions arising from my dissertation.
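
The defense described above amounts to projecting the data onto its top principal components before training. The following is a minimal sketch of that pipeline, assuming scikit-learn; the dataset (load_digits), the classifier (LogisticRegression), and the reduced dimension N_COMPONENTS are illustrative assumptions, not details from the dissertation.

# Minimal sketch of the PCA-based defense: project the data onto its top
# principal components, then train a classifier on the reduced representation.
# Dataset, classifier, and N_COMPONENTS are illustrative assumptions.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

N_COMPONENTS = 20  # assumed reduced dimension; would be tuned per task

X, y = load_digits(return_X_y=True)  # 64-dimensional digit images
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit PCA on the training data, then train the classifier on the projection.
defended_clf = make_pipeline(
    PCA(n_components=N_COMPONENTS),
    LogisticRegression(max_iter=1000),
)
defended_clf.fit(X_train, y_train)
print("accuracy on PCA-reduced data:", defended_clf.score(X_test, y_test))

Note that this sketch only illustrates the training pipeline; evaluating robustness against adaptive white-box attacks, as discussed in the talk, requires attacking the composed PCA-plus-classifier model directly.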