ML Seminar - Robust Distributed Training! But at What Cost?
Abstract: In this talk, we aim to quantify the robustness of distributed training against worst-case failures and adversarial nodes. We show that there is a gap between robustness guarantees, depending on whether adversarial nodes have full control of the hardware, the training data, or both. Using ideas from robust statistics and coding theory we establish robust and scalable training methods for centralized, parameter server systems. Perhaps unsurprisingly, we prove that robustness is impossible when a central authority does not own the training data, e.g., in federated learning systems. We then provide a set of attacks that force federated models to exhibit poor performance on either the training, test, or out-of-distribution data sets. Our results and experiments cast doubts on the security presumed by federated learning system providers, and show that if you want robustness, you probably have to give up some of your data privacy.
Bio: Dimitris Papailiopoulos is an Assistant Professor of ECE and CS (by courtesy) at UW-Madison. His research spans machine learning, information theory, and distributed systems, with a current focus on scalable and fault-tolerant distributed machine learning systems. Dimitris was a postdoctoral researcher at UC Berkeley and a member of the AMPLab. He earned his Ph.D. in ECE from UT Austin in 2014, under the supervision of Alex Dimakis. Dimitris is a recipient of the NSF CAREER Award (2019), a Sony Faculty Innovation Award (2019), the Benjamin Smith Reynolds Award for Excellence in Teaching (2019), and the IEEE Signal Processing Society, Young Author Best Paper Award (2015). In 2018, he co-founded MLSys, a new conference that targets research at the intersection of machine learning and systems.