Hamed Hassani (UPenn)
Date: Friday, September 11, 2020
Time: 11:00 AM – 12:00 PM (CDT; UTC -5)
Location: Online (Zoom link will be provided)
Seminar time shown in CDT (UTC -5)
Join us for a special virtual installment of the ML Seminar Series:
In this talk, we aim to quantify the robustness of distributed training against worst-case failures and adversarial nodes. We show that there is a gap between robustness guarantees, depending on whether adversarial nodes have full control of the hardware, the training data, or both. Using ideas from robust statistics and coding theory we establish robust and scalable training methods for centralized, parameter server systems.
Few-shot classification, the task of adapting a classifier to unseen classes given a small labeled dataset, is an important step on the path toward human-like machine learning. I will present what I think are some of the key advances and open questions in this area. I will then focus on the fundamental issue of overfitting in the few-shot scenario. Bayesian methods are well-suited to tackling this issue because they allow practitioners to specify prior beliefs and update those beliefs in light of observed data.
Large-scale machine learning training, in particular, distributed stochastic gradient descent (SGD), needs to be robust to inherent system variability such as unpredictable computation and communication delays. This work considers a distributed SGD framework where each worker node is allowed to perform local model updates and the resulting models are averaged periodically. Our goal is to analyze and improve the true speed of error convergence with respect to wall-clock time (instead of the number of iterations).