Please note the time change.


Alexander Beatson will present his general exam on Wednesday, May 10, 2017 at 4pm in CS 301.  The members of his committee are: Han Liu (ORFE, adviser), Elad Hazan, and Arvind Narayanan.

Everyone is invited to attend his talk, and those faculty wishing to remain for the oral exam that follows are welcome to do so. His abstract and reading list follow below.

Title: Deep learning in some non-iid settings


In real-world settings, observed data are often non-iid. The training and deployment distributions may differ, which we can model in the worst case as perturbation by an adversary [1]. Alternatively, the training data may be drawn from different distributions at different times or in different locations, and we may wish to learn from the global distribution without storing or communicating the data [12]. I will discuss some work on attacks on machine learners and on non-iid distributed deep learning with communication constraints.


Attacks on machine learners: One threat model for machine learning involves an adversary perturbing the training set. We show that even a “blind” adversary with no knowledge of the training data can harm learning, meaningfully reducing the learner’s effective sample size, via simple data injection strategies. Another threat model involves an adversary perturbing the examples observed in deployment. Linear and deep models are particularly vulnerable to such adversarial examples [2-4]. We show that deep generative models can be used to correctly classify adversarial examples, or to detect them as lying far from the manifold of real data.
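
As a concrete illustration of the deployment-time threat model, here is a minimal sketch of the fast gradient sign method of [2], applied to a logistic-regression classifier; the weights, input, and eps below are arbitrary placeholders, not anything from the talk.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def fgsm_perturb(x, y, w, b, eps):
        # Fast gradient sign method [2]: take a max-norm step of size
        # eps along the sign of the loss gradient with respect to x.
        p = sigmoid(w @ x + b)   # predicted P(y = 1 | x)
        grad_x = (p - y) * w     # gradient of the cross-entropy w.r.t. x
        return x + eps * np.sign(grad_x)

    # Toy usage: the classifier's confidence in the true label drops
    # after a small, structured perturbation of the input.
    rng = np.random.default_rng(0)
    w, b = rng.normal(size=5), 0.0
    x, y = rng.normal(size=5), 1.0
    x_adv = fgsm_perturb(x, y, w, b, eps=0.5)
    print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))

For a linear model this perturbation shifts the logit by eps times the l1-norm of the weights, which is one reason high-dimensional linear (and locally linear deep) models are especially vulnerable, as argued in [2].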


Non-iid distributed deep learning: “Federated learning” [12] features communication constraints and data distributed non-iid across nodes. We propose an asynchronous algorithm for this setting that uses the Hessian of the likelihood at each node to weight model averaging and to compute proximal constraints. The algorithm may be derived as an iterated natural proximal method or as an application of the Laplace approximation [9]. Experiments show that it converges faster than other asynchronous algorithms when data are non-iid across nodes. We can approximate the algorithm by sending only a small fraction of the variable updates from each node, sampled according to the diagonal of that node’s Hessian. This maintains the improved convergence in the non-iid setting and, in both the iid and non-iid settings, reduces the communication volume needed to reach a given accuracy.
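
To make the weighting concrete, here is a minimal sketch, under the simplifying assumption of diagonal Hessians and with illustrative names throughout, of the two ingredients described above: precision-weighted model averaging in the spirit of the Laplace approximation [9], and choosing which coordinates of an update to transmit according to the node's Hessian diagonal. This is a toy rendition, not the proposed algorithm itself.

    import numpy as np

    def hessian_weighted_average(thetas, hess_diags, damping=1e-6):
        # Weight each coordinate of each node's parameters by that
        # node's (assumed positive) diagonal Hessian of the negative
        # log-likelihood, i.e. its Laplace-approximation precision.
        num = np.zeros_like(thetas[0])
        den = np.full_like(thetas[0], damping)  # avoid division by zero
        for theta_k, h_k in zip(thetas, hess_diags):
            num += h_k * theta_k
            den += h_k
        return num / den

    def sample_update_coords(delta, h_diag, frac, rng):
        # Transmit only a fraction of the update's coordinates, sampled
        # without replacement in proportion to the node's Hessian
        # diagonal (a proxy for coordinate importance).
        k = max(1, int(frac * delta.size))
        idx = rng.choice(delta.size, size=k, replace=False,
                         p=h_diag / h_diag.sum())
        sparse = np.zeros_like(delta)
        sparse[idx] = delta[idx]
        return sparse

    # Toy usage: each node is confident about a different coordinate,
    # so the average defers to whichever node has higher curvature.
    thetas = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    hs     = [np.array([10.0, 1.0]), np.array([1.0, 10.0])]
    print(hessian_weighted_average(thetas, hs))  # roughly [0.91, 0.91]

The intuition is that a node whose local data say little about some parameter contributes little curvature there, so the average defers to nodes that have actually constrained that parameter; this is the behavior one wants when node distributions differ.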



Reading List:


Attacks on machine learners:

Overview of the field:

[1] Concrete problems in AI safety. Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané.


Adversarial examples in deep learning:

[2] Explaining and harnessing adversarial examples. Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy.

[3] Practical black-box attacks against deep learning systems using adversarial examples. Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z. Berkay Celik, Ananthram Swami.

[4] Towards Evaluating the Robustness of Neural Networks. Nicholas Carlini, David Wagner.


Previously proposed defenses:

[5] Biologically inspired protection of deep networks from adversarial attacks. Aran Nayebi, Surya Ganguli.

[6] Parseval Networks: Improving Robustness to Adversarial Examples. Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, Nicolas Usunier.


Deep generative models:

[7] Generative Adversarial Networks. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio.

[8] Auto-Encoding Variational Bayes. Diederik P Kingma, Max Welling.


Non-iid and distributed deep learning:

The Laplace approximation:

[9] A practical Bayesian framework for backpropagation networks. David J. C. MacKay.


Using the Laplace approximation to estimate weight importance in deep learning:

[10] Optimal Brain Damage. Yann LeCun, John S. Denker, Sara A. Solla.

[11] Overcoming catastrophic forgetting in neural networks. James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell.


Federated learning:

[12] Communication-Efficient Learning of Deep Networks from Decentralized Data. H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas.

[13] Federated Learning: Strategies for Improving Communication Efficiency. Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon.


Textbook:

[14] Deep Learning. Ian Goodfellow, Yoshua Bengio, Aaron Courville.

