<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class=""><div lang="EN-US" link="blue" vlink="purple" class=""><div class="WordSection1"><p class="MsoNormal"><span style="color:#1F497D" class=""> </span></p><p class="MsoNormal" align="center" style="text-align:center"><!--[if gte vml 1]><v:shapetype id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
<v:stroke joinstyle="miter" />
<v:formulas>
<v:f eqn="if lineDrawn pixelLineWidth 0" />
<v:f eqn="sum @0 1 0" />
<v:f eqn="sum 0 0 @1" />
<v:f eqn="prod @2 1 2" />
<v:f eqn="prod @3 21600 pixelWidth" />
<v:f eqn="prod @3 21600 pixelHeight" />
<v:f eqn="sum @0 0 1" />
<v:f eqn="prod @6 1 2" />
<v:f eqn="prod @7 21600 pixelWidth" />
<v:f eqn="sum @8 21600 0" />
<v:f eqn="prod @7 21600 pixelHeight" />
<v:f eqn="sum @10 21600 0" />
</v:formulas>
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect" />
<o:lock v:ext="edit" aspectratio="t" />
</v:shapetype><v:shape id="Picture_x0020_1" o:spid="_x0000_s1026" type="#_x0000_t75" alt="shieldor" style='position:absolute;left:0;text-align:left;margin-left:44.25pt;margin-top:2.45pt;width:80.75pt;height:74.15pt;z-index:-251658752;visibility:visible;mso-wrap-style:square;mso-width-percent:0;mso-height-percent:0;mso-wrap-distance-left:9pt;mso-wrap-distance-top:0;mso-wrap-distance-right:9pt;mso-wrap-distance-bottom:0;mso-position-horizontal:absolute;mso-position-horizontal-relative:text;mso-position-vertical:absolute;mso-position-vertical-relative:text;mso-width-percent:0;mso-height-percent:0;mso-width-relative:page;mso-height-relative:page'>
<v:imagedata src="cid:image001.png@01D08324.90A4B8C0" o:title="shieldor" />
<w:wrap type="tight"/>
</v:shape><![endif]--><!--[if !vml]--><img width="108" height="99" align="left" hspace="12" alt="shieldor" v:shapes="Picture_x0020_1" class="" apple-inline="yes" id="72A676F5-8C38-4ED1-9458-53F5F6CD2FF1" apple-width="yes" apple-height="yes" src="cid:image003.jpg@01D0EBAB.945F6880"><!--[endif]--><b class=""><span style="font-size:20.0pt;font-family:NewCenturySchlbk" class="">DEPARTMENT
OF <o:p class=""></o:p></span></b></p><p class="MsoNormal" align="center" style="text-align:center"><b class=""><span style="font-size:20.0pt;font-family:NewCenturySchlbk" class="">ELECTRICAL ENGINEERING SEMINAR SERIES
</span></b><span style="font-size:10.0pt;font-family:"Times New Roman","serif"" class=""><o:p class=""></o:p></span></p><p class="MsoNormal" style="margin-left:.5in;text-autospace:none"><b class=""><span style="font-size:12.0pt;font-family:NewCenturySchlbk" class=""> </span></b></p><p class="MsoPlainText" style="text-align:justify"><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> </span></p><p class="MsoPlainText" style="text-align:justify"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D" class=""> </span></p><p class="MsoPlainText" style="text-align:justify;text-indent:.5in"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Speaker: Anna Choromanska,<o:p class=""></o:p></span></b></p><p class="MsoPlainText" style="margin-left:.5in;text-align:justify;text-indent:.5in">
<span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> Courant Institute of Mathematical Sciences, New York University<o:p class=""></o:p></span></p><p class="MsoPlainText" style="text-align:justify;text-indent:.5in"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Title: Optimization for large-scale machine learning: large data and large model<o:p class=""></o:p></span></b></p><p class="MsoPlainText" style="text-align:justify;text-indent:.5in"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Date:</span></b><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> Thursday, September 17, 2015<o:p class=""></o:p></span></p><p class="MsoPlainText" style="text-align:justify;text-indent:.5in"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Time:</span></b><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> 4:30pm<o:p class=""></o:p></span></p><p class="MsoPlainText" style="text-align:justify;text-indent:.5in"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Room:</span></b><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> E-Quad, B205<o:p class=""></o:p></span></p><p class="MsoPlainText" style="text-align:justify;text-indent:.5in"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Host:</span></b><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> <b class="">Emmanuel Abbe<span style="color:#1F497D" class=""><o:p class=""></o:p></span></b></span></p><p class="MsoPlainText" style="text-align:justify"><b class=""><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D" class=""> </span></b></p><p class="MsoPlainText" style="text-align:justify"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Abstract:</span></b><span 
style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> The talk will focus on selected challenges in modern large-scale
machine learning in two settings: (i) the large-data setting and (ii) the large-model (deep learning) setting. The first part of the talk will address the case where the learning algorithm must scale to large data. The multi-class classification problem will
be addressed, where the number of classes (k) is extremely large, with the goal of obtaining training and test time complexity logarithmic in the number of classes. A reduction of this problem to a set of binary classification problems organized in a tree structure
will be discussed. A top-down online approach for constructing logarithmic-depth trees, based on a new objective function, will be demonstrated. Under favorable conditions, the new approach leads to logarithmic-depth trees that have
leaves with low label entropy. The approach comes with theoretical guarantees following from convex analysis, even though the underlying problem is inherently non-convex. The second part of the talk focuses on the theoretical analysis of a more challenging
non-convex learning setting, deep learning with multilayer networks. Despite the success of convex methods, deep learning methods, where the objective is inherently highly non-convex, have enjoyed a resurgence of interest in the last few years and they achieve
state-of-the-art performance. This part of the talk moves to the world of non-convex optimization, where recent findings suggest that we might eventually be able to describe these approaches theoretically. The connection between the highly non-convex
loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model will be established. It will be shown that, under certain assumptions, (i) for large-size networks, most local minima are
equivalent and yield similar performance on a test set, (ii) the probability of finding a “bad” local minimum, i.e., one with a high loss value, is non-zero for small-size networks and decreases quickly with network size, and (iii) struggling to find the global minimum
on the training set (as opposed to one of the many good local ones) is not useful in practice and may lead to overfitting. Discussion of open problems concludes the talk.<o:p class=""></o:p></span></p><p class="MsoPlainText" style="text-align:justify"><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> </span></p><p class="MsoPlainText" style="text-align:justify"><b class=""><span style="font-size:12.0pt;font-family:"Georgia","serif"" class="">Bio:</span></b><span style="font-size:12.0pt;font-family:"Georgia","serif"" class=""> Anna Choromanska is a Post-Doctoral Associate in the Computer Science
Department at the Courant Institute of Mathematical Sciences, New York University. She works in the Computational and Biological Learning Lab, which is part of the Computational Intelligence, Learning, Vision, and Robotics Lab of Prof. Yann LeCun. She received
her PhD from the Department of Electrical Engineering at Columbia University, where she held The Fu Foundation School of Engineering and Applied Science Presidential Fellowship and was advised by Prof. Tony Jebara. She completed her MSc with distinction
in the Department of Electronics and Information Technology, Warsaw University of Technology, with a double specialization: Electronics and Computer Engineering, and Electronics and Informatics in Medicine. She has worked with various industrial institutions,
including AT&amp;T Research Laboratories, IBM T.J. Watson Research Center, and Microsoft Research New York. Her research interests are in machine learning, optimization, and statistics, with applications in biomedicine and neurobiology. She also holds a music degree
from Mieczyslaw Karlowicz Music School in Warsaw, Department of Piano Play. She is an avid salsa dancer performing with the Ache Performance Group.<o:p class=""></o:p></span></p>
</div>
</div>
</div></blockquote></div><br class=""><style class=""><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Georgia;
        panose-1:2 4 5 2 5 4 5 2 3 3;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
        {font-family:NewCenturySchlbk;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {mso-style-priority:99;
        mso-style-link:"Plain Text Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:10.5pt;
        font-family:Consolas;}
span.PlainTextChar
        {mso-style-name:"Plain Text Char";
        mso-style-priority:99;
        mso-style-link:"Plain Text";
        font-family:Consolas;}
span.EmailStyle19
        {mso-style-type:personal;
        font-family:"Arial","sans-serif";
        color:windowtext;}
span.EmailStyle20
        {mso-style-type:personal;
        font-family:"Arial","sans-serif";
        color:windowtext;}
span.EmailStyle21
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle22
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle23
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle24
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle25
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle26
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style></body></html>