Donghun Lee will present his FPO, "Learning To Learn Optimally: A Practical Framework for Machine Learning Applications With Finite Time Horizon" on Tuesday, 4/23/2019 at 2pm in CS 105.

The members of his committee are as follows: Examiners: Warren Powell (adviser), Ryan Adams, and Peter Ramadge (ELE); Readers: Mengdi Wang (ORFE), Yuxin Chen (ELE), and Elad Hazan.

A copy of his thesis is available upon request. All are welcome to attend. The abstract follows.

Most machine learning algorithms with asymptotic guarantees leave finite-time-horizon issues, such as initialization or tuning, open to the end users, on whom this burden may cause undesirable outcomes in practice, where finite-time-horizon performance matters. As an inspirational case of undesirable finite-time behavior, we identify the finite-time bias in the Q-learning algorithm and present a method to alleviate the bias on the fly. Motivated by the gap between the asymptotic guarantees and the practical burdens of machine learning, we investigate the problem of learning to learn, defined as the problem of learning how to apply a given machine learning algorithm to solve a given task with a finite-time-horizon objective function. To address the problem more generally, we develop the framework of learning to learn optimally (LTLO), which models the problem of optimally applying a machine learning algorithm to a given task within a finite horizon. We demonstrate the use of the LTLO framework as a modeling tool for a real-world problem via an example of learning to learn how to bid in sponsored search auctions. We show the practical benefit of using the LTLO framework as a baseline to construct meta-LQKG+, a knowledge-gradient-based LTLO algorithm designed to approximately solve online hyperparameter optimization with a small number of trials, and we demonstrate the practical sample efficiency of the algorithm. Answering the need for a robust anytime LTLO algorithm, we develop the online regularized knowledge gradient policy, which solves the LTLO problem with high probability and has a sublinear regret bound.
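As background for the finite-time bias mentioned in the abstract, the sketch below illustrates one classic source of such bias in Q-learning: the max operator overestimates values computed from noisy finite-sample estimates, since E[max_a Q̂(s,a)] ≥ max_a E[Q̂(s,a)]. This is a generic, hedged illustration of the phenomenon, not the thesis's own analysis or method; the action values and noise level are made up for the demonstration.

```python
import random

# Three actions with identical true value 0.0; after finitely many
# samples, the agent only has noisy estimates of each action's value.
random.seed(0)
TRUE_VALUES = [0.0, 0.0, 0.0]
NOISE = 1.0      # stand-in for finite-sample estimation noise
TRIALS = 10_000

bias = 0.0
for _ in range(TRIALS):
    # Noisy value estimates, one per action.
    estimates = [v + random.gauss(0.0, NOISE) for v in TRUE_VALUES]
    # Q-learning's target uses max over estimates, which on average
    # exceeds the max of the true values when estimates are noisy.
    bias += max(estimates) - max(TRUE_VALUES)

bias /= TRIALS
print(f"average overestimation from the max operator: {bias:.3f}")
```

Because all true values are zero, any positive average here is pure finite-sample bias; it vanishes only asymptotically as the estimation noise shrinks, which is why finite-horizon behavior can differ markedly from what asymptotic guarantees suggest.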