[talks] H Luo general exam
Melissa M. Lawson
mml at CS.Princeton.EDU
Mon May 6 15:04:56 EDT 2013
Haipeng Luo will present his research seminar/general exam on Monday May 13 at
2PM in Room 402. The members of his committee are: Rob Schapire, Sebastien
Bubeck (ORFE), and Moses Charikar. Everyone is invited to attend his talk, and
faculty wishing to remain for the oral exam that follows are welcome to do so.
His abstract and reading list follow below.
----- Original Message -----
Title : Online Learning with Unknown Time Horizon
Abstract : We consider online learning when the time horizon T is unknown. We apply a minimax analysis, beginning with the fixed-horizon case, and then moving on to two unknown-horizon settings: one that assumes the horizon is chosen randomly according to some known distribution, and another that allows the adversary full control over the horizon. For the random horizon setting with restricted losses, we derive a fully optimal minimax algorithm. For the adversarial horizon setting, we prove a nontrivial lower bound showing that the adversary obtains strictly more power than when the horizon is fixed and known. Based on the minimax solution of the random horizon setting, we then propose a new adaptive algorithm which "pretends" that the horizon is drawn from a distribution from a special family; no matter how the actual horizon is chosen, the worst-case regret is bounded by O(sqrt(T ln N)), where N is the number of actions. Furthermore, the constant in our regret bound approaches 1, which is better than the best previously known bound for this setting (as far as we know). Moreover, our algorithm can be generalized to other games, such as prediction with expert advice.
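For background, the standard way to run a Hedge-style algorithm without knowing the horizon T is to use a time-decreasing learning rate eta_t = sqrt(ln N / t) in place of a rate tuned to a known T. The sketch below illustrates that classical anytime approach (it is not the minimax algorithm of the talk; the loss encoding and rate schedule are illustrative assumptions):

```python
import math

def anytime_hedge(loss_stream, n_actions):
    """Hedge over n_actions with learning rate eta_t = sqrt(ln N / t),
    so no horizon T needs to be known in advance.
    loss_stream yields, per round, a list of losses in [0, 1], one per action.
    Returns the regret of the (expected) algorithm loss against the best
    fixed action in hindsight."""
    total_loss = [0.0] * n_actions  # cumulative loss of each action
    alg_loss = 0.0                  # expected loss of the randomized algorithm
    for t, losses in enumerate(loss_stream, start=1):
        eta = math.sqrt(math.log(n_actions) / t)
        # Weights proportional to exp(-eta * cumulative loss);
        # subtract the minimum for numerical stability.
        m = min(total_loss)
        w = [math.exp(-eta * (L - m)) for L in total_loss]
        z = sum(w)
        p = [wi / z for wi in w]
        alg_loss += sum(pi * li for pi, li in zip(p, losses))
        for i in range(n_actions):
            total_loss[i] += losses[i]
    return alg_loss - min(total_loss)
```

With this schedule the regret stays O(sqrt(T ln N)) for every T simultaneously, though with a worse leading constant than a rate tuned to a known horizon, which is the gap the talk's adaptive algorithm targets.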
Reading List :
1. Mehryar Mohri, Afshin Rostamizadeh and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2012.
2. Nicolo Cesa-Bianchi and Gabor Lugosi. Prediction, Learning, and Games (Chapters 2, 4, 6, and 7). Cambridge University Press, 2006.
3. Jacob Abernethy and Manfred K. Warmuth. Repeated games against budgeted adversaries. In Advances in Neural Information Processing Systems 24, 2010.
4. Jacob Abernethy, Manfred K. Warmuth, and Joel Yellin. Optimal strategies from random walks. In Proceedings of the 21st Annual Conference on Learning Theory, 2008.
5. Jacob Abernethy, Alekh Agarwal, Peter L. Bartlett, and Alexander Rakhlin. A Stochastic View of Optimal Regret through Minimax Duality. arXiv:0903.5328, 2009.
6. Peter Auer, Nicolo Cesa-Bianchi, and Claudio Gentile. Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences, 64(1):48–75, 2002.
7. Nicolo Cesa-Bianchi, Yoav Freund, David P. Helmbold, David Haussler, Robert E. Schapire, and Manfred K. Warmuth. How to use expert advice. Journal of the ACM, 44(3):427–485, May 1997.
8. Kamalika Chaudhuri, Yoav Freund, and Daniel Hsu. A parameter-free hedging algorithm. arXiv:0903.2851, 2009.
9. Yoav Freund and Robert E. Schapire. Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29:79–103, 1999.
10. David Haussler, Jyrki Kivinen, and Manfred K. Warmuth. Sequential prediction of individual sequences under general loss functions. IEEE Transactions on Neural Networks, 10(6):1291–1304, 1999.
11. Nick Littlestone and Manfred Warmuth. The weighted majority algorithm. In 30th Annual Symposium on Foundations of Computer Science, pages 256–261, 1989.
12. Alexander Rakhlin, Ohad Shamir, and Karthik Sridharan. Relax and Randomize: From Value to Algorithms. arXiv:1204.0870, 2012.
13. V. G. Vovk. A game of prediction with expert advice. Journal of Computer and System Sciences, 56(2):153–173, April 1998.