Claudia Roberts will present her General Exam "Selectively Contextual Bandits" on Thursday, January 23, 2020 at 2pm in CS 402

16 Jan 2020

The members of her committee are Arvind Narayanan (adviser), Matthew Salganik (SOC), and Barbara Engelhardt.

Everyone is invited to attend her talk, and those faculty wishing to remain for the oral exam following are welcome to do so. Her abstract and reading list follow below. 

Personalization is an integral part of most web-service applications and determines which experience to display to each member. A popular algorithmic framework used in industrial personalization systems is the contextual bandit, which seeks to learn a personalized treatment assignment policy in the presence of treatment effects that vary with the observed contextual features of the members. To keep the optimization task tractable, such systems can myopically make independent personalization decisions that conspire to create a suboptimal experience across the member’s aggregate interaction with the web-service. We design a new family of online learning algorithms that benefit from personalization while optimizing the aggregate impact of the many independent decisions. Our approach selectively interpolates between any contextual bandit algorithm and any context-free multi-armed bandit algorithm, and leverages the contextual information for a treatment decision only if this information promises significant gains over a decision that does not take it into account. Apart from helping users of personalization systems feel less targeted, simplifying the treatment assignment policy by making it selectively reliant on the context can help improve the rate of learning. We evaluate our approach on several datasets, including one from a video subscription web-service, and show the benefits of such a hybrid policy.
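As a rough illustration of the selective idea described in the abstract, the sketch below uses contextual information only when it promises a sufficiently large estimated gain over the context-free choice. This is not the algorithm from the talk: the class names, the `min_gain` and `min_pulls` thresholds, the discrete contexts, and the purely greedy running-mean estimators (with no exploration) are all assumptions made for the example.

```python
from collections import defaultdict


class MeanTracker:
    """Running mean reward per key (hypothetical helper)."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = defaultdict(float)

    def update(self, key, reward):
        self.count[key] += 1
        self.mean[key] += (reward - self.mean[key]) / self.count[key]


class SelectiveHybridPolicy:
    """Greedy hybrid of a context-free and a per-context estimator.

    Assumption: contexts are hashable discrete values. A real system
    would plug in proper bandit algorithms that also explore.
    """

    def __init__(self, arms, min_gain=0.1, min_pulls=5):
        self.arms = list(arms)
        self.min_gain = min_gain      # gain required to use the context
        self.min_pulls = min_pulls    # evidence required per (context, arm)
        self.global_stats = MeanTracker()    # keyed by arm
        self.context_stats = MeanTracker()   # keyed by (context, arm)

    def choose(self, context):
        """Return (arm, used_context)."""
        free_arm = max(self.arms, key=lambda a: self.global_stats.mean[a])
        ctx_arm = max(
            self.arms, key=lambda a: self.context_stats.mean[(context, a)]
        )
        # Only trust the contextual estimate once both arms have been
        # observed often enough in this context.
        enough = all(
            self.context_stats.count[(context, a)] >= self.min_pulls
            for a in (free_arm, ctx_arm)
        )
        gain = (
            self.context_stats.mean[(context, ctx_arm)]
            - self.context_stats.mean[(context, free_arm)]
        )
        if enough and gain >= self.min_gain:
            return ctx_arm, True
        return free_arm, False

    def update(self, context, arm, reward):
        self.global_stats.update(arm, reward)
        self.context_stats.update((context, arm), reward)


if __name__ == "__main__":
    policy = SelectiveHybridPolicy(arms=[0, 1])
    for _ in range(20):
        policy.update("a", 0, 1.0)
        policy.update("a", 1, 0.0)
    for _ in range(5):
        policy.update("c", 0, 0.0)
        policy.update("c", 1, 1.0)
    for ctx in ("a", "c", "z"):
        print(ctx, policy.choose(ctx))
```

In this toy setup the policy falls back to the context-free arm whenever the context is unfamiliar or the estimated gain is below `min_gain`, so only contexts with a clear benefit receive a personalized decision.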

Reading List:

Methods/Techniques (Contextual Bandits) 

    * [ https://papers.nips.cc/paper/4321-an-empirical-evaluation-of-thompson-sampli... | An Empirical Evaluation of Thompson Sampling ]
    * [ https://arxiv.org/pdf/1003.0146.pdf | A Contextual-Bandit Approach to Personalized News Article Recommendation ]
    * [ https://web.stanford.edu/~bvr/pubs/TS_Tutorial.pdf | A Tutorial on Thompson Sampling ]
    * [ https://arxiv.org/pdf/1003.5956 | Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms ]
    * [ https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/ftir-onl... | Online Evaluation for Information Retrieval ]

Textbooks 

    * [ http://incompleteideas.net/book/the-book-2nd.html | Reinforcement Learning: An Introduction ] (Ch 2, 16.7)
    * [ http://www.mmds.org/#ver21 | Mining of Massive Datasets ] (Ch 8, 9)
    * [ http://shop.oreilly.com/product/0636920027393.do | Bandit Algorithms for Website Optimization ]

Historical Context 

    * [ http://rr.cs.cmu.edu/aaai.pdf | Foundations and Grand Challenges of Artificial Intelligence ]

Motivating 

    * [ https://link.springer.com/content/pdf/10.1023%2FA%3A1007046204478.pdf | The Perpetuation of Subtle Prejudice: Race and Gender Imagery in 1990s Television Advertising ]
    * [ https://www.researchgate.net/publication/5153026_Getting_too_personal_Reacta... | Getting too personal: Reactance to highly personalized email solicitations ]
    * [ https://science.sciencemag.org/content/sci/347/6221/509.full.pdf | Privacy and human behavior in the age of information ]
    * [ https://www.sciencedirect.com/science/article/pii/S0167923610001983 | The personalization privacy paradox: An exploratory study of decision making process for location-aware marketing ]
    * [ https://www.microsoft.com/en-us/research/uploads/prod/2019/07/mehrotra-2017-... | Auditing Search Engines for Differential Satisfaction Across Demographics ]

Nicki Mahler
