
Arman Suleimenov will present his MSE talk on Thursday, May 3, at 10 AM in Room 402. The members of his committee are Andrea LaPaugh (advisor) and Adam Finkelstein (reader). Everyone is invited to attend his talk. His abstract follows below.

----------------

Title
Twitter News: harnessing Twitter to build an article recommendation system

Abstract
With more than 140 million active users and 340 million tweets a day (as of March 2012), Twitter is a great source of recommendation knowledge for the articles shared on the platform. Collaborative filtering methods (based on matrix factorization or neighborhood-based algorithms) suffer from an extremely sparse user-article matrix as well as the cold-start problem. Content-based filtering requires fetching the text or title of an article from its URL, which becomes a challenge in itself because we do not limit ourselves to a small subset of well-formatted news sites.

In this work, we analyze 836 Twitter users from the technology and entrepreneurship space, together with the 78,508 links they shared. We explore and evaluate the following techniques, both old and novel, for an article recommendation engine: bag-of-words Naive Bayes; vector-to-vector similarity, where the user vector is constructed from the text of the tweets the user produced; a topic-modeling approach, where we learn the topic distribution of each article and use it to reduce the dimensionality of the user-article matrix; a model that, apart from relevance and novelty, takes into account connection clarity and transition smoothness between articles; content-boosted collaborative filtering (with probabilistic matrix factorization), where pseudo user ratings are created; and a hybrid model of some of the best-performing techniques above.