[Ml-stat-talks] Fwd: Pl. refer Ph.D students for summer internship positions at Yahoo! Labs in Bangalore

Robert Schapire schapire at CS.Princeton.EDU
Sun Oct 24 13:10:37 EDT 2010


-------- Original Message --------
Subject: 	Pl. refer Ph.D students for summer internship positions at 
Yahoo! Labs in Bangalore
Date: 	Sun, 24 Oct 2010 21:49:23 +0530
From: 	Muthusamy Chelliah <mchellia at yahoo-inc.com>


Yahoo! Labs in Bangalore (http://bangalore.yahoo.com/labs) is looking to 
hire interns for the summer of 2011. Summer interns will get to do 
original research in the areas of text and multimedia data mining, 
machine learning, information retrieval and extraction, online ad 
matching, microeconomics, and social network analysis.

The research we do at Yahoo! Labs has a number of distinct 
characteristics. First, the solutions that we devise need to work at 
true Web scale -- millions of users, billions of pages, and petabytes of 
data! And what's more, we get to test our algorithms on live Web 
traffic, and continuously refine them based on real-time feedback.

Some of the prominent research projects we're working on in the 
Bangalore Labs include:

    * *Web information extraction and integration:* The Web is a vast
      repository of human knowledge. Consequently, extracting structured
      data (e.g., product names and prices, restaurant names, addresses,
      ratings and reviews) from Web pages can lead to better ranking and
      presentation of search results. We are developing wrapper
      induction and machine learning techniques that leverage page
      structure, content features, and content redundancy for structured
      information extraction. We are also addressing the hard research
      problems of de-duplicating extracted records, and integrating
      records for the same entity from multiple sources.
    * *Multimedia search:* Increasingly, users of the Web are drawn to
      interesting multimedia content in the form of images and videos.
      Flickr alone has a sizable fraction of all the images on the Web. 
      Our research aims to answer questions like: How do we retrieve the
      most relevant multimedia content taking into account both content
      features and metadata (e.g., tags)? How do we find similar or
      related content -- across music, image, and video databases --
      when we have billions of objects?
    * *Computational advertising:* The central challenge here is to find
      the best ad to present to a user engaged in a given context, such
      as querying a search engine ("sponsored search") or reading a Web
      page ("contextual advertising"). Selecting an ad which improves
      the user's Web-experience while maximizing revenue is a
      non-trivial challenge. It involves quantifying relevance between
      an ad and its context, predicting click-through rates, and
      optimizing online auction mechanisms, among other problems.
      Machine learning models such as gradient-boosted decision trees,
      statistical techniques like logistic regression, and controlled
      second-price auctions are examples of techniques that are optimized
      simultaneously. This being a nascent area, there is tremendous
      scope for new algorithms that perform and scale better.
    * *Social network and data analysis:* Social networking sites like
      Facebook and LinkedIn are gaining in popularity in today's Web.
      Our research focuses on analyzing the social graphs underlying
      mail, IM and social networking sites to find influencers, filter
      out spam, and recommend ads, topics and people to users based on
      their connections. Another interesting trend on the Web is the
      rapid growth of user-generated content in the form of blogs,
      article comments, ratings and reviews, photos, etc. Ranking,
      summarizing, detecting abuse, and analyzing the sentiment of the
      vast amounts of user-generated content pose a non-trivial research
      challenge.

If you know of PhD students who are interested in working on the above 
exciting research problems on the Web, and have strong backgrounds in 
text/data mining, statistics and machine learning, then please forward 
this mail message to them.

http://bangalore.yahoo.com/labs/publications.html has a list of papers 
that describe our recent research.

At the moment, we are hiring to fill full-time research scientist as 
well as summer (2011) internship positions in Bangalore. Interested 
candidates can send their resumes to the following email address: 
yahoo-labs-blr at yahoo-inc.com.

Best regards


Head, Academic Relations,

Yahoo! Labs, India

