[Ml-stat-talks] Sam Brody, Monday 11/22
blei at CS.Princeton.EDU
Fri Nov 19 09:43:49 EST 2010
sam brody is speaking at 3PM on monday 11/22. this looks to be
interesting to enthusiasts of probabilistic modeling, natural language
processing, and computational linguistics. announcement below.
Learning Meaning from Statistics
Sam Brody, Columbia University
Monday, November 22, 3:00PM
Computer Science Room 402
Statistical Semantics deals with the connection between meaning and
its representation in language. Methods using statistical information
from raw text can address many problems in NLP, and require little or
no annotated training data. In this talk, I will present two such
methods which provide solutions to two high-level problems in NLP:
ambiguity and synonymy.
First, I will describe a state-of-the-art unsupervised word sense
disambiguation (WSD) method which combines distributional and semantic
similarity to automatically create sense-labeled training data. The
resulting corpus is used to train highly accurate machine-learning
classifiers for disambiguation. This method significantly outperforms
previous approaches, while remaining free from manual supervision.
The second part of the talk will discuss a system for detecting two
types of synonymy: similarity in topic, and similarity in sentiment.
The system is designed to extract this information automatically from
a set of online reviews of products or services. For aspect detection,
it uses a modified version of the Latent Dirichlet Allocation (LDA)
topic model, and for sentiment it uses a graph-based algorithm to
automatically assign sentiment polarity scores to all adjectives in
the data, in a topic-specific manner. The resulting topic and sentiment
assignments correlate strongly with human judgment.
The approaches described in the talk are widely applicable, and offer
benefits to many NLP tasks, including information extraction, topic
and sentiment detection, and summarization.