[Topic-models] Why does alpha serve as a scalar in Blei's code?
elkan at cs.ucsd.edu
Sat Jan 15 15:57:58 EST 2011
In LDA, it is useful to allow both the alpha and beta priors to be non-uniform,
and to learn them. See "Accounting for burstiness in topic models"
by Doyle and Elkan, ICML 2009.
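For concreteness, here is a minimal sketch of one standard way to learn
a non-uniform alpha: Minka's fixed-point iteration on the document-topic
count matrix. This is a generic illustration, not the code from the
paper; the function name and convergence settings are my own choices.

    import numpy as np
    from scipy.special import digamma

    def fit_alpha(counts, alpha, n_iter=1000, tol=1e-6):
        # counts: (D, K) matrix of topic counts n_dk per document.
        # alpha:  (K,) initial Dirichlet hyperparameters, all > 0.
        counts = np.asarray(counts, dtype=float)
        alpha = np.asarray(alpha, dtype=float)
        D = counts.shape[0]
        doc_totals = counts.sum(axis=1)  # n_d, tokens per document
        for _ in range(n_iter):
            s = alpha.sum()
            # Per-topic numerator and shared denominator of the update:
            num = digamma(counts + alpha).sum(axis=0) - D * digamma(alpha)
            den = digamma(doc_totals + s).sum() - D * digamma(s)
            new_alpha = alpha * num / den
            if np.max(np.abs(new_alpha - alpha)) < tol:
                return new_alpha
            alpha = new_alpha
        return alpha

The same kind of update, applied to the topic-word counts, gives a
non-uniform beta.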
From Section 2:
"Learning the hyperparameters can provide information
about the corpus: alpha indicates how semantically diverse
documents are, with lower alpha indicating increased di-
versity, while beta indicates how similar the topics are,
with higher beta indicating more similarity between top-
ics. Learning non-uniform values for the hyperparam eters
allows di fferent words and topics to have di fferent
tendencies; some topics can be more general than oth-
ers (e.g., function words versus medical jargon), and
some words can be likely to appear in more topics than
others (e.g., words with multiple senses)."
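As a quick toy illustration of the alpha interpretation (my own sketch,
using numpy's Dirichlet sampler, not anything from the paper):

    import numpy as np

    rng = np.random.default_rng(0)
    for a in (0.01, 0.1, 1.0):
        theta = rng.dirichlet(np.full(10, a), size=5)  # 5 docs, 10 topics
        # Smaller alpha -> each document piles its mass on a few topics,
        # so documents differ more from one another across the corpus.
        print(a, np.round(theta.max(axis=1), 2))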
Abstract: Many different topic models have been used successfully for a
variety of applications. However, even state-of-the-art topic models
suffer from the important flaw that they do not capture the tendency of
words to appear in bursts; it is a fundamental property of language that
if a word is used once in a document, it is more likely to be used
again. We introduce a topic model that uses Dirichlet compound
multinomial (DCM) distributions to model this burstiness phenomenon. On
both text and non-text datasets, the new model achieves better held-out
likelihood than standard latent Dirichlet allocation (LDA). It is
straightforward to incorporate the DCM extension into topic models that
are more complex than LDA.
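For anyone who wants to experiment, the DCM (Polya) distribution is easy
to evaluate. This sketch uses the standard closed form (with the
multinomial coefficient omitted), not any code from the paper:

    import numpy as np
    from scipy.special import gammaln

    def dcm_log_prob(x, beta):
        # Log-probability of word-count vector x under DCM(beta):
        # lgamma(s) - lgamma(n + s)
        #   + sum_w [lgamma(x_w + beta_w) - lgamma(beta_w)]
        x = np.asarray(x, dtype=float)
        beta = np.asarray(beta, dtype=float)
        n, s = x.sum(), beta.sum()
        return (gammaln(s) - gammaln(n + s)
                + np.sum(gammaln(x + beta) - gammaln(beta)))

Burstiness falls out of the predictive rule: under the DCM, the
probability that the next token is word w is (x_w + beta_w) / (n +
sum(beta)), so a word becomes much more likely after its first
occurrence. With a vocabulary of 1000 and beta_w = 0.1 everywhere, word
w has probability 0.001 before it appears and about 0.011 after a
single occurrence.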