# [Topic-models] Intuition behind CTM and DTM

Lei Tang L.Tang at asu.edu
Sun Nov 30 22:43:12 EST 2008

```In the correlated topic model paper, the expectation of the log normalizer in
eq. (6) is upper bounded by eq. (7).

Maybe it's a silly question, but how is eq. (7) derived with a Taylor expansion?

-Lei

On Mon, Nov 24, 2008 at 7:29 AM, David Mimno <mimno at cs.umass.edu> wrote:

> On Mon, Nov 24, 2008 at 12:14:11AM -0700, Lei Tang wrote:
> > 1. In correlated topic models, the topic proportions are sampled from a
> > logistic normal distribution instead of a Dirichlet as in LDA. I didn't
> > quite understand the intuition behind such a modeling choice. Why does the
> > logistic normal distribution have such power?
>
> There are two primary advantages:
>
> First, covariance. (See John Aitchison's work for a more detailed
> discussion.) Let's say I have a corpus with three topics: sports (team,
> player, league), politics (weapons, trade, president), and negotiation
> (meeting, deadline, agreement). Both sports and politics occur with
> negotiation, but sports and politics rarely cooccur.
>
> With a Dirichlet, all I can say is how often I expect each topic to occur
> (the values of the parameters in proportion to each other) and how much I
> expect any given document to follow those proportions (the sum of the
> parameters, where larger = less variance). With a logistic normal, I can
> set up a covariance matrix with positive covariance between sports and
> negotiation but negative covariance between sports and politics.
>
> Second, there are very well studied models for time-series and
> spatio-temporal data in continuous spaces. These usually aren't applicable
> to count data like words, but if you can represent the word counts as
> derived from a real-valued hidden variable, Kalman filtering and dynamic
> linear models become available.
>
> Here are two R functions that might help give some intuition for the
> parameterization and the behavior of Dirichlets and logistic normals:
>
> ## Dirichlet
> rdirichlet <- function(alpha = c(1.0, 1.0, 1.0)) {
>        n <- length(alpha)
>        result <- rep(0, n)
>        for (i in 1:n) {
>                result[i] <- rgamma(1, alpha[i])
>        }
>
>        result / sum(result)
> }
>
> ## zero-mean logistic normal
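> rlogisticnorm <- function(covariance = matrix(c(2, 0.5, -0.5,
>                                                  0.5, 2, 0.5,
>                                                  -0.5, 0.5, 2), nrow=3)) {
>        n <- dim(covariance)[1]
>        ## t(chol(covariance)) %*% z has covariance equal to `covariance`;
>        ## multiplying z by `covariance` itself would give its square instead
>        result <- exp(t(chol(covariance)) %*% rnorm(n))
>
>        result / sum(result)
> }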
>
> -David

--
Lei Tang
Dept.of Computer Science and Engineering,
Arizona State University
http://www.public.asu.edu/~ltang9/
```
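
On the question at the top of the message, here is a sketch of how a bound of that kind can be obtained from a first-order Taylor expansion. It assumes that eq. (7) has the form given in the correlated topic model paper, with an extra variational parameter $\zeta > 0$ and a Gaussian variational distribution $q(\eta_i) = \mathcal{N}(\lambda_i, \nu_i^2)$. The symbols $\zeta$, $\lambda_i$, $\nu_i^2$ and $K$ do not appear in this thread and are assumptions taken from that setting, so check them against the paper.

```latex
% Step 1: log is concave, so its first-order Taylor expansion around any
% point \zeta > 0 lies above the function:
\log x \;\le\; \log \zeta + \frac{x - \zeta}{\zeta}
       \;=\; \zeta^{-1} x - 1 + \log \zeta .

% Step 2: substitute x = \sum_{i=1}^{K} \exp(\eta_i) and take the
% expectation under q on both sides (the bound holds pointwise):
\mathrm{E}_q\!\left[ \log \sum_{i=1}^{K} \exp(\eta_i) \right]
  \;\le\; \zeta^{-1} \sum_{i=1}^{K} \mathrm{E}_q\!\left[ \exp(\eta_i) \right]
          - 1 + \log \zeta .

% Step 3: for \eta_i \sim \mathcal{N}(\lambda_i, \nu_i^2) the expectation is
% the mean of a log-normal variable:
\mathrm{E}_q\!\left[ \exp(\eta_i) \right]
  \;=\; \exp\!\left( \lambda_i + \nu_i^2 / 2 \right) .
```

The right-hand side is smallest when $\zeta = \sum_{i} \exp(\lambda_i + \nu_i^2/2)$, which follows from setting its derivative with respect to $\zeta$ to zero.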