# [Topic-models] Intuition behind CTM and DTM

David Mimno mimno at cs.umass.edu
Mon Nov 24 09:29:20 EST 2008

```
On Mon, Nov 24, 2008 at 12:14:11AM -0700, Lei Tang wrote:
> 1. In correlated topic models, the topic proportion is sampled from a
> logistic normal distribution instead of Dirichlet as in LDA. I didn't quite
> understand the intuition behind such a modeling. Why is logistic normal
> distribution has such power?

First, covariance. (See John Aitchison's work for a more detailed
discussion.) Let's say I have a corpus with three topics: sports (team,
player, league), politics (weapons, trade, president), and negotiation
(meeting, deadline, agreement). Both sports and politics occur with
negotiation, but sports and politics rarely co-occur.

With a Dirichlet, all I can say is how often I expect each topic to occur
(the values of the parameters in proportion to each other) and how much I
expect any given document to follow those proportions (the sum of the
parameters, where larger = less variance). With a logistic normal, I can
set up a covariance matrix with positive covariance between sports and
negotiation but negative covariance between sports and politics.

Second, there are very well studied models for time-series and
spatio-temporal data in continuous spaces. These usually aren't applicable
to count data like words, but if you can represent the word counts as
derived from a real-valued hidden variable, Kalman filtering and dynamic
linear models become available.

Here are two R functions that might help give some intuition for the
parameterization and the behavior of Dirichlets and logistic normals:

## Dirichlet: normalize independent gamma draws
rdirichlet <- function(alpha = c(1.0, 1.0, 1.0)) {
  n <- length(alpha)
  result <- rep(0, n)
  for (i in 1:n) {
    ## gamma draw with shape alpha[i] and rate 1
    result[i] <- rgamma(1, alpha[i])
  }

  result / sum(result)
}

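## zero-mean logistic normal
rlogisticnorm <- function(covariance = matrix(c(2, 0.5, -0.5, 0.5, 2, 0.5,
                                                -0.5, 0.5, 2), nrow=3)) {
  n <- dim(covariance)[1]
  ## multiply standard normals by the Cholesky factor so that the
  ## real-valued vector has the requested covariance
  result <- exp(t(chol(covariance)) %*% rnorm(n))

  result / sum(result)
}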

-David
```
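The covariance argument in the reply can be checked empirically. The following is a minimal sketch, not part of the original message: the helper names `draw_dirichlet` and `draw_logistic_normal`, the topic labels, and the particular covariance matrix are all assumptions chosen for illustration. Because proportions must sum to one, raw correlations between them are pulled negative in both models, so the informative part is the relative ordering of the pairs rather than the signs.

```r
set.seed(42)
topics <- c("sports", "politics", "negotiation")

## Draw n_docs topic-proportion vectors from a Dirichlet(alpha)
## by normalizing independent gamma draws (one row per document).
draw_dirichlet <- function(n_docs, alpha) {
  g <- matrix(rgamma(n_docs * length(alpha), shape = alpha),
              nrow = n_docs, byrow = TRUE)
  g / rowSums(g)
}

## Draw n_docs vectors from a zero-mean logistic normal.
## chol(sigma) is upper triangular R with t(R) %*% R == sigma,
## so each row of z %*% R has covariance sigma.
draw_logistic_normal <- function(n_docs, sigma) {
  z <- matrix(rnorm(n_docs * nrow(sigma)), nrow = n_docs)
  w <- exp(z %*% chol(sigma))
  w / rowSums(w)
}

## Illustrative (assumed) covariance: sports and politics are negatively
## related, and both are positively related to negotiation.
sigma <- matrix(c( 2.0, -1.0, 0.5,
                  -1.0,  2.0, 0.5,
                   0.5,  0.5, 2.0), nrow = 3)

dir_draws <- draw_dirichlet(10000, c(1, 1, 1))
ln_draws  <- draw_logistic_normal(10000, sigma)
colnames(dir_draws) <- topics
colnames(ln_draws)  <- topics

dir_cor <- cor(dir_draws)
ln_cor  <- cor(ln_draws)
print(round(dir_cor, 2))  # every pair looks alike under a symmetric Dirichlet
print(round(ln_cor, 2))   # sports/politics stands apart from the other pairs
```

With a symmetric Dirichlet the three pairwise correlations are the same by construction, whereas the logistic normal lets the sports/politics pair differ from the pairs involving negotiation.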
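The second point in the reply, that count data can be tied to a real-valued hidden variable so that time-series machinery applies, can be sketched in the generative direction. This is an illustrative assumption rather than the DTM itself: the vocabulary, the step size `step_sd`, and the document length of 100 words are made up, and no Kalman filtering or inference is performed. The hidden vector `beta` follows a Gaussian random walk, and each time step's word counts are drawn from its softmax.

```r
set.seed(7)

## Map a real-valued vector to a probability vector.
softmax <- function(x) {
  w <- exp(x - max(x))  # subtract the max for numerical stability
  w / sum(w)
}

vocab   <- c("team", "player", "league", "meeting")  # hypothetical vocabulary
n_steps <- 10
step_sd <- 0.3                                       # assumed drift per time step

## Real-valued hidden state: one row per time step, Gaussian random walk.
beta <- matrix(0, nrow = n_steps, ncol = length(vocab),
               dimnames = list(NULL, vocab))
beta[1, ] <- rnorm(length(vocab))
for (t in 2:n_steps) {
  beta[t, ] <- beta[t - 1, ] + rnorm(length(vocab), sd = step_sd)
}

## Word distribution at each time step, then observed counts for a
## 100-word document drawn from that distribution.
word_probs <- t(apply(beta, 1, softmax))
counts <- t(sapply(seq_len(n_steps), function(t) {
  rmultinom(1, size = 100, prob = word_probs[t, ])
}))
colnames(counts) <- vocab

print(round(word_probs, 2))
print(counts)
```

The counts themselves are discrete, but the quantity that evolves over time (`beta`) is continuous and Gaussian, which is the kind of state that Kalman filters and dynamic linear models are designed for.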