Jayesh Choudhari choudhari.jayesh at iitgn.ac.in
Mon May 1 18:02:39 EDT 2017

Hey Doga,

I am not completely sure about the covariance matrix question.
But for the next question about word assignments: if you go through the
generative model, for each word position in the document, a topic is first
drawn from the document's topic distribution, and then the word is drawn
from the word distribution of that topic. So each word gets its own
independent topic draw, and that applies to repeated occurrences of the
same word as well. Each instance of the same word is treated like a new
word, and thus the topic assignment can differ between instances.
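
To make the per-word sampling concrete, here is a minimal sketch in Python.
It is only an illustration of the generative step described above, not the
package's code: the function name generate_document, the toy vocabulary, and
the values of theta (topic proportions for one document) and beta (word
distribution of each topic) are all made up, and theta is simply fixed
rather than drawn from a prior.

```python
import random

def generate_document(theta, beta, vocab, n_words, rng):
    """Return a list of (word, topic) pairs for one document.

    theta: topic proportions of the document (hypothetical values).
    beta:  beta[k] is the word distribution of topic k (hypothetical values).
    """
    topic_ids = list(range(len(theta)))
    tokens = []
    for _ in range(n_words):
        # Step 1: draw a topic for this word position.
        z = rng.choices(topic_ids, weights=theta)[0]
        # Step 2: draw the word from that topic's word distribution.
        w = rng.choices(vocab, weights=beta[z])[0]
        tokens.append((w, z))
    return tokens

rng = random.Random(0)           # fixed seed so the run is repeatable
vocab = ["a", "b", "c"]          # toy vocabulary
theta = [0.5, 0.5]               # the document mixes two topics equally
beta = [[0.6, 0.3, 0.1],         # topic 0 favours "a"
        [0.6, 0.1, 0.3]]         # topic 1 also favours "a"

doc = generate_document(theta, beta, vocab, 200, rng)

# Collect every topic that was assigned to some occurrence of the word "a".
topics_for_a = {z for w, z in doc if w == "a"}
print(sorted(topics_for_a))
```

Because the word "a" is likely under both topics and the topic is redrawn at
every position, different occurrences of "a" in the same document end up
with different topics.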


On Tue, May 2, 2017 at 1:10 AM, Gulhan, Doga C <Doga_Gulhan at hms.harvard.edu> wrote:

> Hello,
> Thank you very much for implementing topic models and providing the R
> package. I am trying to use this method for mutational signature analysis
> in whole genome sequences. I read on GitHub that this was the platform for
> asking questions.
> I was wondering why the covariance matrix is (K-1)-dimensional and not a
> square matrix whose size is the number of topics K?
> Another question I have is about how words in each document are associated
> with a topic. When a particular word occurs in a document several times,
> are all of its occurrences associated with a single topic? If a
> word belongs to multiple topics and a document contains more than one of
> these topics, then shouldn't the same word be assigned to different topics
> in the same document? Based on 'wordassignments', this is not what is
> happening. Did I understand this correctly?
> Thank you very much for your help.
> Best regards,
> Doga
