[Topic-models] Topic assignment of words

Gulhan, Doga C Doga_Gulhan at hms.harvard.edu
Tue May 2 09:47:16 EDT 2017

Hi Jayesh,

Thanks a lot for your explanation! Do you know if there is a way to obtain the topic assignment for each word occurrence using the R package? The wordassignments slot doesn't contain this information; I only see one topic assigned to each word in the vocabulary.



From: Jayesh Choudhari <choudhari.jayesh at iitgn.ac.in>
Sent: Monday, May 1, 2017 6:02:39 PM
To: Gulhan, Doga C
Cc: topic-models at lists.cs.princeton.edu
Subject: Re: [Topic-models] Topic assignment of words

Hey Doga,

I am not completely sure about the covariance matrix question.
But for the next question about word assignments -- in the generative model, for each word in a document a topic is first drawn from the document's topic distribution, and then the word is drawn from that topic's word distribution. So every word occurrence gets its own topic draw, and that applies to repeated occurrences of the same word as well. Each instance of the same word is treated like a new token, so the topic assignment can differ across instances.
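To make this concrete, here is a small sketch (in Python, not tied to any particular R package; the function name and numbers are made up for illustration) of the per-occurrence topic posterior implied by the generative model: p(z = k | w, d) is proportional to theta[d][k] * phi[k][w], the product of the document's topic proportion and the topic's word probability. The posterior is the same for every occurrence of a given word in a given document, but a Gibbs sampler drawing from it can still assign different occurrences to different topics.

```python
def token_topic_posterior(theta_d, phi, w):
    """Posterior over topics for one occurrence of word index w
    in a document with topic proportions theta_d.

    theta_d: list of K topic proportions for the document
    phi:     K lists, each a word distribution over the vocabulary
    """
    # p(z = k | w, d) is proportional to theta_d[k] * phi[k][w]
    unnormalized = [theta_d[k] * phi[k][w] for k in range(len(theta_d))]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]

# Toy example: K = 2 topics, a 3-word vocabulary, made-up parameters.
theta_d = [0.7, 0.3]                # document's topic proportions
phi = [[0.5, 0.4, 0.1],             # topic 0's word distribution
       [0.1, 0.2, 0.7]]             # topic 1's word distribution

post = token_topic_posterior(theta_d, phi, w=2)
print(post)  # [0.25, 0.75]: topic 1 dominates despite theta favoring topic 0
```

Note the distinction this highlights: a slot like wordassignments that reports one topic per word type is typically a summary (e.g. the most probable topic), whereas the model itself assigns a topic to every token separately.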


On Tue, May 2, 2017 at 1:10 AM, Gulhan, Doga C <Doga_Gulhan at hms.harvard.edu> wrote:


Thank you very much for implementing topic models and providing the R package. I am trying to use this method for mutational signature analysis in whole-genome sequences. I read on GitHub that this is the platform for asking questions.

I was wondering why the covariance matrix is (K-1)-dimensional rather than a square matrix of size K, where K is the number of topics?

Another question I have is about how the words in each document are associated with topics. When a particular word occurs several times in a document, are all of those occurrences associated with a single topic? If a word belongs to multiple topics and a document contains more than one of those topics, shouldn't the same word be able to receive different topic assignments within the same document? Based on 'wordassignments', this does not seem to be happening. Did I understand it correctly?

Thank you very much for your help.

Best regards,


Topic-models mailing list
Topic-models at lists.cs.princeton.edu

