[Topic-models] TMVE for Online LDA
svmehta at stanford.edu
Mon Jul 9 14:39:31 EDT 2012
I was looking at the online LDA implementation and had a couple of questions
about the interpretation of some of the per-document parameters.
The scaled_score appears to be the total gamma weight for a particular
topic in the document divided by the number of words in the document. Is
the purpose of the scaled score simply to be able to distinguish the
relative importance of a topic across documents?
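
To make that reading concrete, here is a minimal sketch of the computation as I understand it. The function name `scaled_scores` and the gamma values are made up for illustration; none of this is taken from the TMVE or online LDA code.

```python
def scaled_scores(gamma, n_words):
    """Divide each topic's gamma weight by the document length.

    gamma   -- hypothetical per-topic gamma weights for one document
    n_words -- number of words in that document
    """
    if n_words <= 0:
        raise ValueError("document must contain at least one word")
    return [g / n_words for g in gamma]


# Hypothetical gamma vectors for a short and a long document.
short_doc = scaled_scores([4.1, 1.1, 0.8], n_words=5)
long_doc = scaled_scores([40.1, 10.1, 0.8], n_words=50)
```

Under this reading, dividing by document length puts the first topic's weight in the short and the long document on a similar scale, even though the raw gamma values differ by a factor of ten.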
Also, I understand that the posterior over the per-topic weights theta is
parameterized by gamma. If I want to recover the multinomial theta that
produced the topic assignments in a given document, should I simply
normalize the scores of a document so that they sum to 1? Is the result the
actual multinomial theta, or is it just the realized proportions of
the assignments drawn from that distribution?
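
For reference, a minimal sketch of what the normalization would compute, assuming gamma is the parameter of a Dirichlet posterior over theta. In that case gamma_k / sum(gamma) is the posterior mean of theta_k, which is a point estimate and not necessarily the theta that generated the document. The function names and the gamma values are hypothetical, and the sketch uses only the Python standard library.

```python
import random


def posterior_mean_theta(gamma):
    """Mean of Dirichlet(gamma): each gamma_k divided by the total."""
    total = sum(gamma)
    return [g / total for g in gamma]


def sample_theta(gamma, rng):
    """One draw from Dirichlet(gamma), built from normalized Gamma variates."""
    draws = [rng.gammavariate(g, 1.0) for g in gamma]
    total = sum(draws)
    return [d / total for d in draws]


gamma = [40.1, 10.1, 0.8]   # hypothetical per-document gamma
mean_theta = posterior_mean_theta(gamma)

rng = random.Random(0)       # fixed seed so the draw is repeatable
one_draw = sample_theta(gamma, rng)
```

Both vectors sum to 1, but `one_draw` changes with the seed while `mean_theta` does not, which is the distinction between a single plausible theta and the posterior's average.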
On Sun, Jul 1, 2012 at 5:58 PM, Allison Chaney <ajb.chaney at gmail.com> wrote:
> Hi all,
> In a similar project to Colorado Reed's TMA, we have incorporated TMVE
> with online LDA. We've used Django as well; the biggest advantage to
> this system is that you can browse the corpus and topics as the model is
> being run.
> You can find the code for the project here:
> Please let me know if you see any bugs/problems, or if you have any
> you can download Matt Hoffman's Python code for online LDA here: