[Topic-models] LDA gamma update

Holt, John (RIS-DAY) John.D.Holt at lexisnexis.com
Fri Mar 4 08:25:05 EST 2016


I have noticed the same thing.

If you move the digamma_gam update outside the word loop, so that every phi is calculated with the same digamma_gam, you will get very similar results.

Running 25 topics on the included AP data, and then using the topic word print program for 5 words, you get the same words for 18 of the 25 topics.  There are 4 topics that differ by 1 word, and 3 topics that differ by 2 words.

I have not examined these differences closely enough to say if they are significant.

With respect to the first question, all of the values for phi are pre-initialized and a sum is calculated.  Subtracting the old value from the new one allows the sum to be updated without looping through the words.  Recall that var_gamma[k] uses the sum, across the words, of the phi values for topic k.
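
To make the bookkeeping concrete, here is a minimal sketch in C. It is not taken from lda-c; `full_sum` and `update_running_sum` are names assumed for this example. It shows that adding count * (new - old) to a running sum gives the same value as recomputing the whole sum over the words.

```c
#include <assert.h>
#include <stddef.h>

/* Recompute sum_n counts[n] * phi[n] from scratch: O(n_words) per call. */
double full_sum(const double *phi, const int *counts, size_t n_words)
{
    double s = 0.0;
    for (size_t n = 0; n < n_words; n++)
        s += counts[n] * phi[n];
    return s;
}

/* Overwrite phi[n] with new_phi and patch the running sum in O(1). */
double update_running_sum(double running, double *phi, const int *counts,
                          size_t n_words, size_t n, double new_phi)
{
    assert(n < n_words);
    double old_phi = phi[n];
    phi[n] = new_phi;
    return running + counts[n] * (new_phi - old_phi);
}
```

For example, with phi = {0.25, 0.25, 0.25} and counts = {2, 1, 3}, the running sum starts at 1.5; changing the middle value to 0.75 patches it to 2.0, which matches a full recomputation.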

From: <topic-models-bounces at lists.cs.princeton.edu> on behalf of mohammad bakhtiari <educatemb at gmail.com>
Date: Thursday, March 3, 2016 at 9:55 AM
To: "topic-models at lists.cs.princeton.edu" <topic-models at lists.cs.princeton.edu>
Subject: [Topic-models] LDA gamma update

Hi everyone

I first want to mention a point and then ask two questions.

In the LDA paper, the variational inference updates for phi and gamma are as you can see here:

[Inline image 1]
In one iteration, updating phi for one word should not affect the phi update for other words (consider the first iteration, for example). In lda-c, however, gamma is updated immediately after phi is updated for a word, so that word seems to affect the phi updates for the other words. If I am wrong, please correct me.
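
For reference, since the inline image is not preserved in this archive: the per-document coordinate updates in the LDA paper (Blei, Ng, and Jordan, 2003), which the image presumably showed, can be written in the paper's notation as follows. Mapping these symbols onto the lda-c variables (phi, var_gamma, digamma_gam) is an assumption made here for illustration.

```latex
\begin{align}
  \phi_{ni} &\propto \beta_{i w_n}
      \exp\!\left\{ \mathrm{E}_q[\log\theta_i \mid \gamma] \right\},
  &
  \mathrm{E}_q[\log\theta_i \mid \gamma]
      &= \Psi(\gamma_i) - \Psi\!\Big(\sum_{j=1}^{k} \gamma_j\Big), \\
  \gamma_i &= \alpha_i + \sum_{n=1}^{N} \phi_{ni}.
\end{align}
```

The term \Psi(\sum_j \gamma_j) is the same for every topic i, so it cancels when \phi_{n\cdot} is normalized over topics.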

1- I can't understand why, for each word, the old phi is subtracted from the new phi. Can someone explain this to me?
2- Can I update gamma outside the for loop (over words) and then use it for updating phi?

The code for updating gamma and phi, from lda-c:
for (n = 0; n < doc->length; n++)
{
    phisum = 0;
    for (k = 0; k < model->num_topics; k++)
    {
        oldphi[k] = phi[n][k];
        phi[n][k] =
            digamma_gam[k] +
            model->log_prob_w[k][doc->words[n]];

        if (k > 0)
            phisum = log_sum(phisum, phi[n][k]);
        else
            phisum = phi[n][k]; // note, phi is in log space
    }

    for (k = 0; k < model->num_topics; k++)
    {
        phi[n][k] = exp(phi[n][k] - phisum);
        var_gamma[k] =
            var_gamma[k] + doc->counts[n]*(phi[n][k] - oldphi[k]);
        // !!! a lot of extra digamma's here because of how we're computing it
        // !!! but its more automatically updated too.
        digamma_gam[k] = digamma(var_gamma[k]);
    }
}

Thanks for your time and consideration. I am looking forward to your replies.
