# [Topic-models] LDA beginner's questions

Veena T veenat2005 at gmail.com
Sun Nov 23 01:00:33 EST 2008

>> Hi,
>
>> Thanks a lot for replying.
>> Some comments and questions inline...

>>>>4. Why is p(z_n | \theta) given by \theta_i for the unique i such that
>>>>z_n^i = 1 ?
>>> Because \theta = { \theta_1, \theta_2, ..., \theta_k }, where each entry
>>> gives the probability of one topic. For the topic i with z_n^i = 1, that
>>> probability is \theta_i.
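
For reference, the statement above can be written as one formula. This is a sketch assuming k topics and a one-hot indicator vector z_n (exactly one z_n^i equals 1), which is how the question describes it:

```latex
\[
  p(z_n \mid \theta) \;=\; \prod_{i=1}^{k} \theta_i^{\,z_n^i}
  \;=\; \theta_i \quad \text{for the unique } i \text{ with } z_n^i = 1 ,
\]
% every factor with z_n^i = 0 equals \theta_i^0 = 1 and drops out of the product
```
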
>
>> So that means there's no possibility of a word belonging to one topic
>> more than the others?
>> If a word belongs to a topic, does that word belong only to this topic?
>> Can this assignment change the next time we see the same word later
>> in the document?
>
> Yes, there is. P(w_i | z_p = 1) is in general different from P(w_i | z_q = 1).
> In English: the word "belongs the most" to the topic that gives it the
> highest probability of occurring. This probability is different for topics
> p, q, ...
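
A small numeric illustration of the point above may help. It is only a sketch: the two topics, the three-word vocabulary, and all probabilities are made up, and the names `p_z_given_theta` and `topic_posterior` are hypothetical helpers, not part of lda-c or GibbsLDA. It shows that p(z_n | \theta) just reads off one entry of \theta, and that the same word gets a different probability under each topic, so its topic assignment is soft and depends on the document's \theta.

```python
# Hypothetical toy model: 2 topics, 3-word vocabulary. All numbers are made up.
vocab = ["gene", "ball", "data"]
theta = [0.7, 0.3]                 # p(topic i | document)
beta = [
    [0.6, 0.1, 0.3],               # p(word | topic 0)
    [0.1, 0.6, 0.3],               # p(word | topic 1)
]

def p_z_given_theta(z_onehot, theta):
    """p(z_n | theta): pick the theta_i whose indicator z_n^i is 1."""
    i = z_onehot.index(1)
    return theta[i]

def topic_posterior(word, theta, beta, vocab):
    """p(z = i | w, theta), proportional to theta_i * p(w | z = i)."""
    w = vocab.index(word)
    joint = [theta[i] * beta[i][w] for i in range(len(theta))]
    total = sum(joint)
    return [j / total for j in joint]

print(p_z_given_theta([0, 1], theta))
print(topic_posterior("gene", theta, beta, vocab))
print(topic_posterior("ball", theta, beta, vocab))
```

Because the posterior depends on \theta as well as on the word, the same word can end up with a different most likely topic in another document.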

>>>What steps do we make in order to make the LDA work correctly?
>> I do not understand the question.
>
> The part you wrote below answered my question :)
>
>>>Estimate parameters and then do inference, or the other way around? I
>>>think this is missing in the paper.
>
>> First parameter estimation, then inference. You need the parameters for
>> inference.
>
> Please correct me if I am wrong:
> the estimation works by estimating \alpha and \beta,
> while inference gives me the values of z ?
>
> Yes. And estimation with GibbsLDA (http://gibbslda.sourceforge.net/) gives
> me all the other variables too. I think inference computes \theta too.
> BTW you cannot infer without estimated parameters.

> I'm also curious why the original paper describes inference first,
> then estimation...
> Any hints?
>
> No. I found it confusing too. Does anyone know why?

Actually, the inference explained in the paper (the one you mention above)
is part of the expectation step: that is where the variational inference is
done. At the end of it we have the current values of the variational
parameters. The estimation described in the next section is the maximization
step of EM, so I feel the order is correct.
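
To make the ordering concrete, here is a rough sketch of that loop in plain Python. It is not the lda-c code. The function names (`e_step`, `m_step`, `variational_em`) and the toy corpus are hypothetical, and \alpha is kept fixed for brevity, whereas the paper also re-estimates it. The E-step runs variational inference per document (updating \phi and \gamma), and the M-step re-estimates \beta from the resulting \phi.

```python
import math
import random

def digamma(x):
    """Digamma via recurrence plus an asymptotic series (for x > 0)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def e_step(doc, alpha, beta, n_iter=20):
    """Variational inference for one document: returns (gamma, phi)."""
    k = len(beta)
    gamma = [alpha + len(doc) / k] * k
    phi = [[1.0 / k] * k for _ in doc]
    for _ in range(n_iter):
        weights = [math.exp(digamma(g)) for g in gamma]
        for n, w in enumerate(doc):
            # phi_ni is proportional to beta_{i, w_n} * exp(digamma(gamma_i))
            row = [beta[i][w] * weights[i] for i in range(k)]
            total = sum(row)
            phi[n] = [r / total for r in row]
        # gamma_i = alpha + sum over words of phi_ni
        gamma = [alpha + sum(phi[n][i] for n in range(len(doc))) for i in range(k)]
    return gamma, phi

def m_step(docs, phis, k, vocab_size, smooth=1e-9):
    """Re-estimate beta from expected topic/word counts (alpha stays fixed)."""
    counts = [[smooth] * vocab_size for _ in range(k)]
    for doc, phi in zip(docs, phis):
        for n, w in enumerate(doc):
            for i in range(k):
                counts[i][w] += phi[n][i]
    return [[c / sum(row) for c in row] for row in counts]

def variational_em(docs, k, vocab_size, alpha=0.1, em_iters=30, seed=0):
    """Alternate E-steps over all documents with an M-step for beta."""
    rng = random.Random(seed)
    beta = []
    for _ in range(k):
        row = [rng.random() + 0.1 for _ in range(vocab_size)]
        s = sum(row)
        beta.append([r / s for r in row])
    gammas = []
    for _ in range(em_iters):
        results = [e_step(doc, alpha, beta) for doc in docs]   # E-step
        gammas = [g for g, _ in results]
        beta = m_step(docs, [p for _, p in results], k, vocab_size)  # M-step
    return beta, gammas

# Toy corpus: each document is a list of word ids (made-up data).
docs = [[0, 0, 1, 1], [0, 1, 0], [2, 3, 3, 2], [3, 2, 2]]
beta, gammas = variational_em(docs, k=2, vocab_size=4)
```

Once the loop has finished, calling `e_step` on a new document with the learned `beta` is the "inference" discussed earlier in the thread, which is why the parameters have to be estimated first.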

>>>7. Is the LDA-C a 1-1 implementation of what is published in the
>>>paper? I was trying to read the code but for the first few passes over
>>>the code I don't see any direct mapping to most of the equations
>>>published in the paper.
>
>> I do not know. But it had comparable results in a short experiment.
>
> Ok. I'll rephrase a bit more to get more details.
>
> To what part of the paper does the lda_mle() function refer?
>
>I do not know the source. I didn't do much with lda-c.
>
>Regards,
>Felix
regards
veena

--
Veena Srinivas,
PhD scholar,
Speech and Vision Lab