[Topic-models] LDA beginner's questions

Felix Endres 1980er at web.de
Wed Nov 5 05:12:20 EST 2008


> Hi,
> 
> Thanks a lot for replying.
> Some comments and questions inline...
 
>>>4. Why is p(z_n | \theta) given by \theta_i for the unique i such that
>>>z_n^i = 1 ?
>> Because \theta = { \theta_1, \theta_2, ..., \theta_n }, where each entry
>> gives
>> the probability for one topic. And for topic z_n^i = 1 it is given by
>> \theta_i.
> 
> So that means there's no possibility of a word belonging to one topic
> more than the others?
> If a word belongs to a topic, that words belongs to only this topic?
> Can this assignment change the next time we notice the same word later
> in the document?
> 
Yes, there is. P(w_i|z_p = 1) is in general different from P(w_i|z_q = 1). In plain English: a word "belongs the most" to the topic that gives it the highest probability of occurring, and this probability differs between topics p, q, ...
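
A small numeric illustration may help here. This is only a sketch with invented numbers: the two topics, the three-word vocabulary, and the helper name topic_posterior are assumptions made up for this example and do not come from the paper, lda-c, or GibbsLDA. It shows that p(z_n | \theta) is simply the entry of \theta for the chosen topic, while the topic a given word most likely came from also depends on how strongly each topic favours that word.

```python
# Hypothetical toy model: 2 topics, 3 words. All numbers are invented.
theta = [0.7, 0.3]  # p(z = k | theta) is just theta[k]
beta = [
    {"gene": 0.6, "cell": 0.3, "bank": 0.1},  # p(w | z = 0)
    {"gene": 0.1, "cell": 0.1, "bank": 0.8},  # p(w | z = 1)
]


def topic_posterior(word, theta, beta):
    """Return p(z = k | w, theta, beta) for every topic k via Bayes' rule."""
    joint = [theta[k] * beta[k][word] for k in range(len(theta))]
    total = sum(joint)
    return [j / total for j in joint]


# "bank" is most likely from topic 1 even though topic 0 is more common
# in this document; "gene" is most likely from topic 0.
post_bank = topic_posterior("bank", theta, beta)
post_gene = topic_posterior("gene", theta, beta)
print(post_bank, post_gene)
```

Note that every word position n has its own z_n in the model, so two occurrences of the same word in one document can be assigned to different topics.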


>>>What steps do we make in order to make the LDA work correctly?
>> I do not understand the question.
> 
> The part you wrote below answered my question :)
> 
>>>Estimate parameters and then do inference, or the other way around? I
>>>think this is missing in the paper.
> 
>> First parameter estimation, then inference. You need the parameters for
>> inference.
> 
> Please correct me if I am wrong:
> the estimation works by estimating \alpha and \beta,
> while inference gives me the values of z ?
> 
Yes. Estimation with GibbsLDA (http://gibbslda.sourceforge.net/) also gives me all the other variables. I think inference computes \theta too.
BTW, you cannot do inference without estimated parameters.
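
To make the "parameters first, then inference" order concrete, here is a minimal sketch of inference for one new document, assuming the topic-word probabilities have already been estimated. It uses a simple Gibbs sampler with the topic-word matrix held fixed; it is not the code of GibbsLDA or lda-c (lda-c uses variational inference instead). The function name infer_document, the symmetric alpha, and the toy matrix are assumptions made up for illustration.

```python
import random


def infer_document(doc, beta, alpha, n_iter=200, seed=0):
    """Gibbs-sample topic assignments z for one document, with beta fixed.

    doc   : list of word ids
    beta  : beta[k][w] = p(w | topic k), assumed to be already estimated
    alpha : symmetric Dirichlet hyperparameter (a float)
    Returns (z, theta), where theta is a smoothed estimate of the
    document's topic proportions based on the final sample of z.
    """
    rng = random.Random(seed)
    n_topics = len(beta)
    # Start from a random assignment and count words per topic.
    z = [rng.randrange(n_topics) for _ in doc]
    counts = [0] * n_topics
    for k in z:
        counts[k] += 1
    for _ in range(n_iter):
        for n, w in enumerate(doc):
            counts[z[n]] -= 1  # take word n out of the counts
            # p(z_n = k | rest) is proportional to (count_k + alpha) * beta[k][w]
            weights = [(counts[k] + alpha) * beta[k][w] for k in range(n_topics)]
            z[n] = rng.choices(range(n_topics), weights=weights)[0]
            counts[z[n]] += 1
    total = len(doc) + n_topics * alpha
    theta = [(counts[k] + alpha) / total for k in range(n_topics)]
    return z, theta


# Invented "estimated" parameters: 2 topics over a 4-word vocabulary.
beta = [
    [0.5, 0.4, 0.05, 0.05],
    [0.05, 0.05, 0.4, 0.5],
]
doc = [0, 1, 0, 1, 1, 3]
z, theta = infer_document(doc, beta, alpha=0.1)
print(z, theta)
```

Without beta (and alpha) there is nothing to plug into the sampling weights, which is why estimation has to come first.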

> I'm also curious why the original paper describes inference first,
> then estimation...
> Any hints?
> 
No, I found it confusing too. Does anyone know why?

>>>7. Is the LDA-C a 1-1 implementation of what is published in the
>>>paper? I was trying to read the code but for the first few passes over
>>>the code I don't see any direct mapping to most of the equations
>>>published in the paper.
> 
>> I do not know. But it had comparable results in a short experiment.
> 
> Ok. I'll rephrase a bit more to get more details.
> 
> To what part of the paper does lda_mle() function refer to?
> 
I do not know the source. I didn't do much with lda-c.

Regards,
Felix

P.S: Sorry for messing up the subject-line before.
