[Topic-models] asymmetric priors and IDF

Julien Velcin julien.velcin at univ-lyon2.fr
Thu Jun 16 04:12:27 EDT 2016


Thank you Shibamouli. I agree I can directly modify the code of MALLET 
as you said (by the way, the class name is ParallelTopicModel).

My question was more about:
a) has anyone ever tested this idea?
b) is it worth a try?

I suspect that it won't change the results much, since the frequent words 
will overwhelm the infrequent ones all the same, whatever the priors.
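For concreteness, here is a minimal sketch of the kind of IDF-scaled prior under discussion: build an asymmetric Dirichlet prior over words for p(w|z) whose mass is proportional to smoothed IDF, rescaled so its total matches a symmetric prior with the same base value. The helper name `idf_beta_prior` and the rescaling scheme are illustrative assumptions, not MALLET's API.

```python
import math
from collections import Counter

def idf_beta_prior(docs, base_beta=0.01):
    """Sketch of an IDF-based asymmetric Dirichlet prior for p(w|z).
    Hypothetical helper, not part of MALLET; the output would have to
    be fed into the topic-word smoothing by modifying the source."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(w for doc in docs for w in set(doc))
    # Smoothed IDF so that words occurring in every document still
    # receive a small positive prior mass (a Dirichlet parameter
    # must stay strictly positive).
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1.0 for w in df}
    total = sum(idf.values())
    vocab_size = len(idf)
    # Rescale so the average prior equals base_beta, i.e. the total
    # mass matches a symmetric prior with the same base value.
    return {w: base_beta * vocab_size * v / total for w, v in idf.items()}

docs = [["the", "cat", "sat"],
        ["the", "dog", "ran"],
        ["the", "zebra", "grazed"]]
beta = idf_beta_prior(docs)
# "the" occurs in every document, so it gets the smallest prior mass,
# while rare words like "zebra" are up-weighted.
assert beta["the"] < beta["zebra"]
```

Whether this actually helps is exactly the open question: the likelihood term for very frequent words may dominate the prior regardless of how the mass is distributed.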

Best regards,

Julien

> Shibamouli Lahiri <mailto:shibamoulilahiri at gmail.com>
> June 15, 2016 at 6:10 PM
> I'm not sure if this answers your question, but could you not simply 
> replace the uniform prior with a non-uniform prior yourself?
>
> For example, in the version of Mallet that I have, you'll need to 
> change the source of src/cc/mallet/topics/LDA.java. Then write a 
> simple class that computes the IDF and feeds it into LDA.java.
>
>
> Regards,
> Shibamouli
>
>
>
>
>
>
> Julien Velcin <mailto:julien.velcin at univ-lyon2.fr>
> June 15, 2016 at 3:57 PM
> Dear topic modelers,
>
> I'm wondering whether anyone has tried to use an asymmetric prior in 
> LDA for p(w|z), based on the inverse document frequency (IDF). We can 
> postulate that this kind of prior will lower the impact of stop words 
> and, therefore, result in topics of higher quality.
>
> By the way, if this is a good idea, which packages allow one to 
> (easily) set up asymmetric priors? MALLET, for instance, is based on 
> symmetric priors.
>
> Thank you,
>
> Julien

