[Topic-models] Mallet topic modeling: topics of variable length disregard the value set using --num-top-words parameter

sophie burkhardt soburkha at uni-mainz.de
Tue Oct 20 05:55:06 EDT 2015


Only words that have a count >0 for a topic are displayed. If you get fewer
than the number you asked for, it's because no more words occur with that
topic. Especially with a large number of topics, most topics will have few
words.
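
To make that concrete, here is a minimal Python sketch of the idea. It is not
Mallet's actual code, only an illustration: the function names and the toy
counts are made up, and the key-file layout (topic id, a weight, then the
words, separated by tabs) is an assumption you should check against your own
"topics.txt".

```python
def top_words(topic_counts, num_top_words):
    """Return up to num_top_words words for one topic, highest count first.

    Words with a count of 0 are never listed, so the result can be
    shorter than num_top_words.
    """
    ranked = sorted(topic_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, count in ranked if count > 0][:num_top_words]


def key_lengths(lines):
    """Map topic id -> number of words listed on that topic's line.

    Assumes each line looks like 'id<TAB>weight<TAB>w1 w2 w3 ...'.
    Lines that do not have three tab-separated fields are skipped.
    """
    lengths = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue
        lengths[int(parts[0])] = len(parts[2].split())
    return lengths


# Toy topic: only three words were ever assigned to it.
counts = {"model": 4, "topic": 2, "word": 1, "corpus": 0, "token": 0}
print(top_words(counts, 50))  # ['model', 'topic', 'word']
```

Asking for 50 words returns only three here, because the other words have a
count of 0 for this topic. Running something like key_lengths over the lines
of the keys file shows which topics fall short of the requested number.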

2015-10-20 1:43 GMT+02:00 Sameendra Samarawickrama <smsamrc at gmail.com>:

> Sorry for the repost, but I'm really stuck here without success. I tried
> this on a different dataset, but the same thing happens.
>
> I found that when applying LDA to my dataset, the number of topic words
> doesn't match the number specified using the "--num-top-words"
> parameter.
>
> So, for example, if I run LDA with the following configuration,
>
> mallet train-topics --input $posts.mallet --num-topics 1000 --num-top-words 50 --output-topic-keys topics.txt
>
> the topics in my "topics.txt" are of variable length; most of them don't
> even have 10 topic words when I'm supposed to have 50.
>
> This happens only when I'm trying to fit a large no. of topics (e.g., t >
> 500).
>
> Does anybody know why this is happening?