[Topic-models] Mallet topic modeling: topics of variable length disregard the value set using --num-top-words parameter

Sameendra Samarawickrama smsamrc at gmail.com
Mon Oct 19 19:43:46 EDT 2015


Sorry for the repost, but I'm really stuck on this and haven't had any
success. I tried it on a different dataset, but the same thing happens.

I found that when applying LDA to my dataset, the number of words listed
per topic doesn't match the number specified with the "--num-top-words"
parameter.

For example, if I run LDA with the following configuration,

mallet train-topics --input $posts.mallet --num-topics 1000 \
    --num-top-words 50 --output-topic-keys topics.txt

the topics in my "topics.txt" are of variable length; most of them don't
even have 10 topic words when I'm supposed to have 50.
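
To make the mismatch concrete, the words on each line of the keys file can
be counted. The following is only a minimal Python sketch: it assumes the
keys file is tab-separated with three columns (topic id, topic weight, then
the space-separated top words), and the function name short_topics and the
sample lines are made up for illustration.

```python
def short_topics(lines, expected=50):
    """Return (topic_id, word_count) for topics listing fewer than `expected` words."""
    short = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        # Assumed layout: topic id, weight, space-separated top words.
        words = parts[2].split() if len(parts) > 2 else []
        if len(words) < expected:
            short.append((parts[0], len(words)))
    return short


# Illustrative sample only; not real output from my run.
sample = [
    "0\t0.05\talpha beta gamma",
    "1\t0.05\tdelta",
]
print(short_topics(sample, expected=3))  # prints [('1', 1)]
```

On the real file this would be called as short_topics(open("topics.txt"),
expected=50), which lists every topic that came out shorter than requested.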

This happens only when I'm trying to fit a large number of topics (e.g.,
t > 500).

Does anybody know why this is happening?

