[Topic-models] Optimizing LDA on Mallet

Normand Peladeau peladeau at provalisresearch.com
Sat Dec 3 12:52:03 EST 2016

We are trying to compare a topic modeling method against LDA models
produced by Mallet.  We want to optimize the LDA topic model as much as
possible by generating multiple topic model (TM) solutions and comparing
them using internal criteria.  I have a few questions regarding this task:


1)      Which parameters should we vary, and over what range of values?
We are currently generating hundreds of models by changing the alpha value
and the number of iterations. Are there other parameters we should vary to
make sure we generate the best possible topic model?

2)      What internal criteria can be used to choose one topic model over
another? Topic consistency? What else?
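
To illustrate the kind of internal criterion meant in question 2, here is a
minimal sketch of a document co-occurrence coherence score (often called
UMass coherence). It is only an assumption that this corresponds to "topic
consistency"; the function name and the toy corpus are hypothetical and the
code is independent of Mallet.

```python
import math


def umass_coherence(top_words, documents):
    """Co-occurrence coherence of one topic.

    top_words: the topic's top words, ordered from most to least probable.
    documents: the corpus as a list of token lists.
    Scores closer to zero mean the top words co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # word never occurs in the corpus; skip the pair
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / denom)
    return score


# Toy corpus, purely illustrative.
docs = [["a", "b"], ["a", "b"], ["a", "c"], ["c", "d"]]
coherent = umass_coherence(["a", "b"], docs)    # words that co-occur
incoherent = umass_coherence(["a", "d"], docs)  # words that never co-occur
```

Averaging such a score over all topics of a model gives a single number per
model that could be compared across runs.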


Are there recommended values for obtaining good topics?  We are trying to
extract 50 topics from 10,000 short documents.
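
For concreteness, the sweep we are running looks roughly like the sketch
below. It only builds the Mallet command lines; it does not run them. The
file names, the grid values and the helper name are hypothetical, and it
assumes the corpus has already been imported into a Mallet file and that
the mallet executable is on the PATH.

```python
import itertools


def build_commands(input_file, num_topics=50,
                   alphas=(1.0, 5.0, 50.0),
                   iterations=(500, 1000, 2000)):
    """Return one `mallet train-topics` command per (alpha, iterations) pair.

    The grid values are placeholders, not recommended settings.
    """
    commands = []
    for alpha, iters in itertools.product(alphas, iterations):
        tag = f"a{alpha}_i{iters}"
        commands.append([
            "mallet", "train-topics",
            "--input", input_file,
            "--num-topics", str(num_topics),
            # In Mallet, --alpha is the sum of the per-topic alpha values.
            "--alpha", str(alpha),
            "--num-iterations", str(iters),
            # Fixed seed so runs differ only in the swept parameters.
            "--random-seed", "1",
            "--output-topic-keys", f"keys_{tag}.txt",
            "--diagnostics-file", f"diag_{tag}.xml",
        ])
    return commands


# "corpus.mallet" is a hypothetical file name.
cmds = build_commands("corpus.mallet")
```

Each command can then be executed with subprocess.run(cmd, check=True), and
the resulting diagnostics files compared across runs.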










