[Topic-models] Similarity word LDA

Francesco Lisena francescolisena8 at gmail.com
Mon Jul 23 21:36:50 EDT 2012


Hi all,
I am a beginner with Mallet.
I'm working on topic extraction, using the class
ParallelTopicModel.
I have 30,000 documents, and I am able to estimate my model and print all
kinds of reports.
The model I created has these parameters:
- alpha = 0.1;
- beta = 0.01;
- numTopics = 300;
- numThreads = 4;
- numIterations = 2000;
In addition, I use the TokenSequenceNGrams pipe to represent unigrams,
bigrams and trigrams.

My problem is:
I would like to create an algorithm that, given a set of words (query
words) as input, returns a set of related words as output, using my topic
model.

For example:
Input words: java, software
Output: java_developer, eclipse, software_engineer

Any ideas? Are there formulas in the literature for this? For example, a
similarity P(w|Q), where Q is the set of query words?
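
One possible reading of P(w|Q), offered only as a sketch and not as an
established Mallet feature: treat the query words as independent given a
topic, infer a topic posterior P(k|Q) proportional to
P(k) * prod_{q in Q} P(q|k), and then score each vocabulary word with
P(w|Q) = sum_k P(w|k) * P(k|Q). The code below uses only the Java standard
library. The class RelatedWords, its methods score and topRelated, and the
inputs phi (smoothed topic-word probabilities) and topicPrior are all
hypothetical names; they would have to be derived from the trained model's
counts, and words are referred to by vocabulary index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mallet.
final class RelatedWords {

    /**
     * Returns P(w | Q) for every vocabulary index w.
     *
     * phi[k][w]  : P(word w | topic k); assumed strictly positive
     *              (true when beta smoothing is applied) with rows summing to 1
     * topicPrior : P(topic k); assumed strictly positive and summing to 1
     * query      : vocabulary indices of the query words
     */
    static double[] score(double[][] phi, double[] topicPrior, int[] query) {
        int numTopics = phi.length;
        int vocabSize = phi[0].length;

        // log P(k) + sum_q log P(q | k); log space avoids underflow
        // when the query has many words.
        double[] logPosterior = new double[numTopics];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < numTopics; k++) {
            double lp = Math.log(topicPrior[k]);
            for (int q : query) {
                lp += Math.log(phi[k][q]);
            }
            logPosterior[k] = lp;
            max = Math.max(max, lp);
        }

        // Exponentiate relative to the maximum and accumulate the normalizer.
        double[] posterior = new double[numTopics];
        double norm = 0.0;
        for (int k = 0; k < numTopics; k++) {
            posterior[k] = Math.exp(logPosterior[k] - max);
            norm += posterior[k];
        }

        // P(w | Q) = sum_k P(w | k) * P(k | Q)
        double[] scores = new double[vocabSize];
        for (int k = 0; k < numTopics; k++) {
            double pk = posterior[k] / norm;
            for (int w = 0; w < vocabSize; w++) {
                scores[w] += phi[k][w] * pk;
            }
        }
        return scores;
    }

    /** Returns up to n vocabulary indices with the highest score, skipping the query words. */
    static List<Integer> topRelated(double[] scores, int[] query, int n) {
        boolean[] inQuery = new boolean[scores.length];
        for (int q : query) {
            inQuery[q] = true;
        }
        List<Integer> candidates = new ArrayList<>();
        for (int w = 0; w < scores.length; w++) {
            if (!inQuery[w]) {
                candidates.add(w);
            }
        }
        candidates.sort((a, b) -> Double.compare(scores[b], scores[a]));
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }
}
```

With n-gram features, entries such as java_developer are ordinary vocabulary
items, so they are ranked the same way as unigrams. Very frequent words tend
to dominate a ranking by P(w|Q) alone; dividing by the corpus-wide P(w) is
one common adjustment, again stated here as an assumption.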