[Topic-models] Similarity word LDA
francescolisena8 at gmail.com
Mon Jul 23 21:36:50 EDT 2012
I am a beginner at Mallet.
I'm working on the technique of TopicExtraction, using the class
I have 30,000 documents, and I am able to estimate my model and print all
kinds of reports.
The resulting model has these parameters:
- alpha = 0.1;
- beta = 0.01;
- numTopics = 300;
- numIterations = 2000;
In addition, I use the TokenSequenceNGrams pipe to represent unigrams,
bigrams and trigrams.
My problem is:
I would like to create an algorithm that, given a set of query words as
input, returns a set of related words as output, using my topic model.
For example:
Input words: java, software
Output: java_developer, eclipse, software_engineer
Any ideas? Are there formulas in the literature for this? For example, a
similarity score such as P(w|Q), where Q is the set of query words?
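
To make the P(w|Q) idea concrete, here is a minimal sketch of one possible
formulation. It is not a method taken from Mallet or from a specific paper.
It assumes the query words were all generated by one shared topic, so that
P(k|Q) is proportional to P(k) times the product of P(q|k) over the query
words, and then P(w|Q) = sum over k of P(w|k) P(k|Q). The names phi,
topic_prior and related_words are hypothetical, and the numbers are a made-up
two-topic toy example, not output from the 300-topic model described above.

```python
import numpy as np

def related_words(phi, topic_prior, vocab, query, top_n=3):
    """Rank words by P(w|Q) = sum_k P(w|k) * P(k|Q).

    phi[k, w]      : P(word w | topic k), every row sums to 1 (assumed input)
    topic_prior[k] : P(topic k), e.g. the share of tokens assigned to topic k
    Assumes a smoothed model (beta > 0), so no entry of phi is exactly zero.
    """
    index = {word: i for i, word in enumerate(vocab)}
    q_ids = [index[word] for word in query]

    # log P(k|Q) up to a constant: log P(k) + sum over query words of log P(q|k)
    log_post = np.log(topic_prior) + np.log(phi[:, q_ids]).sum(axis=1)
    log_post -= log_post.max()          # avoid underflow before exponentiating
    post = np.exp(log_post)
    post /= post.sum()                  # normalized P(k|Q)

    scores = post @ phi                 # P(w|Q) for every word in the vocabulary
    scores[q_ids] = 0.0                 # do not return the query words themselves
    best = np.argsort(-scores)[:top_n]
    return [(vocab[i], float(scores[i])) for i in best]

# Hypothetical toy data: topic 0 is about programming, topic 1 about cooking.
vocab = ["java", "software", "java_developer", "eclipse",
         "software_engineer", "recipe", "pasta"]
phi = np.array([
    [0.25, 0.25, 0.20, 0.15, 0.13, 0.01, 0.01],
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.50, 0.45],
])
topic_prior = np.array([0.5, 0.5])

for word, score in related_words(phi, topic_prior, vocab, ["java", "software"]):
    print(word, round(score, 4))
```

On this toy data the query "java, software" ranks java_developer, eclipse and
software_engineer highest, because almost all of the posterior mass falls on
the programming topic. With a real model, phi and topic_prior would have to be
exported from the trained model first; how to do that is not shown here.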