[Topic-models] Finding annotated datasets

Michael Röder roeder at informatik.uni-leipzig.de
Thu Jun 9 08:57:25 EDT 2016


Hi Devashish,
Unfortunately, the blog "http://topics.labs.bluekiwi.de/" no longer exists, and I am sorry for any inconvenience this might have caused.
In the paper, a dataset is defined by three parts:
1. a corpus
2. topics that have been calculated using the corpus
3. human ratings for the topics
You can find the topics (topics* files) and the human ratings (gold* files) used for our paper at: http://139.18.2.164/mroeder/palmetto/datasets/ (I will add the link to the Palmetto web page). However, because of their licenses I am not allowed to upload the corpora. You would need them to recreate the upper part of the table. If you are interested in that part, please send me an email and I can describe how you could get them.
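As a minimal sketch of how the downloaded files might be matched up: the snippet below pairs each topics* file with a gold* file by the part of the name that follows the prefix. The assumption that both files of a dataset share the same suffix, as well as the example file names and the helper name pair_dataset_files, are hypothetical and not taken from the actual download directory.

```python
from collections import defaultdict


def pair_dataset_files(filenames):
    """Pair 'topics*' and 'gold*' file names that share the same suffix.

    Assumption (not confirmed by the source): the topics file and the
    gold file of one dataset differ only in their prefix.
    """
    grouped = defaultdict(dict)
    for name in filenames:
        for prefix in ("topics", "gold"):
            if name.startswith(prefix):
                suffix = name[len(prefix):]
                grouped[suffix][prefix] = name
    # Keep only suffixes for which both a topics file and a gold file exist.
    return {
        suffix: (found["topics"], found["gold"])
        for suffix, found in grouped.items()
        if len(found) == 2
    }


# Hypothetical file names, for illustration only.
example_files = ["topicsA.txt", "goldA.txt", "topicsB.txt", "README"]
paired = pair_dataset_files(example_files)
```

Files without a counterpart (here the hypothetical topicsB.txt) are left out, which makes missing ratings easy to spot before running any benchmark.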
Since we did not create all datasets ourselves, I would like to remind you to cite the creators/providers of the datasets where appropriate. You can find the references to their publications in our paper, in the section that describes the datasets.
Cheers,
Michael Röder

From: Devashish Deshpande <ashu.9412 at gmail.com>
Date: Wed, Jun 8, 2016 at 8:35 PM
Subject: [Topic-models] Finding annotated datasets
To: topic-models at lists.cs.princeton.edu


Hey everyone,

My name is Devashish Deshpande. I am a contributor to the Gensim open source topic modelling library in Python and am currently working on a project to add the topic coherence pipeline, as mentioned in this paper and demonstrated in this code, to Gensim. You can find my open PR here.

For the purpose of writing a blog post on this project and performing some benchmark testing, I wanted to reproduce Table 2 from the above paper. However, I was finding it hard to find the annotated datasets that were used for this. I did manage to find some links (e.g. the annotated movies dataset, RTL NYT, genomics), but none of them seem to be working. Is there any other place where I can download any of these datasets?

Any help will be greatly appreciated!

Thanks!
Devashish


_______________________________________________

Topic-models mailing list

Topic-models at lists.cs.princeton.edu

https://lists.cs.princeton.edu/mailman/listinfo/topic-models



 		 	   		  