[Topic-models] Origin of Topic Modeling

Normand Peladeau peladeau at provalisresearch.com
Mon Dec 11 22:27:31 EST 2017

Almost all the papers I read on topic modeling refer to David Blei's 2003
paper on LDA as the origin of topic modeling (at least under its current
name), while some go back a little further, pointing to the LSA and
pLSA papers published in the 1990s as the original attempts to extract
topics from a text corpus.  I wonder why nobody goes further back and
mentions the work of Borko in information science in the early 60s, that of
Iker & Harvey in psychology in the mid 60s, and several others in the social
sciences and literary studies.


Is there any reason why we should not consider those earlier attempts? 


Has anybody compared topics extracted with LDA or other recent
techniques with those that one could obtain using techniques proposed more
than 50 years ago?  


Are we reinventing the wheel?  And is this wheel better than the old one?
Any opinion on this would be welcome.  I will be presenting a paper on this
"topic" next month at the HICSS conference and I would very much like to
know what experts have to say about this.


Normand Péladeau


