[Topic-models] Sequentially Updating Dynamic Topic Models as More Data Becomes Available

Thibaut Thonet thibaut.thonet at irit.fr
Mon Feb 27 10:38:59 EST 2017

Hi Nathan,

I think that the keyword you're looking for is 'online inference.' An 
online inference algorithm processes the data (e.g., the documents) 
incrementally, updating the model whenever new data is observed rather 
than in a single pass over a fixed dataset. Online inference is 
therefore especially useful when the data arrives in a streaming, 
time-dependent fashion. By contrast, the traditional setting in which 
the data is processed all at once is referred to as 'batch inference.'
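To make the distinction concrete, here is a toy sketch (not taken from either paper below, and a long way from a production sampler): a crude collapsed-Gibbs LDA in which each call to `update` folds new documents into the existing global counts without re-sampling the documents from earlier epochs. The class and all names are made up for the example.

```python
import random
from collections import defaultdict

class ToyOnlineLDA:
    """Toy collapsed-Gibbs LDA that folds in new documents without
    re-sampling old ones -- a crude form of online inference."""

    def __init__(self, num_topics, alpha=0.1, beta=0.01, seed=0):
        self.K = num_topics
        self.alpha, self.beta = alpha, beta
        self.rng = random.Random(seed)
        # word -> per-topic counts; accumulated across all epochs
        self.topic_word = defaultdict(lambda: [0] * num_topics)
        self.topic_totals = [0] * num_topics
        self.vocab = set()

    def _sample_topic(self, word, doc_counts):
        V = max(len(self.vocab), 1)
        weights = [
            (doc_counts[k] + self.alpha)
            * (self.topic_word[word][k] + self.beta)
            / (self.topic_totals[k] + V * self.beta)
            for k in range(self.K)
        ]
        return self.rng.choices(range(self.K), weights=weights)[0]

    def update(self, docs, iterations=20):
        """Online step: sample topic assignments for the NEW docs only;
        counts from earlier epochs are kept and simply grown."""
        state = []
        for doc in docs:
            doc_counts, z = [0] * self.K, []
            for w in doc:
                self.vocab.add(w)
                k = self.rng.randrange(self.K)
                z.append(k)
                doc_counts[k] += 1
                self.topic_word[w][k] += 1
                self.topic_totals[k] += 1
            state.append((doc, doc_counts, z))
        for _ in range(iterations):
            for doc, doc_counts, z in state:
                for i, w in enumerate(doc):
                    # remove the token's old assignment, then resample
                    k_old = z[i]
                    doc_counts[k_old] -= 1
                    self.topic_word[w][k_old] -= 1
                    self.topic_totals[k_old] -= 1
                    k_new = self._sample_topic(w, doc_counts)
                    z[i] = k_new
                    doc_counts[k_new] += 1
                    self.topic_word[w][k_new] += 1
                    self.topic_totals[k_new] += 1

model = ToyOnlineLDA(num_topics=2)
model.update([["fee", "charge", "fee"], ["loan", "rate", "loan"]])  # epoch 1
model.update([["charge", "fee", "overdraft"]])  # epoch 2: old docs untouched
print(sum(model.topic_totals))  # tokens seen across both epochs -> 9
```

Of course, freezing old assignments like this is only a heuristic; the papers below do something principled, but the shape of the computation -- grow the sufficient statistics as data arrives instead of refitting from scratch -- is the same idea.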

I'm not very familiar with online inference for DTM, but after a quick 
search I found this paper, which may be what you're looking for: 
http://dl.acm.org/citation.cfm?id=1835889. It introduces a topic model 
similar to DTM (i.e., one that models topic evolution) and provides an 
online inference algorithm for it.

You may also want to have a look at this paper: 
http://dl.acm.org/citation.cfm?id=2883046. It describes a more efficient 
inference technique for DTM -- although it seems to be batch inference.



On 24/02/2017 at 20:31, Nathan.A.Susanj at wellsfargo.com wrote:
> Hello,
> I am new to this list. I work in the Enterprise Analytics group at 
> Wells Fargo, focusing mostly on text data. In one of my recent 
> projects, my team was asked to see if we could detect emerging risks 
> that arise from customer complaint narratives, and I have been 
> exploring using a dynamic topic model on the data as a way of seeing 
> how complaint topics evolve over time at the bank. So far this has 
> proved to be a very interesting application of the DTM.
> Recently, however, I have been trying to work out how I might 
> sequentially add more data to my existing dynamic topic model over 
> time without updating the entire model. Does anyone know whether it 
> is possible to “add on” a new time slice to an existing dynamic model 
> without relearning the weights (betas, alphas, and individual LDA 
> model parameters) of the earlier time slices? This would be 
> beneficial in our situation, because we have new complaint data 
> coming in all the time, and it would be nice, for model consistency's 
> sake, if I could see how the new complaints cause the topics in my 
> model to evolve without building a new model on the full dataset.
> Thanks, I appreciate any feedback or ideas (links to papers, etc.).
> Nathan Susanj
> Analytic Consultant
> Wells Fargo
> Enterprise Data & Analytics (EDA)
> nathan.a.susanj at wellsfargo.com
> _______________________________________________
> Topic-models mailing list
> Topic-models at lists.cs.princeton.edu
> https://lists.cs.princeton.edu/mailman/listinfo/topic-models
