[Topic-models] Training Classifier with multi-labeled data (Daniel Ramage)

David Blei david.blei at gmail.com
Fri Mar 26 10:48:57 EDT 2010


hi hong,

thanks for pointing us to your paper.  it was very interesting.

there is something that confused me.  i understood the perspective of
your model as a sophisticated slda with a binary vector response.
however, it wasn't clear to me how the components of the z variable
are interpretable as the "class" for each word or paragraph.  while
you set the dimension of z to be the same as the number of classes,
your predictive model (equation 1) is a multivariate logistic
regression from z-bar to y.  i don't see what ties the c-th component
of z to the c-th class in y.  is z observed in your data?  (from the
graphical model, it doesn't seem to be.)
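
A minimal sketch of the concern above, not the paper's implementation. It assumes that z-bar is the vector of empirical topic frequencies in a document (as in sLDA) and that "multivariate logistic regression" means one independent sigmoid per class with a dense weight matrix W. The exact form of equation 1 is not reproduced here, and the names z_bar, predict_labels, W, and b are hypothetical. Under those assumptions, relabeling the topics and permuting the columns of W the same way leaves every prediction unchanged, so the likelihood alone does not pin topic c to class c.

```python
import numpy as np

def z_bar(z, num_topics):
    """Empirical topic frequencies: fraction of units assigned to each topic."""
    counts = np.bincount(z, minlength=num_topics)
    return counts / counts.sum()

def predict_labels(zb, W, b):
    """Assumed response model: p(y_c = 1 | z-bar) = sigmoid(W[c] . z-bar + b[c])."""
    return 1.0 / (1.0 + np.exp(-(W @ zb + b)))

num_classes = 3                       # dimension of z set equal to the number of classes
z = np.array([0, 0, 2, 0, 2, 1])      # hypothetical per-word (or per-paragraph) assignments
zb = z_bar(z, num_classes)

rng = np.random.default_rng(0)
W = rng.normal(size=(num_classes, num_classes))  # dense: any topic can drive any class
b = np.zeros(num_classes)

# Relabel the topics with an arbitrary permutation and permute W's columns to match.
perm = np.array([2, 0, 1])
zb_perm = zb[perm]
W_perm = W[:, perm]

p = predict_labels(zb, W, b)
p_perm = predict_labels(zb_perm, W_perm, b)

# Identical predictions: the topic indices carry no intrinsic class meaning.
assert np.allclose(p, p_perm)
```

In this sketch, tying the c-th topic to the c-th class would need something extra, such as restricting W to be diagonal or observing z directly.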

if there's no conceptual block to it, i think it would be interesting
to explore the effect of the number of topics on your predictive
performance.

i'd also be interested to hear about the future work that you mention,
where you model label sparsity with lasso-style regularization.

thanks again for sending the paper.

best
dave

On Thu, Mar 4, 2010 at 3:39 PM, YANG,Shuang Hong <eeshyang at gmail.com> wrote:
> Hi All:
> A probably naive idea to tailor LDA for ambiguous data analysis such as
> multi-label classification is to use topics directly as class labels, say, k
> = #classes,  theta = the class mixture, z = the per-word class assignment.
> We explored this idea
> in http://www.cc.gatech.edu/~syang46/papers/NIPS09.pdf
> where, similar to SLDA, the model is augmented with a side variable Y = the
> label observation.
> This model is merely a different interpretation of SLDA, or alternatively
> could be viewed as a Bayesian treatment of the multinomial-event-model-based
> naive Bayes classifier, yet it beats SVM on text classification (both normal
> text and short text such as web search queries) in our experiments -- the
> reported results use z as the per-paragraph class assignment for normal texts,
> but we found that using z as the per-word class assignment gives similar
> performance.
> Any comments on this are greatly appreciated.
> Shang
>
> On Thu, Mar 4, 2010 at 12:00 PM,
> <topic-models-request at lists.cs.princeton.edu> wrote:
>>
>> From: Daniel Ramage <dramage at cs.stanford.edu>
>> To: Liu Bin <korolevbin at gmail.com>
>> Date: Wed, 03 Mar 2010 09:46:56 -0800
>> Subject: Re: [Topic-models] Training Classifier with multi-labeled data
>> Hi Bin,
>>
>> One option is to use Labeled LDA,
>> http://www.aclweb.org/anthology/D/D09/D09-1026.pdf which constrains each
>> document's topic distribution to align with the document's label space.
>> Because the per-document topics in this model are actually observed, it's
>> less of a latent and more of a blatant dirichlet allocation.  It's
>> competitive with an SVM baseline in our experiments, but state of the art
>> discriminative models still beat it.
>>
>> dan
>>