[Topic-models] Exception when I am using Stanford Topic Modeling Toolbox v.0.4 for doing Labeled LDA

Zhang Wen zhangwen8277 at gmail.com
Sat Jul 14 15:40:08 EDT 2012


Hi,

Recently I got an exception while using the Stanford TMT (version 0.4)
to run Labeled LDA. The stack trace of the exception is:

java.lang.ArrayIndexOutOfBoundsException: -1
         at scalanlp.stage.text.TermCounts$class.getDF(TermFilters.scala:64)
         at scalanlp.stage.text.TermCounts$$anon$2.getDF(TermFilters.scala:84)
         at scalanlp.stage.text.TermMinimumDocumentCountFilter$$anonfun$apply$4$$anonfun$apply$5$$anonfun$apply$6.apply(TermFilters.scala:172)
         at scalanlp.stage.text.TermMinimumDocumentCountFilter$$anonfun$apply$4$$anonfun$apply$5$$anonfun$apply$6.apply(TermFilters.scala:172)
         at scala.collection.TraversableLike$$anonfun$filter$1.apply(TraversableLike.scala:213)
         at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
         at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:33)
         at scala.collection.TraversableLike$class.filter(TraversableLike.scala:212)
         at scala.collection.mutable.WrappedArray.filter(WrappedArray.scala:33)
         at scalanlp.stage.text.TermMinimumDocumentCountFilter$$anonfun$apply$4$$anonfun$apply$5.apply(TermFilters.scala:172)
         at scalanlp.stage.text.TermMinimumDocumentCountFilter$$anonfun$apply$4$$anonfun$apply$5.apply(TermFilters.scala:172)
         at scalanlp.stage.Item.map(Item.scala:32)
         at scalanlp.stage.text.TermMinimumDocumentCountFilter$$anonfun$apply$4.apply(TermFilters.scala:172)
         at scalanlp.stage.text.TermMinimumDocumentCountFilter$$anonfun$apply$4.apply(TermFilters.scala:172)
         at scala.collection.Iterator$$anon$19.next(Iterator.scala:335)
         at edu.stanford.nlp.tmt.data.Dataset$QueuedIterator.enqueue(Dataset.scala:80)
         at edu.stanford.nlp.tmt.data.Dataset$$anon$1$$anon$2.prepare(Dataset.scala:150)
         at edu.stanford.nlp.tmt.data.Dataset$$anon$1$$anon$2.<init>(Dataset.scala:131)
         at edu.stanford.nlp.tmt.data.Dataset$$anon$1.iterator(Dataset.scala:123)
         at edu.stanford.nlp.tmt.model.llda.LabeledLDADataset$class.iterator(LabeledLDADataset.scala:51)
         at edu.stanford.nlp.tmt.model.llda.LabeledLDADataset$$anon$1.iterator(LabeledLDADataset.scala:66)
         at edu.stanford.nlp.tmt.learn.ThreadedModeler.addData(ThreadedModeler.scala:87)
         at edu.stanford.nlp.tmt.learn.Modeler$class.train(Modeler.scala:108)
         at edu.stanford.nlp.tmt.learn.ThreadedModeler.train(ThreadedModeler.scala:35)
         at edu.stanford.nlp.tmt.stage.package$.TrainCVB0LabeledLDA(package.scala:83)
         at Main$$anon$1.<init>(cross-quest-10-lda-learn.scala:52)
         at Main$.main(cross-quest-10-lda-learn.scala:1)
         at Main.main(cross-quest-10-lda-learn.scala)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
         at java.lang.reflect.Method.invoke(Unknown Source)
         at scala.tools.nsc.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:78)
         at scala.tools.nsc.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:24)
         at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:88)
         at scala.tools.nsc.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:78)
         at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:101)
         at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:33)
         at scala.tools.nsc.ObjectRunner$.runAndCatch(ObjectRunner.scala:40)
         at scala.tools.nsc.ScriptRunner.scala$tools$nsc$ScriptRunner$$runCompiled(ScriptRunner.scala:171)
         at scala.tools.nsc.ScriptRunner$$anonfun$runScript$1.apply(ScriptRunner.scala:188)
         at scala.tools.nsc.ScriptRunner$$anonfun$runScript$1.apply(ScriptRunner.scala:188)
         at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply$mcZ$sp(ScriptRunner.scala:157)
         at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply(ScriptRunner.scala:131)
         at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply(ScriptRunner.scala:131)
         at scala.tools.nsc.util.package$.waitingForThreads(package.scala:26)
         at scala.tools.nsc.ScriptRunner.withCompiledScript(ScriptRunner.scala:130)
         at scala.tools.nsc.ScriptRunner.runScript(ScriptRunner.scala:188)
         at scala.tools.nsc.ScriptRunner.runScriptAndCatch(ScriptRunner.scala:201)
         at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:58)
         at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:80)
         at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:89)
         at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
         at java.lang.reflect.Method.invoke(Unknown Source)
         at edu.stanford.nlp.tmt.TMTMain$.main(TMTMain.scala:57)
         at edu.stanford.nlp.tmt.TMTMain.main(TMTMain.scala)

The main steps I took were:

 1. I obtained the top terms by running LDA on a document.
 2. I created a CSV file with three columns. The first column is
    the sequence number, the second column contains the top terms,
    separated by spaces, and the third column is the document.
 3. I modified the example
    (http://nlp.stanford.edu/software/tmt/tmt-0.4/examples/example-6-llda-learn.scala)
    slightly to fit my CSV file; in practice I only changed the
    column indexes.
 4. I ran Labeled LDA on the generated CSV file.
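
For illustration only, the three-column layout from step 2 could be produced and sanity-checked with a short Python sketch like the following. The function names `write_labeled_csv` and `find_bad_rows` and the sample rows are hypothetical and are not taken from the original script. The check simply flags rows that do not have three non-empty columns; whether such rows have anything to do with the exception above is not established here.

```python
import csv
import io


def write_labeled_csv(docs, out):
    """Write one row per document: sequence number, labels, text."""
    writer = csv.writer(out)
    for seq, (top_terms, text) in enumerate(docs, start=1):
        # Labels (the top terms) go into one cell, separated by spaces.
        writer.writerow([seq, " ".join(top_terms), text])


def find_bad_rows(source):
    """Return 1-based row numbers that lack three non-empty columns."""
    bad = []
    for row_number, row in enumerate(csv.reader(source), start=1):
        if len(row) != 3 or any(not cell.strip() for cell in row):
            bad.append(row_number)
    return bad


# Hypothetical sample data, only to illustrate the layout.
docs = [
    (["topic", "model"], "A short document about topic models."),
    (["label", "lda"], "Another document, with a comma in it."),
]
buffer = io.StringIO()
write_labeled_csv(docs, buffer)
buffer.seek(0)
bad_rows = find_bad_rows(buffer)
```

Using the `csv` module rather than joining strings by hand means that document text containing commas or quotes is quoted properly, so each row still parses as three columns.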


Can anyone please help me with this issue? Thanks!

Best regards,
Zhang Wen




