[Topic-models] NaN (not a number) problem in Supervised topic model

Jon McAuliffe jon at mcauliffe.com
Fri May 16 19:38:18 EDT 2008


log(a+b) = log a + log(1 + exp(log b - log a))

is exact, not approximate. it just might not fix the underflow problem.
(but if it doesn't, log b and log a have vastly different exponents, and
you have problems bigger than getting log(a+b).)

j


On May 16, 2008, at 3:37 PM, Loulou AlSumait wrote:

> Hello Kevin and List,
>
> I work on a MATLAB version of LDA and had the same problem when
> computing the log likelihood. You didn't mention how you compute
> it, but, from my experience, working in log space didn't prevent
> the numerical instability, and the NaN problem remained. The
> solution that worked (provided thankfully to us by David Blei) was
> to use the approximation of log(a+b) from log(a) and log(b) instead
> of computing the exact log of sums.
>
> I hope this information is relevant and useful.
>
> Best,
> Loulou
>
>
>
> On Fri, May 16, 2008 at 5:42 PM, Ben Wing <benwing at mail.utexas.edu>  
> wrote:
> NaN typically occurs when you do 0/0 or -INF + INF or something like
> this.  if you do some operation (e.g. division by zero or log of zero)
> that results in +INF or -INF, you can easily get NaN when you do
> further work with this.  a NaN log-likelihood might result from some
> earlier operation where you did log(0) due to underflow or something.
>
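> there is probably a way in MATLAB to tell it to throw an error
> whenever you generate INF or NaN.  you can do this in C (with glibc)
> like this:
>
> #define _GNU_SOURCE  /* feenableexcept is a GNU extension */
> #include <fenv.h>
> feenableexcept(FE_INVALID | FE_DIVBYZERO);  /* trap NaN and INF */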
>
> ben
>
> On Fri, May 16, 2008 at 11:13 AM, David Blei <blei at cs.princeton.edu>  
> wrote:
> > hi kevin,
> >
> > typically, asserts don't indicate problems.  they are used to find and
> > diagnose them.  if the likelihood ever is NaN then i want the program to
> > stop running because there is something seriously wrong.
> >
> > i'm not sure what your problem might be, but the likelihood should not be
> > NaN.
> >
> > best,
> > dave
> >
> >
> > On May 15, 2008, at 11:44 PM, kevin chen wrote:
> >>
> >> Dear all,
> >>
> >> I am implementing the "Supervised topic model" in MATLAB, following
> >> the paper by Blei and McAuliffe.
> >> I have run into a problem: after about 30 iterations of EM (corpus level),
> >> the per-document log-likelihood becomes NaN (not a number).
> >> In my code, the vbem (document level) runs for 200 iterations without
> >> a convergence check. Is the NaN problem due to too many vbem iterations?
> >>
> >> I have checked my code several times and couldn't find out how to fix
> >> it.
> >>
> >> Interestingly, there is also a NaN check in Blei's LDA code:
> >> assert(!isnan(likelihood));
> >> So is it a general problem in topic-model programs?
> >>
> >> Thank you in advance.
> >>
> >> Best,
> >> Kevin
> >>
> >> _______________________________________________
> >> Topic-models mailing list
> >> Topic-models at lists.cs.princeton.edu
> >> https://lists.cs.princeton.edu/mailman/listinfo/topic-models
> >>
>
>
>
> -- 
> Life is the mirror of your actions and the echo of your sayings!
> If you want God to love you, love others.
> If you want others to respect you, respect others.
> If you want God to have mercy on you, have mercy on others.
> If you want God to conceal your faults, conceal the faults of others.
