# [Topic-models] Does word-ordering matter in Gibbs sampling?

dan danwalkeriv at gmail.com
Thu Jul 27 18:56:10 EDT 2017

You can sample the topic assignments in any order that you want. It makes
the code slightly harder to write, but any order, including random order,
will work.

> Hi Dan,
> I would like to know how a random-scan Gibbs sampler can be used in LDA
> inference.
>> In theory it shouldn't matter: a Gibbs sampler with infinite time and
>> machine precision would eventually mix and converge in distribution, and
>> you would sample from every region of the support in proportion to its
>> probability mass. In practice, I think you are right that it would be
>> possible for the data ordering to cause you to quickly enter a local
>> maximum that would be difficult (or impossible, given finite time and
>> machine precision) to ever exit from. One approach to mitigating this
>> problem would be to do a random sweep over the variables that you are
>> sampling. Another might be to use deterministic annealing. Charles Elkan
>> has some great descriptions about how deterministic annealing works in the
>> context of EM for mixture models
>> (http://cseweb.ucsd.edu/~elkan/250Bwinter2011/mixturemodels.pdf).
>> I tried applying the same concepts
>> to a Gibbs sampler in my dissertation work and achieved some really
>> promising results
>> (http://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=4529&context=etd).
>> The advantage of DA
>> would be that it helps avoid all kinds of maxima, not just those caused by
>> scan order.
>> I also did a quick search and came across these relevant publications:
>> Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on
>> How Much (https://arxiv.org/pdf/1606.03432.pdf)
>> Implementing Random Scan Gibbs Samplers
>> (https://link.springer.com/article/10.1007/BF02736129)
>>
>>> Hi everyone,
>>>
>>> My apologies if this is an uninformed question, but in Gibbs sampling
>>> for LDA inference, aren’t the various counts of word-topic assignments
>>> updated word-by-word? Doesn’t this make it somewhat dependent on word
>>> ordering? For example, if word_1 is strongly associated with topic_1 and
>>> word_2 is strongly associated with topic_2, if I see a document {word_1,
>>> word_1, … (100 times), word_2, word_2, … (100 times), word_2}, then by the
>>> time I start seeing word_2, wouldn’t the algorithm be more inclined to
>>> think that it should be assigned to topic_1, compared to a scenario where I
>>> see the document {word_1, word_2, word_1, word_2, …}?
>>>
>>> Thank you,
>>> Eric