experimental-lda icon indicating copy to clipboard operation
experimental-lda copied to clipboard

Threading unsafe updater?

Open philipy1219 opened this issue 9 years ago • 1 comments

It seems that updater function in parallelLDA is threading unsafe. The task of updating words-topic distribution is split by word label, which makes the same words not be operated in the same time. But the updating function also contains topic array updating. Could it make the same topic operated by different thread in the same time?

The updating queue is pushed in model::sampling with the following code:

cbuff[nst*(w%ntt)+i].push(delta(w,old_topic, topic));

And then the updating function:

virtual int updater(int i)                  // updating sufficient statistics, can be outsourced to children
{
    do
    {
        for (int tn = 0; tn<nst; ++tn)
        {
            if (!(cbuff[i*nst + tn].empty()))
            {
                delta temp = cbuff[i*nst + tn].front();
                cbuff[i*nst + tn].pop();
                n_wk[temp.word][temp.old_topic] -= 1;
                n_wk[temp.word][temp.new_topic] += 1;
                n_k[temp.old_topic] -= 1;
                n_k[temp.new_topic] += 1;
                //n_wk[temp.word][temp.old_topic].fetch_add(-1);
                //n_wk[temp.word][temp.new_topic].fetch_add(+1);
                //n_k[temp.old_topic].fetch_add(-1);
                //n_k[temp.new_topic].fetch_add(+1);
            }
        }
    } while (!done[i]);

    return 0;
}       

Both word and topic related array are operated in the same thread.

philipy1219 avatar Aug 14 '15 07:08 philipy1219

Thanks for pointing this out. I had in mind that the topic array is atomic, i.e. std::atomic<unsigned> * n_k;. But it seems I commented out that one and kept the thread unsafe version. Fixed now.

manzilzaheer avatar Mar 26 '17 17:03 manzilzaheer