experimental-lda
experimental-lda copied to clipboard
Threading unsafe updater?
It seems that updater function in parallelLDA is threading unsafe. The task of updating words-topic distribution is split by word label, which makes the same words not be operated in the same time. But the updating function also contains topic array updating. Could it make the same topic operated by different thread in the same time?
The updating queue is pushed in model::sampling with the following code:
cbuff[nst*(w%ntt)+i].push(delta(w,old_topic, topic));
And then the updating function:
virtual int updater(int i) // updating sufficient statistics, can be outsourced to children
{
do
{
for (int tn = 0; tn<nst; ++tn)
{
if (!(cbuff[i*nst + tn].empty()))
{
delta temp = cbuff[i*nst + tn].front();
cbuff[i*nst + tn].pop();
n_wk[temp.word][temp.old_topic] -= 1;
n_wk[temp.word][temp.new_topic] += 1;
n_k[temp.old_topic] -= 1;
n_k[temp.new_topic] += 1;
//n_wk[temp.word][temp.old_topic].fetch_add(-1);
//n_wk[temp.word][temp.new_topic].fetch_add(+1);
//n_k[temp.old_topic].fetch_add(-1);
//n_k[temp.new_topic].fetch_add(+1);
}
}
} while (!done[i]);
return 0;
}
Both word and topic related array are operated in the same thread.
Thanks for pointing this out. I had in mind that the topic array is atomic, i.e. std::atomic<unsigned> * n_k;
. But it seems I commented out that one and kept the thread unsafe version. Fixed now.