apriltag icon indicating copy to clipboard operation
apriltag copied to clipboard

race condition in workpool

Open pmsoftware78 opened this issue 5 years ago • 2 comments

Hi All, I suggest to wait until all threads completed startup before returning in workerpool_create function, adding

        pthread_mutex_lock(&wp->mutex);

        while (wp->end_count < wp->nthreads) {
            pthread_cond_wait(&wp->endcond, &wp->mutex);
        }
        
        pthread_mutex_unlock(&wp->mutex);

and in workerpool_run function, wp->end_count editing should be protected (ok... this problem should not happen), inserting it after mutex lock:

        pthread_mutex_lock(&wp->mutex);
        wp->end_count = 0; 
        pthread_cond_broadcast(&wp->startcond);

Best Paolo

pmsoftware78 avatar Jan 21 '20 13:01 pmsoftware78

Hi Paolo, I'm not sure I see exactly what the problem is with the current code. What race condition are you seeing?

mkrogius avatar Dec 17 '20 22:12 mkrogius

Hi Max, urgh... so time ago! I try to remember all the details.... during workerpool_create all threads are created using pthread_create but not all threads at the end of this function can be started and/or not all threads are blocked on pthread_cond_wait, so this part of the code in worker_thread

pthread_mutex_lock(&wp->mutex);
        while (wp->taskspos == zarray_size(wp->tasks)) {
            wp->end_count++;  //< this 

and

void workerpool_run(workerpool_t *wp)
{
    if (wp->nthreads > 1) {
        wp->end_count = 0; //< this

could be execute in wrong order and wp->end_count will have wrong value.

I try to understand what you want to do and my idea is to wait that all threads be in a wait state

  pthread_mutex_lock(&wp->mutex);

        while (wp->end_count < wp->nthreads) {
            pthread_cond_wait(&wp->endcond, &wp->mutex);
        }
        
        pthread_mutex_unlock(&wp->mutex);

at the end of create and for extra security lock endcound access

// runs all added tasks, waits for them to complete.
void workerpool_run(workerpool_t *wp)
{
    if (wp->nthreads > 1) {

        pthread_mutex_lock(&wp->mutex);
        wp->end_count = 0; // FIX
        pthread_cond_broadcast(&wp->startcond);

        while (wp->end_count < wp->nthreads) {
//            printf("caught %d\n", wp->end_count);
            pthread_cond_wait(&wp->endcond, &wp->mutex);
        }

        pthread_mutex_unlock(&wp->mutex);

        wp->taskspos = 0;

        zarray_clear(wp->tasks);

    } else {
        workerpool_run_single(wp);
    }
}

I hope I have been clear enough.

Best Paolo

pmsoftware78 avatar Dec 18 '20 10:12 pmsoftware78