libag icon indicating copy to clipboard operation
libag copied to clipboard

Parallel thread-safe implementation of ag_search()

Open Theldus opened this issue 3 years ago • 0 comments

Description

In commit 7807763 a thread-safe version of the ag_search() routine was added, ag_search_ts(). It turns out that this version doesn't benefit from parallelism of worker threads, on the contrary: it serializes all parallel calls via mutex. It is indeed a simple solution, but it can have some performance penalties in parallel calls to ag_search_ts().

The current recommendation is to avoid calling the thread-safe version if the total number of paths to be analyzed is already known, something like:

#pragma omp parallel for
for (int i = 0; i < npaths; i++)
	result[i] = ag_search_ts(query, 1, paths[i], nresult[i]);

is much slower than something like:

result = ag_search(query, npaths, paths, nresult);

which can really benefit from worker threads. If that's not possible, use ag_search_ts() ;-).

Suggested implementation

As already explained in issue #2, ag(1) uses various global variables throughout the code, to assemble the tables, handle the current execution status, flags and so on. All of this makes simultaneous execution of ag_search() not possible.

A discussion of an approach in this regard follows below:

  • The ag_search() routine needs to create, allocate and configure data that will be used exclusively during the search, however the worker threads do not know in advance the data they will work on.

So one way to share this is through the work_queue_t structure (declared in search.h) which looks like this:

struct work_queue_t {
    char *path;
    struct work_queue_t *next;
};

the above structure can be 'tuned' to add new data, used individually during the search, without colliding with possible other searches, such as:

struct work_queue_t {
    struct search_data {
        char *path;

        /* Per search data. */
        char *query;
        int query_len;
        ... etc ...

        /* Sync. */
        pthread_mutex_t mtx_search;
        pthread_cond_t cond_search;
        size_t npaths;
        size_t pcount;
    } *search;
    struct work_queue_t *next;
};

Fields like mtx_search and cond_search would be needed by the worker thread to signal the ag_search thread when the work is complete.

To help with this, the pcount variable would be automatically incremented by each worker thread, to signal the progress of the work (which would be incremented even in case of failure). When finishing the work (pcount == npaths), the thread performing the last increment signals the ag_search (via cond_signal), which then wakes up and finishes processing the results.

The hard part of all this is decoupling everything that ag expects to be global and assigning it to each search, which perhaps requires some refactoring of ag.

Theldus avatar Jun 12 '21 02:06 Theldus