sdsl-lite icon indicating copy to clipboard operation
sdsl-lite copied to clipboard

thread safety

Open ekg opened this issue 10 years ago • 22 comments
trafficstars

I'm learning through trial and error that not all classes in sdsl-lite are thread safe (for instance). Is there documentation about what features of sdsl-lite are thread safe? I have an application that relies on read-only thread-safe succinct data structures.

ekg avatar Aug 31 '15 07:08 ekg

csa_sada is (as far as I remember) the only class which is not thread-safe in the read-only scenario, since it stores some decoded integers in a cache. We will definitely change this in the very near future.

simongog avatar Aug 31 '15 08:08 simongog

For a quick fix, you would have to replace the buffer m_psi_buf by a local variable in the rank_bwt function: https://github.com/simongog/sdsl-lite/blob/master/include/sdsl/csa_sada.hpp#L330

simongog avatar Aug 31 '15 08:08 simongog

I'm having thread-safety problems that superficially (in gdb) appear to be related to a wt_int<>. If I run a select operation on the wavelet tree in multiple threads I get:

sdsl-lite/install/include/sdsl/select_support_mcl.hpp:355: sdsl::select_support::size_type sdsl::select_support_mcl<t_bit_pattern, t_pattern_len>::select(sdsl::select_support::size_type) const [with unsigned char t_b = 0u; unsigned char t_pat_len = 1u; sdsl::select_support::size_type = long unsigned int]: Assertion `i > 0 and i <= m_arg_cnt' failed.

It doesn't seem like this could this be related to csa_sada, so I'm confused. It could be something in my application, but I haven't yet found anything and wanted to confirm it's not a limitation of sdsl-lite.

ekg avatar Aug 31 '15 08:08 ekg

I can avoid this error by placing a mutex lock on the code that calls select on the wt.

ekg avatar Aug 31 '15 08:08 ekg

Sorry, you are right; select on wt_int and wm_int are also not thread safe (a grep mutable include/sdsl/*.hpp ensured that). Here the two variables m_path_off and m_path_rank_off should be made local to the select procedure. Thanks for pointing this out!

simongog avatar Aug 31 '15 08:08 simongog

Wow, thanks! This had me stumped. I'll make a patch unless you beat me to it.

ekg avatar Aug 31 '15 08:08 ekg

Am I right in assuming that putting things local to the select procedure would be a problem for larger alphabets? I'm wondering if it's worth working around this by establishing a select support per thread.

ekg avatar Aug 31 '15 08:08 ekg

No, the size of the vector is log(alphabet size). So it should be not a problem ;) A really efficient solution is to implement a version of select which gets this additional memory from the thread.

simongog avatar Aug 31 '15 08:08 simongog

Cool, I've got a patch--- testing now.

ekg avatar Aug 31 '15 08:08 ekg

I'm also implementing the change in wm_int.hpp and wt_pc.hpp.

ekg avatar Aug 31 '15 10:08 ekg

Whoops, I'll leave wt_pc.hpp alone, that one's configurable.

ekg avatar Aug 31 '15 10:08 ekg

great

simongog avatar Aug 31 '15 10:08 simongog

I've run back into thread safety problems, presumably related to caching in select. Is there a way to configure sdsl-lite to not use caching?

ekg avatar Jul 10 '16 17:07 ekg

what select structure?

mpetri avatar Jul 11 '16 01:07 mpetri

Hi Erik,

could you please open a pull request with your changes to wm_int and wt_int? Thanks!

Simon

ps: We have a thread-safe redesign of the other non-thread safe part (csa_sada) in another branch, which will be merged in a few weeks. @mpetri : @ekg is referring to the select on pointerless wavelet trees

simongog avatar Jul 11 '16 06:07 simongog

@simongog have you pulled in the thread safe branch of csa_sada?

I'm re-making the appropriate changes for wm_int and wt_int.

ekg avatar Sep 02 '16 12:09 ekg

The referenced pull request includes roughly the same change that I had made before. Are there other aspects of the library which are still not thread safe?

ekg avatar Sep 02 '16 14:09 ekg

@ekg, thanks for the pull request. The branch which makes csa_sada thread safe is not yet merged. We did a major redesign of csa_sada and will it merge in the next weeks. (The redesigned version lives in the hyb_sd_vector_slow branch) Do you need csa_sada in your project?

simongog avatar Sep 02 '16 14:09 simongog

@simongog thanks for merging the pull request so quickly!

This fix lets me remove the lock on the one non-threadsafe part of the API that I'm targeting.

I don't need csa_sada, but I totally support making sdsl-lite thread safe for non-destructive operations on its data structures.

ekg avatar Sep 02 '16 14:09 ekg

Can you guys provide an update on the status of csa_sada's thread safety? I've been playing with it lately and the data structure seems to be locked when multiple threads are accessing it.

Parsoa avatar Mar 31 '20 09:03 Parsoa

Uh oh, I thought this had been resolved.

ekg avatar Mar 31 '20 10:03 ekg

Did more investigation and turns out something else was the source of problem. My bad. Please delete this if necessary. Thanks.

Parsoa avatar Apr 01 '20 07:04 Parsoa