
`rfifind` and OpenMP

Open wfarah opened this issue 2 years ago • 4 comments

Hi @scottransom

It seems to me that `rfifind` does not use any parallelism. I ran the command with and without `-ncpus 6` and the runtime did not change, even though I get the message "Using 6 threads with OpenMP" when I pass `-ncpus 6`.

In the source code, I see this: https://github.com/scottransom/presto/blob/master/src/rfifind.c#L121

But I don't see OpenMP calls or `#pragma` directives anywhere else.

Am I missing something?

Thanks!

wfarah, Oct 12 '22 21:10

Hey Wael,

You are completely correct (unfortunately).

I added the command line options and included the appropriate headers for OpenMP many years ago in anticipation of a big parallel push. But my initial experiments with speed-ups were terrible. I think it is because I need to restructure how the code collects and merges the "birdies" after running search_fft. The current method is hugely serial and a major anti-parallel bottleneck. In general, I would think we should be able to get rfifind to parallelize quite nicely.

Unfortunately I haven't had the time to prioritize this work, though. So if you or a student wanted to tackle it, I think it would make a really nice computational project. A related addition I've been thinking about is adding generalized spectral kurtosis estimates...
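As a rough illustration of the restructuring mentioned above (each search collects its own "birdies", and the lists are merged once at the end instead of through a shared serial step), here is a minimal Python sketch. It is not PRESTO code: `find_birdies`, `collect_birdies`, the threshold test, and the toy data are all hypothetical stand-ins, and Python threads are used only to show the structure, not to demonstrate a real speed-up. In C the same shape would be per-thread lists inside an OpenMP parallel loop.

```python
from concurrent.futures import ThreadPoolExecutor


def find_birdies(block_index, powers, threshold):
    """Hypothetical per-block search: return (block, bin, power) for each hit."""
    return [(block_index, i, p) for i, p in enumerate(powers) if p > threshold]


def collect_birdies(blocks, threshold, ncpus=4):
    """Search every block independently, then merge the per-block lists once."""
    with ThreadPoolExecutor(max_workers=ncpus) as pool:
        # Each task builds its own private list, so the search needs no locking.
        per_block = pool.map(
            find_birdies,
            range(len(blocks)),
            blocks,
            [threshold] * len(blocks),
        )
        # One serial merge after the independent searches.
        merged = [hit for hits in per_block for hit in hits]
    # Sort by bin, then block, so the result does not depend on scheduling.
    merged.sort(key=lambda hit: (hit[1], hit[0]))
    return merged


# Toy power values for three blocks (made-up numbers).
blocks = [[0.1, 9.0, 0.3], [0.2, 8.5, 7.0], [0.1, 0.2, 0.3]]
birdies = collect_birdies(blocks, threshold=5.0)
print(birdies)
```

The point of the design is that the only serial work left is the final merge and sort, whose cost scales with the number of hits rather than with the amount of data searched.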

Let me know if you have any thoughts as I'd love to move forward on this.

Scott

scottransom, Oct 12 '22 23:10

Hi Scott,

Thanks for the response!

It would be great to have rfifind parallelized. I probably won't have time for it now, but I may be able to find someone to take a look. I will let you know.

In the meantime, the rfifind runtime is quite long. I'm running it on an 8-bit file (64 µs / 0.5 MHz resolution, 30 min, 1344 channels) with the arguments:

`rfifind -time 30.0 -timesig 10.0 -freqsig 4.0 -chanfrac 0.5 -intfrac 0.3`

Is there any command line argument I can add that removes some of the processing to make it run faster? I am searching for FRBs, so probably won't need any FFT-based masking.

-- Wael

wfarah, Oct 13 '22 01:10

What I usually do in situations like this is only worry about the bad channels and forget about the time-variable stuff. To do that, I would use just a small chunk of the data (maybe 5-10%?) and run rfifind on it. Then generate an ignorechan list from the rfifind results and pass that to the various prep* commands. If your data is in one long single file, you will have to chop it somehow, but I suspect you can handle that.

The ignorechan stuff is mentioned in the tutorial and also here: https://github.com/scottransom/presto/blob/master/FAQ.md#what-is-the-difference-between-using--ignorechan-and-explicitly-including-channels-that-you-want-to-zap-in-an-rfifind-mask-using--zapchan-is-one-preferred-over-the-other
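For the last step of that workflow, a small helper like the following could turn a list of bad channel indices into a compact string to pass to `-ignorechan`. This is only a sketch: `format_ignorechan` is a hypothetical name, the channel list is made up, and the comma-separated `min:max` range format is an assumption that should be checked against the FAQ entry linked above or the help output of the `prep*` commands.

```python
def format_ignorechan(bad_channels):
    """Collapse channel indices into a compact string such as '0:3,7,10:11'.

    Assumes a comma-separated list with inclusive min:max ranges.
    """
    chans = sorted(set(bad_channels))
    if not chans:
        return ""
    parts = []
    start = prev = chans[0]
    for chan in chans[1:]:
        if chan == prev + 1:
            # Still inside a run of consecutive channels.
            prev = chan
            continue
        # The run ended: emit a single channel or an inclusive range.
        parts.append(str(start) if start == prev else f"{start}:{prev}")
        start = prev = chan
    parts.append(str(start) if start == prev else f"{start}:{prev}")
    return ",".join(parts)


# Made-up example: channels flagged as bad in a short rfifind run.
ignore = format_ignorechan([0, 1, 2, 3, 7, 10, 11])
print(ignore)
```

Duplicates and unsorted input are handled by the `sorted(set(...))` step, so the list can be assembled from several short rfifind runs before formatting.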

scottransom, Oct 13 '22 14:10

Just for reference, I should have put a link in for the FAQ entry about the lack of multi-CPU speed-up, in case others come here looking for answers: https://github.com/scottransom/presto/blob/master/FAQ.md#many-of-these-routines-are-really-slow-and-they-dont-seem-to-get-faster-using-the--ncpus-option--why-is-that

scottransom, Oct 13 '22 14:10