Eureka
improve the speed of the column-by-column background subtraction
Leonardo dos Santos to start work on this
I will take a look at this and see how to optimize it.
So, I did a brief test here, and I saw that a for-loop to fit ~1500 polynomials should take only a fraction of a second, so that is not the bottleneck. I think the bottleneck happens because we have a while-loop nested inside the for-loop, and a bunch of if-/try-statements to check for bad pixels. Python is sluggish with nested loops.
What I suggest is, instead of using this nested while-loop, we could simply assign weights of zero to the bad pixels when fitting the polynomials (it's the parameter w when calling numpy.polyfit). This should do the trick. I will work on implementing this.
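For illustration, a minimal sketch of the zero-weight idea, assuming a 1D column and a boolean bad-pixel mask (the helper name below is hypothetical, not an actual Eureka function):

```python
import numpy as np

# Hypothetical helper, not part of Eureka
def fit_column_background(col_data, bad_pixel_mask, deg=3):
    """Fit a polynomial along one detector column, ignoring flagged pixels."""
    y = np.arange(col_data.size)
    # Zero weight removes a pixel's influence on the least-squares fit
    weights = np.where(bad_pixel_mask, 0.0, 1.0)
    # Replace flagged values with something finite so NaNs don't propagate
    col = np.where(bad_pixel_mask, 0.0, col_data)
    coeffs = np.polyfit(y, col, deg, w=weights)
    return np.polyval(coeffs, y)
```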
This sounds great! Instead of editing the existing routine, I suggest copying the code to a new routine and giving users the option to use either one. This will also allow us to compare the effectiveness of each method.
So, I have a bit of unsatisfying news: I realized that the while-loop is necessary for sigma-clipping, so it won't be trivial to remove the nested looping. I tried using Astropy's fitting modules with their implementation of sigma-clipping, and that was actually slower than the current implementation by a factor of 2-3.
I'm afraid the code may already be at its limit for optimization, unless this background removal is implemented in a more optimized language, like C or Fortran. Or there could be a clever way of fitting all integrations at the same time, perhaps by scaling and concatenating all of the arrays together... But that would require "open-heart surgery" on the code, particularly in how util.BGsubtraction works.
I can push the Astropy implementation if you want, by the way. But it does not solve the issue 😬
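For reference, a rough sketch of what the Astropy-based attempt might look like, using astropy.modeling's FittingWithOutlierRemoval (the function name and array layout are illustrative, not the actual Eureka code; in recent Astropy versions the call returns the fitted model and an outlier mask):

```python
import numpy as np
from astropy.modeling import models, fitting
from astropy.stats import sigma_clip

# Illustrative sketch, not the actual Eureka implementation
def fit_column_astropy(col_data, deg=3, sigma=3.0, niter=3):
    """Sigma-clipped polynomial fit of one column using Astropy's fitting framework."""
    y = np.arange(col_data.size)
    fitter = fitting.LinearLSQFitter()
    clipped_fitter = fitting.FittingWithOutlierRemoval(fitter, sigma_clip,
                                                       niter=niter, sigma=sigma)
    # Recent Astropy versions return (fitted_model, outlier_mask)
    fitted_model, mask = clipped_fitter(models.Polynomial1D(degree=deg), y, col_data)
    return fitted_model(y)
```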
Thanks for looking into this, Leo, and too bad it's proving to be a bit tricky! One naive question: have you looked at using np.apply_along_axis()? I think this function will apply a function of your choice along a whole axis (e.g. all the columns), and that it's coded up efficiently so that it's a lot faster than a normal for loop. That might be able to remove at least one of the loops.
If that doesn't work here, I'm happy to give some pointers on how to implement a C extension.
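A quick sketch of how np.apply_along_axis() could be used for the per-column fits (purely illustrative names; this leaves out bad-pixel handling and sigma-clipping):

```python
import numpy as np

# Illustrative sketch only
def subtract_background(image, deg=3):
    """Fit and subtract a polynomial background from every column of a 2D image."""
    rows = np.arange(image.shape[0])

    def fit_one_column(col):
        coeffs = np.polyfit(rows, col, deg)
        return np.polyval(coeffs, rows)

    # Applies fit_one_column to each 1D column slice (axis=0) of the image
    background = np.apply_along_axis(fit_one_column, 0, image)
    return image - background
```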
Oh, I didn't know about numpy.apply_along_axis(). I will give it a try and see what happens.
Honestly, I haven't personally found the time spent on background subtraction to be prohibitive at this point. Do we still want to look into improving the runtime, or can we close this issue?
It's the slowest step within S3, and I have thoughts on how to speed it up for quick-and-dirty analyses, but it's also low on the priority list until we get real data.
One thing I don't think we've tried for this is using numba's jit decorator to allow for just-in-time compiled functions. I can't remember what all we use inside the relevant functions, but it's entirely possible that we could see some significant speed improvements by just adding @jit before some key lower-level functions.
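A minimal sketch of that idea, assuming a simple sigma-clipping helper (the function below is a generic stand-in, not an actual Eureka routine):

```python
import numpy as np
from numba import jit

# Generic stand-in for a lower-level per-column routine
@jit(nopython=True)
def clip_and_mean(col, sigma=3.0, niter=5):
    """Iteratively sigma-clip a column and return the mean of the surviving pixels."""
    good = np.ones(col.size, dtype=np.bool_)
    for _ in range(niter):
        vals = col[good]
        mu = vals.mean()
        sd = vals.std()
        good = np.abs(col - mu) <= sigma * sd
    return col[good].mean()
```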
Good idea! Let's also add it to our C2 proposal.