time DFT vs. photon-shooting branch

Open rmandelb opened this issue 12 years ago • 10 comments

We would like to explicitly compare the timing for photon-shooting vs. DFT branches for various types of profiles. This will be useful for the challenge, but could also be used to give advice for general users.

rmandelb avatar Jun 27 '12 13:06 rmandelb

@barnabytprowe @rmjarvis: Revisiting this issue after many months, it occurs to me that demo7 gives us what we need to accumulate some statistics. The question is, do you think we need to try anything more complicated than that (e.g., different object sizes, shears, etc.)? If not, then perhaps we should just get some numbers for a few systems to make sure the trends are consistent, and then I volunteer to put a note on this into the documentation. One possibility would be to produce more detailed output: not just "time for all with a Gaussian PSF", which includes 4 realizations each for 5 different galaxy types, but rather a table giving typical times per galaxy for each combination of galaxy + PSF. The table would have columns for galaxy type, PSF type, typical time for DFT per galaxy, and typical time for photon-shooting per galaxy; or alternatively galaxy type, PSF type, and ratio of photon-shooting to DFT time. Of course, these vary by machine, but if we find consistent trends then we can at least say that.

Here are the numbers from my Mac (on branch 291 since that will very soon be merged into master):

Some timing statistics:
   Total time for setup steps = 0.040342
   Total time for regular fft drawing = 1.493045
   Total time for photon shooting = 6.441243
   Total time for adding noise = 0.113009

Breakdown by PSF type:
   Gaussian: Total time = 0.820329  (fft: 0.093543, phot: 0.696462)
   Moffat: Total time = 0.814391  (fft: 0.114034, phot: 0.665897)
   Double Gaussian: Total time = 0.749240  (fft: 0.098077, phot: 0.622378)
   OpticalPSF: Total time = 4.025285  (fft: 0.912957, phot: 3.082304)
   Kolmogorov * Airy: Total time = 1.678394  (fft: 0.274434, phot: 1.374202)

Breakdown by Galaxy type:
   Gaussian: Total time = 1.987578  (fft: 0.140212, phot: 1.812024)
   Exponential: Total time = 1.401381  (fft: 0.190116, phot: 1.181870)
   Devaucouleurs: Total time = 1.851686  (fft: 0.606009, phot: 1.216476)
   n=2.5 Sersic: Total time = 1.319074  (fft: 0.241372, phot: 1.047524)
   Bulge + Disk: Total time = 1.527920  (fft: 0.315336, phot: 1.183348)

And on the linux cluster:

Some timing statistics:
   Total time for setup steps = 0.040171
   Total time for regular fft drawing = 1.976626
   Total time for photon shooting = 6.869781
   Total time for adding noise = 0.145156

Breakdown by PSF type:
   Gaussian: Total time = 0.888854  (fft: 0.099969, phot: 0.753674)
   Moffat: Total time = 0.975501  (fft: 0.159562, phot: 0.771619)
   Double Gaussian: Total time = 0.936739  (fft: 0.133697, phot: 0.764892)
   OpticalPSF: Total time = 4.582617  (fft: 1.275994, phot: 3.271224)
   Kolmogorov * Airy: Total time = 1.648024  (fft: 0.307405, phot: 1.308373)

Breakdown by Galaxy type:
   Gaussian: Total time = 2.090442  (fft: 0.186935, phot: 1.861567)
   Exponential: Total time = 1.558956  (fft: 0.257984, phot: 1.264165)
   Devaucouleurs: Total time = 2.251794  (fft: 0.779652, phot: 1.435129)
   n=2.5 Sersic: Total time = 1.529226  (fft: 0.347546, phot: 1.146174)
   Bulge + Disk: Total time = 1.601317  (fft: 0.404509, phot: 1.162747)
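
As a rough illustration of the "ratio of photon-shooting to DFT time" table suggested above, the per-PSF numbers from the two listings can be turned into ratios with a few lines of Python. This is only a sketch: the dictionary layout and the name `phot_to_fft_ratio` are made up for illustration, and the values are copied from the two "Breakdown by PSF type" listings.

```python
# (fft seconds, photon-shooting seconds), copied from the
# "Breakdown by PSF type" listings for the Mac and the linux cluster.
MAC = {
    "Gaussian": (0.093543, 0.696462),
    "Moffat": (0.114034, 0.665897),
    "Double Gaussian": (0.098077, 0.622378),
    "OpticalPSF": (0.912957, 3.082304),
    "Kolmogorov * Airy": (0.274434, 1.374202),
}
LINUX = {
    "Gaussian": (0.099969, 0.753674),
    "Moffat": (0.159562, 0.771619),
    "Double Gaussian": (0.133697, 0.764892),
    "OpticalPSF": (1.275994, 3.271224),
    "Kolmogorov * Airy": (0.307405, 1.308373),
}


def phot_to_fft_ratio(timings):
    """Return {PSF name: photon-shooting time / fft time}."""
    return {name: phot / fft for name, (fft, phot) in timings.items()}


mac_ratios = phot_to_fft_ratio(MAC)
linux_ratios = phot_to_fft_ratio(LINUX)

# Print a small side-by-side table so trends across machines are easy to compare.
print(f"{'PSF type':<20}{'Mac':>8}{'Linux':>8}")
for name in MAC:
    print(f"{name:<20}{mac_ratios[name]:>8.2f}{linux_ratios[name]:>8.2f}")
```

For these particular runs the ratio is above 1 for every PSF type on both machines, and it is smallest for OpticalPSF on both.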

rmandelb avatar Oct 12 '12 15:10 rmandelb

I think the number that is most relevant to have documented for people to reference is: At what S/N does photon shooting become faster than fft. (Or vice versa depending on how you look at it.) At very low S/N, all galaxy types will be faster to drawShoot than to draw. At high S/N it's the opposite. If we could tabulate the cross-over point for a number of different profiles, perhaps we could draw some rules of thumb to help guide people when they are making their scripts when to switch over from one draw method to the other.

rmjarvis avatar Oct 12 '12 17:10 rmjarvis

Okay, fair enough. This would be easy to test with a fairly simple script, e.g., a modification of demo7 that uses a few different S/N values. Perhaps one of us can make this (I'm happy to, but also swamped with various things so I'm slow) and the others can check it out so we see how system-dependent the conclusions are.

rmandelb avatar Oct 12 '12 17:10 rmandelb

The other independent variables that could affect the relative timing are the pixel size (relative to some size value of the profile) and the image size. Probably should start with the Nyquist pixel size and the image size that is automatically chosen for the profile, but then include some larger pixel scale values and both smaller and larger images to see whether and how much that affects the relative timings.

rmjarvis avatar Oct 12 '12 18:10 rmjarvis

I have a student working with me this summer, an engineer who's interested in doing some technical software work. I was thinking this project might be a good thing to have her start on--requires using GalSim, but not touching the underlying code yet. I wanted to check with those of you who may have thought about it already (@rmjarvis? @rmandelb?) to make sure I wasn't stepping on any toes & if you had any thoughts on updates this project description might need given the evolution of GalSim in the last almost-5 (!) years.

msimet avatar Jun 13 '17 00:06 msimet

What I would like from this issue is some kind of heuristic that could tell us for a given profile at what flux it will be faster to switch from photon shooting over to ffts.

My idea for this is to add another method to drawImage called maybe phot_auto that would work like phot in terms of the noise profiles (i.e. the end result would include shot noise) but would internally use ffts when the flux was high enough and just manually add Poisson noise at the end.

This would be especially useful for things like LSST DESC. Currently they (@jchiang87 @cwwalter) are using photon shooting for their imsim simulations, but the bright stars in particular take a long time. It would be nice if they could automatically switch over to using FFTs instead of photon shooting and have that all be seamless.

The main thing I need to get this to work is a function that all GSObjects would implement that would return the cross-over flux where photon shooting and fft drawing take about the same amount of time. I think anything that was within a factor of 2 of the correct value would be fine. So liberal use of heuristics is acceptable.
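
To make the cross-over idea concrete, here is a minimal sketch under two simplifying assumptions that are not established in this thread: the fft drawing time does not depend on flux, and the photon-shooting time grows linearly with flux (one photon per unit flux). The function names and the example numbers are hypothetical, not part of GalSim.

```python
def crossover_flux(t_fft_total, t_phot_per_photon):
    """Flux at which photon shooting takes as long as fft drawing.

    Assumes t_phot = t_phot_per_photon * flux and a flux-independent t_fft,
    so the two times are equal at flux = t_fft_total / t_phot_per_photon.
    """
    if t_fft_total < 0 or t_phot_per_photon <= 0:
        raise ValueError("timings must be non-negative (per-photon cost positive)")
    return t_fft_total / t_phot_per_photon


def within_factor_of_two(estimate, truth):
    """Check the accuracy target mentioned above: within a factor of 2."""
    return 0.5 <= estimate / truth <= 2.0


# Made-up numbers: 20 ms for the fft draw, 0.1 microseconds per photon.
flux_star = crossover_flux(0.02, 1e-7)
print(f"cross-over flux ~ {flux_star:.3g} photons")
```

Below the returned flux, photon shooting would be expected to win under these assumptions; above it, fft drawing would.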

rmjarvis avatar Jun 13 '17 04:06 rmjarvis

Okay, that sounds like a plan to me. So we'll start on working out the heuristic, with that level of accuracy in mind, and then as time allows start implementing functions to return those values. Thanks!

msimet avatar Jun 16 '17 00:06 msimet

Just an update on this project--we did not finish last summer, but my student is coming back again this summer, so we will plan to have a function implemented by the end of the SURF program in August.

msimet avatar May 21 '18 18:05 msimet

Great. Just a couple thoughts to possibly guide your investigations.

The FFT timing will probably (at least most of the time) be proportional to the size of the K-space image. You can check that directly with obj.drawFFT_makeKImage. Currently it returns a built k-space image along with a target real-space size (N). But to help enable this issue, we can break that up so just the bounds are returned rather than the image, so we can check how many pixels it would be without doing the memory allocation.

The photon timing of course is going to be proportional to the number of photons that need to be shot. This is normally the flux, but not necessarily. Some objects shoot some negative flux photons in addition to positive flux ones, so they need more to get the overall flux right. Anyway, there is a method obj._calculate_nphotons, which returns the number of photons that will be shot in a particular case.

So I think what would probably be sufficient for most classes is to figure out a proportionality constant for each case: t_fft / pixel and t_phot / photon. For some, this could be a stored constant, but some things like Convolve would need to do a quick calculation based on the components.

Then the code that would use this would call those two helper functions to get n_pixels and n_photons and multiply by these scaling factors to figure out which method will be faster.
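
The decision logic described here could look roughly like the sketch below. Everything in it is hypothetical: `CostModel`, `combine_for_convolution`, and `faster_method` are illustrative names, n_pixels and n_photons are passed in as plain numbers rather than obtained from GalSim, and the rule that a convolution's costs are the sums of its components' costs is an assumption that would need to be checked by measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Per-profile scaling factors (hypothetical; to be measured)."""
    t_fft_per_pixel: float    # seconds per k-space pixel
    t_phot_per_photon: float  # seconds per photon shot


def combine_for_convolution(components):
    """ASSUMPTION: a convolution costs the sum of its components' costs."""
    return CostModel(
        t_fft_per_pixel=sum(c.t_fft_per_pixel for c in components),
        t_phot_per_photon=sum(c.t_phot_per_photon for c in components),
    )


def faster_method(model, n_pixels, n_photons):
    """Return 'fft' or 'phot', whichever the linear model predicts is faster."""
    t_fft = model.t_fft_per_pixel * n_pixels
    t_phot = model.t_phot_per_photon * n_photons
    return "fft" if t_fft < t_phot else "phot"


# Made-up constants for a galaxy and a PSF, combined as a convolution.
galaxy = CostModel(t_fft_per_pixel=1e-7, t_phot_per_photon=1e-6)
psf = CostModel(t_fft_per_pixel=2e-7, t_phot_per_photon=3e-6)
model = combine_for_convolution([galaxy, psf])

for n_photons in (1_000, 1_000_000):
    print(n_photons, faster_method(model, n_pixels=256 * 256, n_photons=n_photons))
```

Keeping the constants in a small immutable object makes it easy to store one per class and to compute a derived one for compound objects.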

Hopefully this all makes sense. Please do ping here with questions if any arise over the summer.

rmjarvis avatar May 21 '18 19:05 rmjarvis

After some discussion offline with @rmjarvis @jmeyers314 @msimet, I am going to have a student work on this issue over the summer. My plan (bearing in mind that the student involved is new to GalSim) is to build this up in a few steps:

  1. For a single fixed galaxy and PSF profile (e.g. a Sersic profile of a specific n, size, shape, convolved with a Kolmogorov of a particular FWHM and shape), write the code needed to find the quantities @rmjarvis suggested in a comment in this thread (t_fft/pixel and t_phot/photon).

  2. Decide on a list of parameters to vary for galaxy and PSF profiles, so that we can execute the code from (1) in an informative set of scenarios. Also decide how to sample the space (grid or something more sophisticated?).
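
One possible shape for the measurement code in step (1) is a small timing harness that runs a task at several problem sizes and fits a straight line, taking the slope as the per-unit cost (seconds per pixel or per photon). This is only a sketch with hypothetical helper names; the workload below is a stand-in, and in a real measurement the task would be the GalSim draw call for the fixed profile.

```python
import time


def best_time(func, repeats=3):
    """Run func several times and return the shortest wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cost_per_unit(make_task, sizes, repeats=3):
    """Time make_task(n)() for each n and return (slope, intercept)."""
    times = [best_time(make_task(n), repeats) for n in sizes]
    return fit_line(sizes, times)


# Stand-in workload whose cost grows with n (NOT a GalSim call).
def make_task(n):
    return lambda: sum(range(n))


slope, intercept = cost_per_unit(make_task, [10_000, 20_000, 40_000])
print(f"estimated cost per unit: {slope:.3e} s, fixed overhead: {intercept:.3e} s")
```

Fitting a slope rather than dividing a single time by a single size separates any fixed setup overhead from the per-pixel or per-photon cost.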

I haven't yet considered interactions with any more complex parts of the code, e.g., ray-tracing atmospheric PSFs, silicon sensor models, chromatic effects, etc. Those could be stretch goals.

rmandelb avatar May 31 '19 03:05 rmandelb