
Avoid hitting the CPU memory limit on problems that need a large number of tracks

Open delaossa opened this issue 1 year ago • 2 comments

To resolve the coherent radiation of, e.g., a beam passing through an undulator, one typically needs a large number of particles. However, the maximum number of tracks that I could load into Synchrad in a previous example was limited by the available CPU memory, because all the tracks need to be stored on the CPU before they are passed to Synchrad. For on the order of 1M particles, each described by 6 coordinates (doubles) with about 10k points per track, the required allocation (~500 GB) easily exceeds what a typical node provides. Dumping the tracks to a file and passing the file to Synchrad was suggested as a way around this limitation, but in practice this solution turned out to be rather slow https://github.com/hightower8083/synchrad/issues/30#issuecomment-1981639634.
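For reference, a back-of-the-envelope version of that memory estimate, using the numbers quoted above:

n_tracks = 1_000_000        # ~1M particles / tracks
n_points = 10_000           # points per track
n_coords = 6                # coordinates per point
bytes_per_double = 8
total_gb = n_tracks * n_points * n_coords * bytes_per_double / 1e9
print(f"{total_gb:.0f} GB")  # -> 480 GB, i.e. ~500 GB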

For cases in which the tracks can be calculated by a function starting from the initial values of the beam, however, there is no need to allocate all the tracks prior to running Synchrad, only their initial values. In this branch -> https://github.com/delaossa/synchrad/tree/track_func I have tried this approach by letting calculate_spectrum accept the function that calculates the track as an optional argument. When this function, track_func, is passed together with particleTracks (which in this case contains only the initial coordinates of the particles), Synchrad calculates each track on the fly before computing its radiation, and deletes the track afterwards. This reduces the prior memory allocation on the CPU by a factor Nt, with Nt the number of points per track, and consequently allows for a factor Nt larger number of particles. A minimal usage sketch is given below.
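To make the calling convention concrete, here is a minimal sketch (not the actual notebook code): track_func and the particleTracks argument come from the branch above, while the toy tracker body, the track layout (six arrays) and the calc_input dictionary are placeholders assumed to follow the standard undulator example.

import numpy as np
from synchrad.calc import SynchRad   # assuming the track_func branch is installed

def track_func(p0):
    # Toy stand-in for a real tracker: expand the 6 initial coordinates into
    # arrays of n_steps points each. A real tracker would instead integrate
    # the equations of motion through the undulator field.
    n_steps = 10_000
    x0, y0, z0, ux0, uy0, uz0 = p0
    z = z0 + np.linspace(0.0, 1.0, n_steps)
    x, y = np.full(n_steps, x0), np.full(n_steps, y0)
    ux, uy, uz = (np.full(n_steps, v) for v in (ux0, uy0, uz0))
    return [x, y, z, ux, uy, uz]   # assumed track layout: one array per coordinate

# particleTracks now holds only the initial coordinates of each particle
particleTracks = [np.zeros(6) for _ in range(1_000_000)]

calc = SynchRad(calc_input)   # calc_input: the usual configuration dictionary, set up as in the example notebook
calc.calculate_spectrum(particleTracks, track_func=track_func)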

I have added a new version of the Undulator_Example.ipynb to the branch to show how this feature can be used: https://github.com/delaossa/synchrad/blob/track_func/tutorials/Undulator_Example_with_track_function.ipynb

delaossa avatar Dec 06 '24 12:12 delaossa

Hi Alberto @delaossa,

Thanks for this interesting suggestion -- it looks great and I like the idea of adding a tracker option! I will need a bit of time to review and test it, but from a quick glance at the code -- can you check that here https://github.com/delaossa/synchrad/blob/7443205a013e6dac9b9560db235982da9a648396/synchrad/calc.py#L267-L270 it does not replace the element of particleTracks with the full track data, and does not keep that data in RAM by the end of the calculation? I guess it shouldn't, but one can never be too sure with these Python references to numpy data..

Also, I think it would be better to first finalize #28, as it is not yet fully consistent. For that I will need some help with testing it on the cluster -- do you still have access to the multi-node machine where #30 was originally spotted?

hightower8083 avatar Dec 17 '24 09:12 hightower8083

Hi Igor, I am glad that you like the idea. I can only say that it has been very useful for running examples with on the order of a million particles and beyond.

I have checked explicitly that the following piece of code rebinds the local name track (from a list of floats holding the initial values to a list of numpy arrays holding the full track), but leaves the elements of particleTracks untouched.

for itr in calc_iterator:
    track = particleTracks[itr]        # initial coordinates of one particle
    if track_func is not None:
        track = track_func(track)      # rebinds the local name to the full track
    ...                                # radiation computation for this track
    del track                          # drop the full track before the next iteration

After the track is processed, it is deleted to finish the loop iteration. The corresponding particleTracks element remains in RAM, but that is fine since it only contains the initial values of the track.
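For what it's worth, the name-rebinding behaviour in question can be checked in isolation with a few lines of plain Python, independent of Synchrad:

import numpy as np

particleTracks = [[0.0] * 6]      # element holding only the initial values
track = particleTracks[0]         # local name bound to that same list
track = np.arange(6.0)            # rebinding: the name now points to a new array
del track                         # removes the local reference only

print(particleTracks[0])          # -> [0.0, 0.0, 0.0, 0.0, 0.0, 0.0], element untouched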

This and other tests could be done using the example provided here: https://github.com/delaossa/synchrad/blob/track_func/tutorials/Undulator_Example_with_track_function.ipynb

And yes, let me know if I can help test PR #28 on the cluster.

delaossa avatar Dec 17 '24 10:12 delaossa