Background correction ignores RAM usage limit

Open seib2 opened this issue 3 years ago • 2 comments

Hi,

I'm facing an issue with THELI using all available RAM during background correction even when "Minimize memory usage" is enabled in the settings. See the screenshot below:

Screenshot from 2021-05-20 10-37-54

The limit is set at ~5GB but THELI is using >12GB (out of my total 16GB). The top process on the right shows that the usage is indeed due to THELI. This always results in a crash when the RAM runs out unless the process is aborted.

I am running the latest version, 3.1.0, with full sudo rights, and I see no errors or warnings. The same happens if I do not tick the "Minimize memory usage" box. I have checked that all settings are saved between runs, and they are.

Until this is fixed I will have to find a machine with more RAM :D

seib2, May 20 '21 08:05

Hi,

A few comments: the "max usable memory" option is a very soft limit that the OS can often override; it will be removed in a future version. You are correct to check "Minimize memory usage", which should normally work. The only reason I can think of is that your background correction strategy requires all images to be loaded into RAM, for all CPUs, and that your data set is really large. You can try running this step with a single CPU and see if that works.
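As a rough illustration of why a strategy that keeps every frame resident can exceed 16 GB, here is a back-of-the-envelope estimate. The image count, chip dimensions, and 4-byte pixel size are hypothetical values chosen for the example, not numbers taken from this data set or from THELI's internals.

```python
def stack_memory_gib(n_images, nx, ny, bytes_per_pixel=4, n_chips_resident=1):
    """Approximate RAM (in GiB) needed to hold n_images frames per chip.

    All arguments are illustrative assumptions; real usage also includes
    masks, models, and other working buffers.
    """
    total_bytes = n_images * nx * ny * bytes_per_pixel * n_chips_resident
    return total_bytes / 2**30


# Hypothetical case: 100 exposures of a 2048 x 4608 pixel chip, 32-bit pixels.
one_chip = stack_memory_gib(n_images=100, nx=2048, ny=4608)

# Same exposures, but with the data for four chips held in memory together.
four_chips = stack_memory_gib(n_images=100, nx=2048, ny=4608, n_chips_resident=4)

print(f"one chip resident:   {one_chip:.2f} GiB")
print(f"four chips resident: {four_chips:.2f} GiB")
```

Under these assumed numbers a single chip fits comfortably, while four chips held at once do not fit on a 16 GB machine.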

However, in most cases, optical cameras don't need sophisticated background modelling. It is sufficient to subtract the sky from the individual images just before coaddition, rather than computing a running or global median.
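To illustrate the difference in memory footprint, here is a minimal sketch of per-image sky subtraction. It is not THELI's implementation: it assumes the sky can be approximated by a single median value per image, and the function names and the list-of-rows image representation are made up for the example. The point is only that each frame is handled on its own, so nothing requires the whole stack to be in RAM.

```python
from statistics import median


def subtract_constant_sky(image):
    """Subtract the median pixel value (a simple sky estimate) from one image.

    `image` is a list of rows of floats. Returns (corrected_image, sky).
    """
    sky = median(pixel for row in image for pixel in row)
    corrected = [[pixel - sky for pixel in row] for row in image]
    return corrected, sky


def process_one_by_one(images):
    """Yield sky-subtracted images lazily, one frame at a time."""
    for image in images:
        corrected, _sky = subtract_constant_sky(image)
        yield corrected


# Toy frame: sky level around 100 with one bright "source" pixel.
frame = [[100.0, 101.0, 99.0],
         [100.0, 500.0, 100.0]]
corrected, sky = subtract_constant_sky(frame)
```

A running or global median, by contrast, needs pixel values from many exposures at the same time, which is what drives the memory demand.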

You can contact me with more details at schirmer(at)mpia.de, and we can sort it out over email.

mischa

schirmermischa, May 21 '21 07:05

The same problem occurs when running with 1 CPU. There seems to be an issue with multi-chip background modelling, in this case for the 4 chips of LBC. The background correction starts with the first chip, then retains the first chip's data in memory while processing the second, and so on. This is probably not necessary, since the background-subtracted files for the completed chips are already saved by that point. When the process is restarted after crashing, the already background-subtracted files (PAB) are not recognised as such, most likely because they are mixed with unprocessed files.
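The behaviour I would expect can be sketched as follows. This is not THELI code: the file naming (`PA` input status, `PAB` output status before the extension), the function names, and the callback interface are assumptions made for illustration.

```python
from pathlib import Path


def pending_inputs(filenames, in_status="PA", out_status="PAB"):
    """Return input files whose background-subtracted counterpart is missing.

    Assumes (hypothetically) that the processing status is the last part of
    the file stem, e.g. 'exp_1PA.fits' becomes 'exp_1PAB.fits'.
    """
    names = set(filenames)
    pending = []
    for name in sorted(names):
        path = Path(name)
        if not path.stem.endswith(in_status):
            continue  # not an input file (for example, already an output)
        output = path.stem[: -len(in_status)] + out_status + path.suffix
        if output not in names:
            pending.append(name)
    return pending


def correct_chips_sequentially(chips, load_chip, correct, save):
    """Process one chip at a time and drop its data before loading the next."""
    for chip in chips:
        frames = load_chip(chip)
        save(chip, correct(frames))
        del frames  # release the reference so the memory can be reclaimed


# A directory where processed and unprocessed files are mixed after a crash.
files = ["a_1PA.fits", "a_1PAB.fits", "b_1PA.fits"]
todo = pending_inputs(files)
```

With a check like this, a restart after a crash would only pick up the exposures that still lack an output file.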

However, you are right that the background subtraction is not a crucial step for large datasets. We can discuss the details over email.

seib2, May 21 '21 11:05