
Add auto option for chunksize based on nlines, ngridpoints, and available memory

Open · mohyware opened this pull request 9 months ago · 11 comments

Description

This PR introduces an 'auto' option for chunksize, allowing dynamic adjustment based on:

  • Number of spectral lines (nlines)
  • Grid points (ngridpoints)
  • Available system memory

This follows the version implemented in radis-benchmarks.

Fixes #776
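
For context, a minimal sketch of the kind of heuristic such an 'auto' option could use (the function name, the 20% safety factor, and the float64 element size are illustrative assumptions, not the PR's actual code): size chunks so that the per-chunk broadening arrays fit within a fraction of the memory psutil reports as available.

import psutil

def auto_chunksize(nlines, ngridpoints, dtype_bytes=8, safety=0.2):
    # Spend at most `safety` of the memory available right now
    budget = safety * psutil.virtual_memory().available
    full_size = nlines * ngridpoints * dtype_bytes  # full lines-x-grid array, in bytes
    if full_size <= budget:
        return None  # everything fits in one pass, no chunking needed
    # Otherwise cap the number of array elements handled per chunk
    return max(1, int(budget / dtype_bytes))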

mohyware · Mar 23 '25

Hello, I think the chunksize option is not available when using the LDM algorithm. Since the latter is the default algorithm, I don't think this helps most RADIS usage. Also, when running RADIS on Python <= 3.10, we use vaex, which virtually removes any limit on memory usage. (There are ongoing plans to make vaex compatible up to Python 3.12, see https://github.com/radis/radis/pull/698.)
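
For reference, a minimal sketch of how one would test chunksize outside LDM, assuming the legacy algorithm is selected with optimization=None (the exact kwarg should be checked against the RADIS docs; the default 'min-RMS' uses LDM):

from radis import SpectrumFactory

sf = SpectrumFactory(
    2150,
    2450,
    molecule="CH4",
    isotope="1",
    wstep=0.002,
    optimization=None,  # assumption: disables LDM so that chunksize applies
    chunksize=1e7,
)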

Please provide:

  • [ ] an example demonstrating the reduction of memory usage on Python 3.11 or later
  • [ ] an example demonstrating the reduction of memory usage on Python 3.10 or before (with vaex)

minouHub · Mar 24 '25

I saw it here as a TODO in the performance section and thought it could be beneficial. But honestly, even if it's safer, the process feels slower with it: I see a progress bar, which doesn't appear when just using the default chunksize of 100,000. I'm not sure if the algorithm needs to be modified or if we should wait for full vaex support. If this isn't critical, I can close it.

On Python 3.11 or later:

https://github.com/user-attachments/assets/c57b93df-bff6-4f07-83c8-8e77f929e25c

mohyware · Mar 25 '25

Thanks for the demo. By the way, you can print the computation time with verbose=3. So it seems that even on Python 3.11 this does not improve performance. Is there a reason to even keep the chunksize option?
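
For reference, a minimal snippet showing where that flag goes (the other parameters are just placeholders):

from radis import SpectrumFactory

# verbose=3 prints per-step computation times during the calculation
sf = SpectrumFactory(2150, 2450, molecule="CH4", isotope="1",
                     wstep=0.002, verbose=3)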

minouHub · Mar 27 '25

With auto chunksize (the chunksize value resolved to 30,000): [screenshot]

When chunksize is set to 1e7: [screenshot]

Without the chunksize option: [screenshot]

I think it is there for better memory handling, as mentioned here.

mohyware · Mar 27 '25

Alright. Thanks for the computation. Can you demonstrate that the memory usage drops?

minouHub · Mar 27 '25

Sure, I'll try a larger one like CO2. It will take some time since my internet isn't great.

mohyware · Mar 29 '25

You can try with CH4, which is already a large database, though not as large as CO2.

minouHub · Mar 30 '25

I don't see a reason to keep it, as you mentioned.

I tried CH4 with:

  • No chunksize option → Max resident set size: 578412 kB

  • chunksize = 40000 → Max resident set size: 577912 kB

Both are ≈ 578 MB.
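
For reference, the same "Max resident set size" figure can be read from inside the script with the standard library alone (a minimal sketch; note that ru_maxrss is reported in kB on Linux but in bytes on macOS):

import resource

# Peak resident set size of this process so far
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"Max resident set size: {peak} kB")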

mohyware · Mar 31 '25

Can you publish your test code? Or better, raise an issue showing how you get your max memory allocation.

minouHub · Apr 01 '25

Execute the following code twice, once with the chunksize option and once without, using this command (the "Maximum resident set size" line in its output gives the peak memory):

/usr/bin/time -v python file.py

from radis import SpectrumFactory

sf = SpectrumFactory(
    2150,
    2450,
    molecule="CH4",
    isotope="1",
    wstep=0.002,
    chunksize=1e7,  # chunksize option; remove this line for the second run
)

sf.fetch_databank(
    source="hitemp"
)

T = 1500.0  # K
p = 1.0  # bar
x = 0.8  # mole fraction
l = 0.2  # cm


s_cpu = sf.eq_spectrum(
    name="CPU",
    Tgas=T,
    pressure=p,
    mole_fraction=x,
    path_length=l,
)

mohyware · Apr 05 '25

I did this on my side, with Python 3.10 and vaex.

import psutil
import time
from radis import SpectrumFactory

for chunksize in [5e6, 1e7, 1e8, 1e9]:  # scan increasing chunk sizes
    start_time = time.time()
    sf = SpectrumFactory(
        2150,
        2250, 
        molecule="CH4",
        isotope="1",
        wstep=0.002,
        verbose=0,
        chunksize=chunksize, # chunksize option
    )
    
    sf.fetch_databank(
        source="hitemp", #database="2010",
    ) 
    
    T = 1500.0  # K
    p = 1.0  # bar
    x = 0.8
    l = 0.2  # cm
    
    
    s_cpu = sf.eq_spectrum(
        name="CPU",
        Tgas=T,
        pressure=p,
        mole_fraction=x,
        path_length=l,
    )
    process = psutil.Process()  # rss is sampled once, after the computation
    print(f"-------\nChunk: {chunksize:.0e}.")
    print(f"Memory usage: {process.memory_info().rss / (1024 * 1024)} MB")
    print(f"Time to run: {time.time() - start_time:.2f} seconds")

-------
Chunk: 5e+06.
Memory usage: 504.75390625 MB
Time to run: 10.78 seconds
-------
Chunk: 1e+07.
Memory usage: 506.9453125 MB
Time to run: 5.75 seconds
-------
Chunk: 1e+08.
Memory usage: 537.90234375 MB
Time to run: 1.25 seconds
-------
Chunk: 1e+09.
Memory usage: 568.8046875 MB
Time to run: 0.80 seconds
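
One caveat on the numbers above: process.memory_info().rss is sampled once, after eq_spectrum() returns, so a short-lived peak during broadening would not show up. A minimal background sampler that records the peak instead (illustrative, not part of RADIS):

import threading
import time
import psutil

def sample_peak_rss(stop_event, result, interval=0.05):
    # Poll this process's RSS until told to stop, keeping the maximum
    proc = psutil.Process()
    peak = proc.memory_info().rss
    while not stop_event.is_set():
        peak = max(peak, proc.memory_info().rss)
        time.sleep(interval)
    result["peak_mb"] = peak / (1024 * 1024)

stop, result = threading.Event(), {}
sampler = threading.Thread(target=sample_peak_rss, args=(stop, result))
sampler.start()
# ... run sf.eq_spectrum(...) here ...
stop.set()
sampler.join()
print(f"Peak memory usage: {result['peak_mb']:.1f} MB")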

Seems to me that the gain in memory is not relevant. @erwanp @dcmvdbekerom do you think there is a need for a new example? Or any new idea to use chunksize efficiently?

minouHub · Apr 09 '25