
Scenario LCA with a large number of scenarios leads to a numpy error

ljlazar opened this issue 2 years ago • 4 comments

Using a large number of scenarios in the scenario LCA calculation in AB leads to a numpy memory error even though enough memory is available:

numpy.core._exceptions._ArrayMemoryError: Unable to allocate 25.6 GiB for an array with shape (13013, 263703) and data type object
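As a side note, the 25.6 GiB figure in the traceback follows directly from the array shape: an object-dtype numpy array stores one 8-byte pointer per element on a 64-bit platform (before even counting the Python objects those pointers reference). A quick back-of-the-envelope check:

```python
# Reproduce the allocation size from the traceback:
# object dtype = 8 bytes per element (pointer) on 64-bit systems.
rows, cols = 13013, 263703
bytes_needed = rows * cols * 8
gib = bytes_needed / 2**30
print(f"{gib:.1f} GiB")  # ~25.6 GiB, matching the error message
```

This also hints that switching the table to a numeric dtype such as float64 would not shrink it (same 8 bytes per element); only fewer elements or a sparse representation would.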

The error itself seems to be a Linux-related problem which can be fixed by changing the overcommit mode: https://stackoverflow.com/questions/57507832/unable-to-allocate-array-with-shape-and-data-type

Did anyone else encounter this issue, and is there maybe a permanent fix available in the AB code itself?

ljlazar avatar Jun 10 '22 09:06 ljlazar

If the issue is indeed the one you're linking to on SO, it's a Linux issue and can be fixed with $ echo 1 > /proc/sys/vm/overcommit_memory. I don't think AB should be allowed to mess with superuser commands and change how memory allocation works at the OS level.

The fix I'd suggest (assuming you don't want to use another OS, which would also fix the above issue) is to run the scenarios in batches. In addition to holding the (13013, 263703) array, you'll also be producing many results that need to be kept in memory until everything is done; it would be a waste to lose your results three-quarters of the way in because your RAM can't take it. With many scenarios the calculation time lost will be minimal, though you'll need to be present as the user to save results and load the next batch.

Also, could you tell me more about the array? Is that your scenario array? In that case, what are you even doing with 13000 scenarios?

marc-vdm avatar Jun 10 '22 09:06 marc-vdm

Afaik overcommit would only work if the array is sparse, i.e. most of the values are zeros, like in the SO thread you linked. If that's really the case, then maybe the implementation could be changed in AB to use a sparse array. However, I assume your array won't be sparse, so enabling the overcommit setting will probably just crash your machine or the AB process. How much memory + swap does your machine have? Can you show us the output of free -m on your Linux machine?
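To illustrate the sparse-array point: for a matrix that is mostly zeros, a compressed format like CSR stores only the nonzero values plus their index arrays, while dense storage pays for every element. A small, purely illustrative comparison (scipy assumed available):

```python
import numpy as np
from scipy import sparse

# Build a mostly-zero matrix: ~0.1% of entries nonzero.
rng = np.random.default_rng(0)
dense = np.zeros((2000, 3000))
idx = rng.integers(0, dense.size, size=6000)
dense.flat[idx] = 1.0

csr = sparse.csr_matrix(dense)
dense_mb = dense.nbytes / 1e6
sparse_mb = (csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes) / 1e6
print(f"dense: {dense_mb:.1f} MB, sparse: {sparse_mb:.2f} MB")
```

The dense version costs 48 MB regardless of content; the CSR version costs well under 1 MB here. That factor only materializes if the scenario array really is mostly zeros, which, as noted below, is unlikely for an SDF.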

haasad avatar Jun 10 '22 10:06 haasad

If that's really the case then maybe the implementation could be changed in the AB to use a sparse array.

Right, misread that. The size of the array (not being square) indicates to me that it's just the input array from the SDF. Those files should not be sparse (we don't require users to put unchanged flows in the SDF), and the 263703 lines indicate to me that they don't have a sparse file (e.g. they change everything). IIRC ecoinvent has >>500k lines to change everything.

marc-vdm avatar Jun 10 '22 10:06 marc-vdm

Thanks a lot for your answers! I tried to "misuse" the scenario functionality for uncertainty calculations; that's why it ends up with so many scenarios :). I therefore modify the parameter scenario table outside AB and run the calculations with the scenario LCA function. In the case above I additionally used an SDF, but the modified parameter scenario table by itself already leads to the mentioned error. The Linux machine has around 230 GB of available memory:

              total        used        free      shared  buff/cache   available
Mem:         386612      150054      176711        2998       59847
Swap:         15258         159       15099

ljlazar avatar Jun 10 '22 13:06 ljlazar

Closing as stale

marc-vdm avatar Sep 14 '23 13:09 marc-vdm