WCS GetCoverage for very large rasters runs into memory error in MapServer

Open lubojr opened this issue 2 years ago • 1 comments

For extremely large products from the Urban Atlas 2012 collection, sending a large WCS GetCoverage request (for example "wcs_size":" 16309 31633" with 8 bands of data) probably triggers a value overflow in MapServer, as the renderer logs the following:

MapServer: Dispatching.
msSmallCalloc(): Out of memory allocating -3308483712 bytes

Note the negative byte count.

The raster is definitely too large to be handled by our available memory, but unfortunately no exception is thrown in this case by the OWS call to MapServer https://github.com/EOxServer/eoxserver/blob/b34da4e6094aad8908c43bf39b8bda4337a7af18/eoxserver/contrib/mapserver.py#L134

so the user just gets a 500 error response without any encoded exception text from MapServer.

If I instead decrease the scale of the requested image, the expected exception is thrown once the memory limit of the machine is reached:

MapServer: Dispatching.
MapServer: Dispatch took 1.483050 seconds.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/eoxserver/services/views.py", line 75, in ows
    result = handler.handle(request)
  File "/usr/local/lib/python3.8/dist-packages/eoxserver/services/ows/wcs/basehandlers.py", line 323, in handle
    result_set = renderer.render(params)
  File "/usr/local/lib/python3.8/dist-packages/eoxserver/services/mapserver/wcs/coverage_renderer.py", line 168, in render
    raw_result = ms.dispatch(map_, request)
  File "/usr/local/lib/python3.8/dist-packages/eoxserver/contrib/mapserver.py", line 158, in dispatch
    raise MapServerException(message, locator, code)
eoxserver.contrib.mapserver.MapServerException: msDrawRasterLayerGDAL(): Memory allocation error. Allocating work image of size 9784x18977 failed.

Some investigation:

We use MapServer version 7.4.3 in our latest EOxServer (Ubuntu 20.04 base). I tried locally with an Ubuntu 21.10 base image, which provides the MapServer 7.6.3 release from 4/2021, and the situation was the same, so the issue is not fixed up to the latest stable release. I did not try a custom-built MapServer from current master, though.

I think this error needs to be fixed at the source, in MapServer, but the faulty code first has to be located.
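
Until that happens, a guard on the EOxServer side could at least turn the bare 500 into a descriptive error. The following is only a rough sketch under stated assumptions, not existing EOxServer or MapServer code: `CoverageTooLargeError`, `estimate_output_bytes` and `check_request_size` are hypothetical names, one byte per sample is assumed (which matches 16309 * 31633 * 8), and the default limit of 2**31 - 1 bytes is an arbitrary placeholder rather than a known MapServer constant.

```python
class CoverageTooLargeError(Exception):
    """Hypothetical exception; not part of EOxServer or MapServer."""


def estimate_output_bytes(width, height, bands, bytes_per_sample=1):
    """Uncompressed size of the requested raster in bytes."""
    return width * height * bands * bytes_per_sample


def check_request_size(width, height, bands, bytes_per_sample=1,
                       limit_bytes=2**31 - 1):
    """Raise a descriptive error if the request exceeds limit_bytes.

    limit_bytes is an assumed placeholder, not a documented MapServer limit.
    """
    needed = estimate_output_bytes(width, height, bands, bytes_per_sample)
    if needed > limit_bytes:
        raise CoverageTooLargeError(
            "Requested coverage needs about %d bytes, limit is %d bytes"
            % (needed, limit_bytes)
        )
    return needed


if __name__ == "__main__":
    try:
        check_request_size(16309, 31633, 8)
    except CoverageTooLargeError as exc:
        print(exc)
```

Such a check would have to run before the request is dispatched to MapServer, so that the message can be encoded into a regular OWS exception report.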

mapfile_example.txt

Requests used:

https://sso.ua2012.pass.copernicus.eu/ows?service=WCS&version=2.0.1&request=GetCoverage&coverageId=urn%3Aeop%3AEUSI%3AEW02%3A103005000BF8CA00%3A054449393010_ms&SCALEFACTOR=0.3
returns the expected exception in msDrawRasterLayerGDAL().

https://sso.ua2012.pass.copernicus.eu/ows?service=WCS&version=2.0.1&request=GetCoverage&coverageId=urn%3Aeop%3AEUSI%3AEW02%3A103005000BF8CA00%3A054449393010_ms&SCALEFACTOR=0.4
falls into the memory allocation error somewhere.

Leaving out the SCALEFACTOR parameter yields the same result as SCALEFACTOR=0.4.

lubojr, Jun 09 '22 17:06

@lubojr

Thanks for the issue and analysis!

A quick calculation: 16309 * 31633 * 8 = 4127220776 bytes, i.e. more than 4 GB, so I guess this naturally hits some memory allocation boundary. The negative number is probably due to an integer overflow: the value is interpreted as signed where it should be treated as unsigned, so the printed figure is faulty. I wrote a small C program to be sure:

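#include <stdio.h>

int main(void) {
        /* 16309 * 31633 * 8 bytes; fits in 32 bits only when unsigned */
        unsigned int value = 4127220776u;
        /* same bit pattern read as signed (assumes a 32-bit int) */
        printf("As signed: %i\n", (int)value);
        printf("As unsigned: %u\n", value);
        return 0;
}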

Which results in:

As signed: -167746520
As unsigned: 4127220776

So I guess this is close enough.
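
For a cross-check without a compiler, the same reinterpretation can be sketched in Python. `reinterpret_as_signed` is a helper written only for this illustration. Note that the logged -3308483712 is below the smallest 32-bit signed value, so the exact mechanism inside MapServer presumably differs in detail (for instance a different element size or a wider integer type); this is a guess, and the sketch only demonstrates the signed/unsigned effect.

```python
def reinterpret_as_signed(value, bits=32):
    """Read the low `bits` bits of an integer as a two's-complement number."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    if value & (1 << (bits - 1)):     # sign bit set -> negative number
        return value - (1 << bits)
    return value


size_bytes = 16309 * 31633 * 8
print(size_bytes)                          # 4127220776
print(reinterpret_as_signed(size_bytes))   # -167746520

# The value from the MapServer log does not fit into a signed 32-bit integer:
INT32_MIN = -(1 << 31)
print(-3308483712 < INT32_MIN)             # True
```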

Regarding the error raised in OWSDispatch: this is an issue at the interface between C and Python. Errors are not easily propagated across the language barrier, so only a rather nondescript error message and class are available.

But, in my opinion, the underlying issue is with MapServer itself. It usually deals with map renderings, i.e. something that is more or less guaranteed to fit in memory (PNGs/JPEGs of reasonable size), and thus keeps the whole output in memory without caching to disk. I don't see the issue with the big allocations being fixed in MapServer.

In my opinion, MapServer should not be used for WCS in EOxServer.

constantinius, Jun 09 '22 18:06