geotiff.js
image.readRasters slow, even when fetching a small part of the image
Hi,
When trying to fetch a bbox of 64x64 pixels from my GeoTIFF (about 10k x 12k pixels), it doesn't seem to matter whether I decode the whole image or just that 64x64 subset: both are incredibly slow. Subsequent calls to readRasters are equally slow, too. I could understand the entire image needing to be decoded for an initial 64x64 fetch, but any subsequent calls should then be a lot faster. That's not what I'm experiencing here. This is the code:
const tiff = await GeoTIFF.fromFile('./geotiff/huge_geo.tif');
const image = await tiff.getImage();
const pool = new GeoTIFF.Pool();
// Decode entire image
const data = await image.readRasters({pool: pool});
// Now just a small BBOX of 64x64 pixels
const results = await image.readRasters({
  bbox: [
    117152.96,
    473321.92,
    120593.60,
    476762.56
  ],
  width: 64,
  height: 64,
  pool: pool
});
Using a pool or not doesn't seem to help much either; both take around 1 minute to decode. So I'm pretty sure I'm doing something wrong here. Any ideas?
Hi @MarByteBeep
First, since you are using fromFile, I'm assuming that you are using the library from Node.js. Is that correct?
Second: please provide more information about your file; a gdalinfo output would be perfect. I need to know more about the geospatial extent and internal layout. This may provide clues as to why it is so slow.
With the bbox query, the subset read from the image may be much larger: the width and height only specify the output size to which the read rasters are then resampled. So, in theory, it is possible that the whole image is read and then squeezed into a 64x64 raster, which would explain the very long decoding time while still ending up with a very small result.
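If you want to skip the resampling path entirely, you can compute the pixel window yourself and pass window instead of bbox. A rough sketch, assuming a north-up image where getResolution() returns a negative y-resolution (minX/minY/maxX/maxY are placeholders for your bbox values):

const [originX, originY] = image.getOrigin();
const [resX, resY] = image.getResolution(); // resY is negative for north-up images
// Convert geo coordinates to pixel indices; maxY maps to the top row
// because pixel rows count downwards from the origin.
const wnd = [
  Math.round((minX - originX) / resX), // left
  Math.round((maxY - originY) / resY), // top
  Math.round((maxX - originX) / resX), // right
  Math.round((minY - originY) / resY), // bottom
];
const data = await image.readRasters({ window: wnd, pool: pool });

With window, geotiff.js only has to decode the tiles that intersect those pixel coordinates.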
Hope this helps.
Hi! Thanks a lot for the quick reply. Indeed, I'm using Node.js. I don't have gdalinfo here, but I have this:
{
width: 10574,
height: 12464,
tileWidth: 128,
tileHeight: 128,
samplesPerPixel: 1,
origin: [ 13611.60000000149, 618522.1000000015, 0 ],
resolution: [ 25, -25, 0 ],
bbox: [
13611.60000000149,
306922.1000000015,
277961.6000000015,
618522.1000000015
]
}
AFAIK the requested bbox is just 3,441x3,441 meters (EPSG:28992), i.e. only about 138x138 source pixels at 25 m resolution, whereas the whole image spans 264,350x311,600 meters.
I noticed in QGIS that the image is LZW-compressed. Re-encoding it without compression did not do much for the bottom line. Thanks for your help!
Addendum: and is there any way to simply fetch a bbox without any resampling at all? With a resolution of 25x25 meters, suppose I want to fetch an area of 2500x2500 meters; how do I do that properly so that no resampling occurs?
const results = await image.readRasters({
  bbox: [
    117152,
    473321,
    117152 + 2500,
    473321 + 2500
  ],
  width: 100,
  height: 100,
  pool: pool
});
I tried that, hoping it would speed things up a lot. But alas, it didn't.
@MarByteBeep Okay, the resolution is 25 m, so you are requesting an image of about 100x100 px.
You do not have to pass the output width/height; only pass them if you want to override the defaults. Also, if you are resampling, you might want to consider a specific resampling algorithm. The default (nearest) should be quite fast, though.
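If you do want a specific method, you can pass it explicitly. A sketch, assuming your geotiff.js version supports the resampleMethod option:

const results = await image.readRasters({
  bbox: [117152, 473321, 117152 + 2500, 473321 + 2500],
  width: 100,
  height: 100,
  resampleMethod: 'nearest', // or 'bilinear'
  pool: pool
});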
And I don't think the long waiting time has anything to do with resampling; more likely with the I/O.
What is your performance when using another tool like gdal_translate? (Are you maybe accessing the file via network storage?)
Hi there,
When opening the same image in a tool like QGIS, it opens within a fraction of a second. The image is stored locally on an SSD.
You do not have to pass the output width/height; only pass them if you want to override the defaults.
You mean that when I do this:
const results = await image.readRasters({
  bbox: [
    117152,
    473321,
    117152 + 2500,
    473321 + 2500
  ],
  width: 100,
  height: 100,
  pool: pool
});
That 100x100 is not needed as the library will infer that from the 2500x2500 meters? I.e.,
const results = await image.readRasters({
  bbox: [
    117152,
    473321,
    117152 + 2500,
    473321 + 2500
  ],
  pool: pool
});
would yield the same result?
Anyway, if you have any other ideas about why this is so slow, please let me know.
Thanks!
That 100x100 is not needed as the library will infer that from the 2500x2500 meters? I.e.,
const results = await image.readRasters({bbox: [ 117152, 473321, 117152+2500, 473321+2500 ], pool: pool});
would yield the same result?
Yes, it should. You can check by reading the width/height attributes of the result.
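For instance, with your bbox from above:

const results = await image.readRasters({ bbox: bbox, pool: pool });
console.log(results.width, results.height); // expect 100 100 for a 2500 m bbox at 25 m/px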
Anyway, if you have any other ideas about why this is so slow, please let me know.
Have you tried opening it via fromUrl? Otherwise I'm out of clues.
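Something like this (a sketch; fromUrl expects an HTTP(S) URL, so the file would need to be served over HTTP, here from a hypothetical local server):

const tiff = await GeoTIFF.fromUrl('http://localhost:8080/huge_geo.tif');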
Could you make this file available? Then I could have a look.
@constantinius I'm sorry but I missed this!
fromUrl doesn't seem to accept a local file. I get a connect ECONNREFUSED 127.0.0.1:443 error.
The problematic file can be found here:
https://geodata.rivm.nl/downloads/lucht/rivm_nsl_20200715_gm_NO22018.zip
My example code:
"use strict";
const assert = require('assert');
const GeoTIFF = require('geotiff');
async function main() {
const tiff = await GeoTIFF.fromFile('./rivm_nsl_20200715_gm_NO22018.tif');
const image = await tiff.getImage();
const info = {
width: image.getWidth(),
height : image.getHeight(),
tileWidth : image.getTileWidth(),
tileHeight : image.getTileHeight(),
samplesPerPixel : image.getSamplesPerPixel(),
origin: image.getOrigin(),
resolution: image.getResolution(),
bbox: image.getBoundingBox()
};
console.log(info);
const pool = new GeoTIFF.Pool();
const results = await image.readRasters({bbox: [
117152,
473321,
117152+2500,
473321+2500
], pool: pool});
assert(results.width === 100);
assert(results.height === 100);
}
(async () => {
try {
await main();
} catch(err) {
console.error(err);
process.exit(1);
}
process.exit(0);
})();
Everything up to the image.readRasters() call runs very fast, and then it takes ages to reach the assert check.
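For what it's worth, a minimal timing check (console.time is built into Node.js) confirms that this single call is where all the time goes:

console.time('readRasters');
const results = await image.readRasters({
  bbox: [117152, 473321, 117152 + 2500, 473321 + 2500],
  pool: pool
});
console.timeEnd('readRasters'); // this is the part that takes minutes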
Any news on this? Did I perhaps use readRasters incorrectly? I assumed bbox is in geo coordinates, and those coordinates are valid on that map. However, results contains the full map instead of a 100x100 section of it, so width/height are NOT 100x100 but 10574x12464.
@constantinius so I ran the following two examples: the first manually calculating the window, the second using bbox. The first example ran fine and fast. The second read the rasters over the whole image, not just the 100x100 bounding box. I believe this is a bug.
const bbox = [
  117152,
  473321,
  117152 + 2500,
  473321 + 2500
];

// Result: 100x100, execution time fast => correct
{
  const [oX, oY] = image.getOrigin();
  const [imageResX, imageResY] = image.getResolution();
  let wnd = [
    Math.round((bbox[0] - oX) / imageResX),
    Math.round((bbox[1] - oY) / imageResY),
    Math.round((bbox[2] - oX) / imageResX),
    Math.round((bbox[3] - oY) / imageResY),
  ];
  // Sort into [left, top, right, bottom]; the y-resolution is negative,
  // so the raw y indices come out swapped.
  wnd = [
    Math.min(wnd[0], wnd[2]),
    Math.min(wnd[1], wnd[3]),
    Math.max(wnd[0], wnd[2]),
    Math.max(wnd[1], wnd[3]),
  ];
  const results = await image.readRasters({ window: wnd });
  assert(results.width === 100);
  assert(results.height === 100);
}

// Result: 10574x12464, execution time very slow => incorrect
{
  const results = await image.readRasters({ bbox: bbox });
  assert(results.width === 100);
  assert(results.height === 100);
}
It seems that this line in geotiff.js runs DataView.getUint8 (or getUint16, etc.) for every pixel in the loaded block to extract single pixels. I believe it would be more efficient to copy this data in larger chunks. I will attempt this and hope to open a PR. In our application we read from many 1024x1024-pixel blocks in an OME-TIFF, and reading each 1024x1024 tile takes much longer than expected.
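Roughly the idea (a simplified sketch, not the actual geotiff.js code; dest, dataView and the offset/length variables are placeholders): for 8-bit samples, or multi-byte samples whose byte order matches the platform, whole rows can be copied with a single TypedArray.set instead of one DataView call per pixel:

// Current approach (simplified): one DataView call per pixel.
for (let x = 0; x < rowLength; ++x) {
  dest[destOffset + x] = dataView.getUint8(srcOffset + x);
}

// Proposed: wrap the decoded block in a typed array once...
const src = new Uint8Array(dataView.buffer, dataView.byteOffset, dataView.byteLength);
// ...then copy each row with a single call.
dest.set(src.subarray(srcOffset, srcOffset + rowLength), destOffset);

Endianness would need a check for multi-byte sample types, since DataView handles byte order per call while a raw typed-array copy does not.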