geotiff.js

image.readRasters slow, even when fetching a small part of the image

Open MarByteBeep opened this issue 4 years ago • 10 comments

Hi,

When trying to fetch a BBOX of 64x64 pixels from my GeoTIFF (of about 10k x 12k pixels), it doesn't seem to matter if I decode the whole image, or just that 64x64. Both are incredibly slow. Also subsequent calls to readRasters seem to be equally slow. I can understand for an initial fetch of 64x64 pixels, the entire image needs to be decoded, but then any subsequent calls should be a lot faster. But that's not what I experience here. This is the code:

const tiff = await GeoTIFF.fromFile('./geotiff/huge_geo.tif');
const image = await tiff.getImage();
const pool = new GeoTIFF.Pool();

// Decode entire image
const data = await image.readRasters({pool: pool});

// Now just a small BBOX of 64x64 pixels
const results = await image.readRasters({bbox: [
	117152.96,
	473321.92,
	120593.60,
	476762.56
], width: 64, height: 64, pool: pool});

Using pool or not doesn't seem to help much either. Both take around 1 minute to decode. So I'm pretty sure I'm doing something wrong here. Any ideas?

MarByteBeep avatar Oct 01 '20 14:10 MarByteBeep

Hi @MarByteBeep

First, since you are using fromFile, I'm assuming that you are using the library from node.js. Is that correct?

Second: please provide more information about your file. A gdalinfo would be perfect. I need to know more about the geospatial extents and internal layout. This may provide clues as to why this is that slow.

With the bbox query, the subset read from the image may be way larger than the output: the width and height only specify the output size to which the read rasters are then resampled. So, in theory, it is possible that the whole of the image is read and then squeezed into a 64x64 raster, explaining the very long decoding times while still ending up with a very small result.
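
To illustrate why a small output does not imply a small read, here is a minimal sketch of nearest-neighbour resampling on a flat, row-major raster. The function name resampleNearest is hypothetical and this is not the geotiff.js implementation; it only shows that every output pixel is picked from a source raster that has already been read and decoded in full.

```javascript
// Nearest-neighbour resampling of a flat, row-major raster (sketch only).
// The source raster must already be fully decoded, however small the output is.
function resampleNearest(src, srcWidth, srcHeight, dstWidth, dstHeight) {
  const dst = new src.constructor(dstWidth * dstHeight);
  for (let y = 0; y < dstHeight; ++y) {
    // Map the centre of the output row back to a source row.
    const sy = Math.min(srcHeight - 1, Math.floor(((y + 0.5) * srcHeight) / dstHeight));
    for (let x = 0; x < dstWidth; ++x) {
      // Map the centre of the output column back to a source column.
      const sx = Math.min(srcWidth - 1, Math.floor(((x + 0.5) * srcWidth) / dstWidth));
      dst[y * dstWidth + x] = src[sy * srcWidth + sx];
    }
  }
  return dst;
}

// A 4x4 source squeezed into 2x2: all 16 input samples had to be available.
const source = Uint8Array.from({ length: 16 }, (_, i) => i);
const squeezed = resampleNearest(source, 4, 4, 2, 2);
console.log(Array.from(squeezed));
```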

Hope this helps.

constantinius avatar Oct 01 '20 14:10 constantinius

Hi! Thanks a lot for the quick reply. Indeed I use NodeJS. I don't have gdalinfo here, but I have this:

{
  width: 10574,
  height: 12464,
  tileWidth: 128,
  tileHeight: 128,
  samplesPerPixel: 1,
  origin: [ 13611.60000000149, 618522.1000000015, 0 ],
  resolution: [ 25, -25, 0 ],
  bbox: [
    13611.60000000149,
    306922.1000000015,
    277961.6000000015,
    618522.1000000015
  ]
}

AFAIK the requested bbox is just 3,441x3,441 meters (EPSG:28992), whereas the whole image spans 264,350x311,600 meters.
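
As a quick sanity check of those numbers, a small sketch that converts the requested bbox into a size in source pixels, using the resolution from the metadata dump. The helper name bboxSizeInPixels is made up for illustration, and a north-up image without rotation is assumed.

```javascript
// Values copied from the metadata dump and the first readRasters call.
const resolution = [25, -25];
const requestedBbox = [117152.96, 473321.92, 120593.60, 476762.56];

// Size of a bbox in source pixels (assumes a north-up image, no rotation).
function bboxSizeInPixels(bbox, res) {
  const widthPx = (bbox[2] - bbox[0]) / Math.abs(res[0]);
  const heightPx = (bbox[3] - bbox[1]) / Math.abs(res[1]);
  return [widthPx, heightPx];
}

const [widthPx, heightPx] = bboxSizeInPixels(requestedBbox, resolution);
console.log(widthPx.toFixed(1), heightPx.toFixed(1)); // roughly 137.6 x 137.6 source pixels
```

So only about 138x138 of the 10574x12464 source pixels should be needed for this request, which then get resampled to 64x64.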

I noticed by using QGIS that the image is LZW compressed. Reencoding it without compression, did not do a whole lot for the bottom line. Thanks for your help!

MarByteBeep avatar Oct 01 '20 15:10 MarByteBeep

Addendum: is there any way to simply fetch a bbox without resampling at all? With a resolution of 25x25 meters, suppose I want to fetch an area of 2500x2500 meters, how do I do that properly so that no resampling occurs?

const results = await image.readRasters({bbox: [
  117152,
  473321,
  117152+2500,
  473321+2500
], width: 100, height: 100, pool: pool});

I tried that, hoping that it would speed up things a lot. But alas

MarByteBeep avatar Oct 01 '20 15:10 MarByteBeep

@MarByteBeep Okay, the resolution is 25m, so you are requesting an image of about 100x100 px.

You do not have to pass the output width/height, only if you want to override it. Also, if you are resampling, you might want to consider a specific resampling algorithm. The default (nearest) should be quite fast, though.

And I don't think that the long waiting time has anything to do with resampling. More with the IO.

What is your performance when you are using another tool like gdal_translate? (Do you maybe use the file via a network storage?)

constantinius avatar Oct 02 '20 14:10 constantinius

Hi there,

When opening the same image in a tool like QGIS, it opens within a fraction of a second. The image is stored locally on an SSD.

> You do not have to pass the output width/height, only if you want to override it.

You mean that when I do this:

const results = await image.readRasters({bbox: [
  117152,
  473321,
  117152+2500,
  473321+2500
], width: 100, height: 100, pool: pool});

That 100x100 is not needed as the library will infer that from the 2500x2500 meters? I.e.,

const results = await image.readRasters({bbox: [
  117152,
  473321,
  117152+2500,
  473321+2500
], pool: pool});

would yield the same result?

Anyway, if you have some other ideas on why this is that slow, please let me know.

Thanks!

MarByteBeep avatar Oct 03 '20 09:10 MarByteBeep

> That 100x100 is not needed as the library will infer that from the 2500x2500 meters? I.e.,
>
> const results = await image.readRasters({bbox: [
>   117152,
>   473321,
>   117152+2500,
>   473321+2500
> ], pool: pool});
>
> would yield the same result?

Yes, should be. You can check by getting the width/height attributes of the result.

> Anyway, if you have some other ideas on why this is that slow, please let me know.

Have you tried to open via fromUrl? Otherwise I'm out of clues.

Could you make this file available? Then I could have a look.

constantinius avatar Oct 05 '20 09:10 constantinius

@constantinius I'm sorry but I missed this!

fromUrl doesn't seem to accept a local file. I get a connect ECONNREFUSED 127.0.0.1:443 error.

The problematic file can be found here:

https://geodata.rivm.nl/downloads/lucht/rivm_nsl_20200715_gm_NO22018.zip

My example code:

"use strict";
const assert = require('assert');
const GeoTIFF = require('geotiff');

async function main() {
	const tiff = await GeoTIFF.fromFile('./rivm_nsl_20200715_gm_NO22018.tif');
	const image = await tiff.getImage();

	const info = {
		width: image.getWidth(),
		height : image.getHeight(),
		tileWidth : image.getTileWidth(),
		tileHeight : image.getTileHeight(),
		samplesPerPixel : image.getSamplesPerPixel(),
		origin: image.getOrigin(),
		resolution: image.getResolution(),
		bbox: image.getBoundingBox()
	};
	console.log(info);

	const pool = new GeoTIFF.Pool();

	const results = await image.readRasters({bbox: [
		117152,
		473321,
		117152+2500,
		473321+2500
	], pool: pool});

	assert(results.width === 100);
	assert(results.height === 100);
}

(async () => {
	try {
		await main();
	} catch(err) {
		console.error(err);
		process.exit(1);
	}
	process.exit(0);
})();

All up to the image.readRasters()-call goes very fast. And then it takes ages to get to the assert check.

MarByteBeep avatar Oct 29 '20 20:10 MarByteBeep

Any news on this? Did I perhaps use readRasters incorrectly? I assumed bbox is in geo coordinates. And those coordinates are valid in that map. However, results contain the full map instead of a 100x100 section of it. So width/height are NOT 100x100 but 10574x12464

MarByteBeep avatar Aug 30 '21 09:08 MarByteBeep

@constantinius so I ran the following two examples: the first manually calculates the window, the second uses bbox. The first example ran fine and fast. The second read the rasters over the whole image, not just the 100x100 bounding box. I believe this is an error.

const bbox = [
	117152,
	473321,
	117152+2500,
	473321+2500
];

// Result: 100x100, execution time fast => correct
{
	const [oX, oY] = image.getOrigin();
	const [imageResX, imageResY] = image.getResolution();

	let wnd = [
		Math.round((bbox[0] - oX) / imageResX),
		Math.round((bbox[1] - oY) / imageResY),
		Math.round((bbox[2] - oX) / imageResX),
		Math.round((bbox[3] - oY) / imageResY),
	];
	wnd = [
		Math.min(wnd[0], wnd[2]),
		Math.min(wnd[1], wnd[3]),
		Math.max(wnd[0], wnd[2]),
		Math.max(wnd[1], wnd[3]),
	];

	const results = await image.readRasters({ window: wnd });
	assert(results.width === 100);
	assert(results.height === 100);
}

// Result: 10574x12464, execution time very slow => incorrect
{
	const results = await image.readRasters({bbox: bbox});

	assert(results.width === 100);
	assert(results.height === 100);
}
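
The manual conversion from the first block can be pulled out into a small pure helper, which makes it easy to check the window without touching the file. This is only a sketch: bboxToWindow is a hypothetical name, not a geotiff.js function, and it assumes a north-up image without rotation. The origin and resolution below are the values printed earlier in this thread.

```javascript
// Convert a geographic bbox [minX, minY, maxX, maxY] into a pixel window
// [left, top, right, bottom], given the image origin and resolution.
// Sketch only: assumes a north-up image without rotation.
function bboxToWindow(origin, resolution, bbox) {
  const [oX, oY] = origin;
  const [resX, resY] = resolution;
  const xs = [(bbox[0] - oX) / resX, (bbox[2] - oX) / resX].map((v) => Math.round(v));
  const ys = [(bbox[1] - oY) / resY, (bbox[3] - oY) / resY].map((v) => Math.round(v));
  // resY is negative for north-up images, so sort each axis explicitly.
  return [Math.min(...xs), Math.min(...ys), Math.max(...xs), Math.max(...ys)];
}

const wndCheck = bboxToWindow(
  [13611.60000000149, 618522.1000000015],
  [25, -25],
  [117152, 473321, 117152 + 2500, 473321 + 2500],
);
console.log(wndCheck, wndCheck[2] - wndCheck[0], wndCheck[3] - wndCheck[1]);
```

The resulting array can then be passed as the window option, as in the first example above.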

MarByteBeep avatar Aug 30 '21 13:08 MarByteBeep

It seems that this line in geotiff.js runs (for every pixel in the loaded block!) DataView.getUint8 (or getUint16, etc) to extract single pixels. I believe it may be more efficient to copy this data in larger blocks. I will attempt this and hope to open a PR. In our application, we read from many 1024x1024 pixel blocks in an ome-tiff, and reading each 1024x1024 tile takes much longer than expected.
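
To make the idea concrete, here is a rough sketch of the two approaches for 16-bit samples. It is not the actual geotiff.js code; the function names are made up, and the bulk variant is only valid when the byte order of the data matches the byte order of the platform (otherwise the bytes would have to be swapped first).

```javascript
// Detect the platform byte order once.
const platformIsLittleEndian = new Uint8Array(new Uint16Array([1]).buffer)[0] === 1;

// Per-sample approach: one DataView call for every pixel in the block.
function readUint16PerSample(buffer, littleEndian) {
  const view = new DataView(buffer);
  const out = new Uint16Array(buffer.byteLength / 2);
  for (let i = 0; i < out.length; ++i) {
    out[i] = view.getUint16(i * 2, littleEndian);
  }
  return out;
}

// Bulk approach: copy the whole block in one go and view it as Uint16.
// Assumes the data is stored in the platform's byte order.
function readUint16Bulk(buffer) {
  return new Uint16Array(buffer.slice(0));
}

// Both approaches agree when the data uses the platform byte order.
const block = new Uint16Array([1, 513, 65535, 42]).buffer;
const perSample = readUint16PerSample(block, platformIsLittleEndian);
const bulk = readUint16Bulk(block);
console.log(Array.from(perSample), Array.from(bulk));
```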

thejohnhoffer avatar Jan 19 '23 17:01 thejohnhoffer