geoblaze
geoblaze copied to clipboard
geoblaze.sum produces unexpected result in 2.7.0
Describe the bug
One of my unit tests is failing after upgrading to geoblaze 2.7.0. This is with geoblaze.sum, so maybe not directly related to the vrm work.
Fails - https://codesandbox.io/p/sandbox/geoblaze-2-7-sum-test-vfdvgt Passes - https://codesandbox.io/p/sandbox/geoblaze-2-6-1-sum-test-tj3zzp
To Reproduce Steps to reproduce the behavior:
- Go to playground link
- open terminal
- npm install
- npm test
Expected behavior Test should return value of 2, it uses a polygon that should filter the raster to only the top right cell. The polygon doesn't completely encompass the top right pixel, maybe that has something to do with it. The fact that the returned value is 1/4 of the expected and it's a 2x2 raster is interesting.
Desktop (please complete the following information): node v20 (and probably others)
Good catch! I clicked on the Fails Codesandbox and it looks like I don't have permission to view it??
Fail sandbox permissions should be fixed
This test also started failing. Previously geoblaze would reliably throw if the feature was smaller than a pixel. Perhaps it now always comes back with at least one pixel? Which sounds kind of like the vrm minimal option for geoblaze.stats. Though geoblaze.sum isn't currently documented to use or give access to this feature.
test("feature smaller than a pixel should throw", async () => {
const url = "http://127.0.0.1:8080/data/in/feature_abyssopelagic_cog.tif";
const feature: Feature<Polygon> = {
type: "Feature",
geometry: {
type: "Polygon",
coordinates: [
[
[-64.98190014112853, 32.28094566811551],
[-64.97915355909728, 32.328245479189846],
[-64.93314831007392, 32.32186289702012],
[-64.94207470167548, 32.281816439778545],
[-64.98190014112853, 32.28094566811551],
],
],
bbox: [
-64.98190014112853, 32.28094566811551, -64.93314831007392,
32.328245479189846,
],
},
properties: {},
id: "1",
bbox: [
-64.98190014112853, 32.28094566811551, -64.93314831007392,
32.328245479189846,
],
};
try {
await geoblaze.sum(url, feature);
} catch (err) {
return;
}
throw new Error("should not reach here, feature smaller than pixel");
});
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ Failed Tests 1 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
FAIL |e2e| src/toolbox/geoblaze/geoblaze.test.e2e.ts > geoblaze cog test > feature smaller than a pixel should throw
Error: should not reach here, feature smaller than pixel
❯ src/toolbox/geoblaze/geoblaze.test.e2e.ts:90:11
88| return;
89| }
90| throw new Error("should not reach here, feature smaller than pixel");
| ^
91| });
To be clear what we do in production code is if geoblaze.sum throws with a value not found error (because feature is smaller than a pixel), then we fallback to use geoblaze.identify to pull a value.
Your diagnosis is spot on. Sum is now by default using minimal virtual resampling with rescaled results (weighted based on subpixel size) ... although I'm now (in large part based on this convo) starting to doubt if that's the best default behavior for geoblaze.sum. It feels like it might be counter-intuitive. Perhaps default resampling makes sense for min, max, median and mean, but it feels extra complicated for geoblaze.sum.
Do you have a strong opinion on what the default geoblaze.sum behavior should be? In either case, it's my intention to allow a user to be able to call geoblaze.sum with specific vrm or rescale values. It's on the short-term roadmap :-)
In any case, I need to update the documentation to properly fully document the default behavior. Thank for flagging this.
Thanks. @DanielJDufour, my first thought is this seems like a breaking change, more than expected for a minor release?
My instinct is I think the default behavior should be no virtual resampling. For the test in the sandbox, I would expect that I just get back the top right cell value overlapped by the feature, the entire cell value, no resampling.
I like the two extra options provided by the stats function to control it. That seems like a decent way. Adding it to the other individual functions like sum sounds reasonable.
The special case is when the feature is smaller than a pixel. Still in that case, I like the idea of it being off by default and I can control whether to turn it on { vrm: "minimal" }, and whether I want it to resample (rescale: true} and return a smaller value, or not resample {rescale: false} and return the full value (which is what I assume it would do if my polygon didn't quite cross the centerline of the top row in my test?)
Sounds good. I think we are in agreement. And yes, minor release shouldn't have a breaking change. I'll cut a new patch release within the next couple days that turns off virtual resampling by default for sum (returning to previous behavior).
Thanks for the feedback! I really struggled with how to design this API and I'm feeling better about it now.
changelog was updated for v2.7.0: https://github.com/GeoTIFF/geoblaze/releases/tag/v2.7.0
(I still need to cut a new release rolling back geoblaze.sum)
I've published a new version that rolls back the functionality for geoblaze.sum, so virtual resampling is turned off by default for geoblaze.sum (more here: https://github.com/GeoTIFF/geoblaze/releases/tag/v2.8.0)
Behavior seems to be as advertised, closing