grass
grass copied to clipboard
r.watershed: Handle large datasets greater than INT32_MAX in cells
This PR fixes an unintended integer overflow by the int size_array(..., int, int) function. The new return type of this function is size_t, and takes size_t nrows and ncols. Before this PR, r.watershed was not able to allocate 4 * 43994 * 49529 (about 8.1 GB) because of this integer overflow.
Failed line: https://github.com/OSGeo/grass/blob/35123df1a67ba99306bff26ae296fdfa2ca69b92/raster/r.watershed/ram/init_vars.c#L149
Error:
SECTION 1a (of 4): Initiating Memory.
Current region rows: 43994, cols: 49529
ERROR: G_malloc: unable to allocate 18446744065248018020 bytes of memory at
raster/r.watershed/ram/init_vars.c:149
WARNING: Subprocess failed with exit code 1
Test code:
#include <stdio.h>
int main(){
int nrows = 43994;
int ncols = 49529;
/* expected size */
size_t size_expected = sizeof(int) * nrows * ncols;
/* before this PR with size_array() */
size_t size_overflow_1 = sizeof(int) * (int)(nrows * ncols);
/* int size_array() => size_t */
size_t size_overflow_2 = sizeof(int) * (size_t)(nrows * ncols);
/* int nrows & ncols for size_array() => size_t */
size_t size_overflow_3 = sizeof(int) * (int)((size_t)nrows * (size_t)ncols);
/* this PR */
size_t size_fixed = sizeof(int) * (size_t)((size_t)nrows * (size_t)ncols);
printf("size_expected = %zu\n", size_expected);
printf("size_overflow_1 = %zu\n", size_overflow_1);
printf("size_overflow_2 = %zu\n", size_overflow_2);
printf("size_overflow_3 = %zu\n", size_overflow_3);
printf("size_fixed = %zu\n", size_fixed);
}
Output:
size_expected = 8715915304
size_overflow_1 = 18446744065245597736
size_overflow_2 = 18446744065245597736
size_overflow_3 = 18446744065245597736
size_fixed = 8715915304
Whenever we multiply rows and columns, they should be size_t before multiplication, so the reasoning here fits that.
Can you provide documentation for size_array while at it?
Unfortunately, it still cannot handle larger inputs. Some arrays and variables need to be converted to size_t as well. I'm working on it..., but that can mean more memory consumption for smaller datasets depending on the size of size_t (8 vs. 4 for int in my 64-bit Linux).
Verified identical outputs before and after this PR.
For our records, we still have these issues with the -m flag: https://github.com/OSGeo/grass/issues/2222 and https://github.com/OSGeo/grass/pull/2482, but they are not related to this PR.
@HuidaeCho Are you planning to merge this PR?
@HuidaeCho Are you planning to merge this PR?
Do you know how to re-run expired failed tests without making a small commit just to revive them?
@HuidaeCho Are you planning to merge this PR?
Do you know how to re-run expired failed tests without making a small commit just to revive them?
You can always rebase.
For more recently run jobs you can re-run, by going to "Details" and in the upper right side you have:
@HuidaeCho would you like to do a rebase for a fresh CI (101 PRs behind), then merge? If I do it I would appear as co-author. This PR looks ready to go
@HuidaeCho would you like to do a rebase for a fresh CI (101 PRs behind), then merge? If I do it I would appear as co-author. This PR looks ready to go
@echoix Just rebased it.
...If I do it I would appear as co-author...
But the co-author is determined (only) by the text in the commit message, no? I think in these cases, the auto-generated co-author should be deleted by whoever is merging it.
...If I do it I would appear as co-author...
But the co-author is determined (only) by the text in the commit message, no? I think in these cases, the auto-generated co-author should be deleted by whoever is merging it.
Ya, that's what I would have done if it wasn't him, that was still active. The PR already had some rebases done, so I assumed the cleanliness of the commits was important for the author. And it also indirectly asked for a little review to know if everything was still alright with the PR.