grass
grass copied to clipboard
r.proj: Removed extra commented error. Added define around OpenMP code.
Removed old commented error code as it was changed to a warning. Also, added _OPENMP check around OpenMP code in preparation for rework of parallelization.
@metzm would you mind to review this PR?
@BadAssassin please separate out 96f5576 ("v.out.vtk: Address bug #864 - No double datatype") into a separate pull request.
That is best done by creating in git a new feature branch. This is explained here: https://github.com/OSGeo/grass/blob/master/CONTRIBUTING.md#creating-a-new-feature-branch
Have you checked https://trac.osgeo.org/grass/ticket/1710 and https://trac.osgeo.org/grass/ticket/1680 ?
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Thursday, September 3, 2020 11:49 AM, Markus Metz [email protected] wrote:
Have you checked https://trac.osgeo.org/grass/ticket/1710 and https://trac.osgeo.org/grass/ticket/1680 ?
Sorry for my ignorance, but I have not. I'm an older GRASS dev who had to take a break for several years. I'm still getting used to the new workflow and git.
This is great information, and at first glance, I believe will require a different solution. You may disregard my PR until I've had a chance to rework it.
I will also separate my PRs as requested in a previous message. Still learning git after years of resistance.
[...] I'm an older GRASS dev who had to take a break for several years. I'm still getting used to the new workflow and git.
Great to have you back! Older GRASS devs are an invaluable source of knowledge and information about the inner workings of GRASS.
I tested your PR by getting the diff with wget https://github.com/OSGeo/grass/pull/948.diff, applied the patch, tested it with valgrind and found the reason for the segfaults in the 8 year old trac tickets:
==23269== Syscall param write(buf) points to uninitialised byte(s)
==23269== at 0x4BAA537: write (in /usr/lib64/libc-2.31.so)
==23269== by 0x40774C: readcell (readcell.c:89)
==23269== by 0x404296: main (main.c:495)
==23269== Address 0xc43f210 is 25,040 bytes inside a block of size 1,605,632 alloc'd
==23269== at 0x483A809: malloc (vg_replace_malloc.c:307)
==23269== by 0x489A082: G__malloc (alloc.c:39)
==23269== by 0x407674: readcell (readcell.c:72)
==23269== by 0x404296: main (main.c:495)
I am trying to debug this memory violation. A fix for this bug could be included in your PR, therefore I would rather not disregard your PR, but instead use it as an incentive to fix a long-known bug.
Maybe the culprit is at https://github.com/OSGeo/grass/blob/master/raster/r.proj/r.proj.h#L15
typedef FCELL block[BDIM][BDIM];
it should rather be
typedef FCELL block[BDIM * BDIM];
with according adjustments of the CVAL macro at https://github.com/OSGeo/grass/blob/master/raster/r.proj/r.proj.h#L75
and in https://github.com/OSGeo/grass/blob/master/raster/r.proj/readcell.c#L95
Untested, just a guess from the valgrind output.
The valgrind output was only mentioning uninitialized bytes which can be avoided with
diff --git a/raster/r.proj/readcell.c b/raster/r.proj/readcell.c
index 7f2fb79ef..6956ee0a5 100644
--- a/raster/r.proj/readcell.c
+++ b/raster/r.proj/readcell.c
@@ -70,6 +70,7 @@ struct cache *readcell(int fdi, const char *size)
c->refs[i] = -1;
tmpbuf = (FCELL *) G_malloc(nx * sizeof(block));
+ Rast_set_f_null_value(tmpbuf, nx * BDIM * BDIM);
for (row = 0; row < nrows; row += BDIM) {
int x, y;
thus it was the wrong track.
Testing the PR on a laptop with 2 real and 4 virtual cores, the parallelization with openmp actually made r.proj slower. It took nearly twice as long and used 4x more CPU time compared to the non-openmp version.
Thus this PR would make r.proj for many users slower and would reintroduce an old known bug.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Saturday, September 5, 2020 1:06 PM, Markus Metz [email protected] wrote:
The valgrind output was only mentioning uninitialized bytes which can be avoided with
diff --git a/raster/r.proj/readcell.c b/raster/r.proj/readcell.c index 7f2fb79ef..6956ee0a5 100644 --- a/raster/r.proj/readcell.c +++ b/raster/r.proj/readcell.c @@ -70,6 +70,7 @@ struct cache *readcell(int fdi, const char *size) c->refs[i] = -1;
tmpbuf = (FCELL *) G_malloc(nx * sizeof(block));
Rast_set_f_null_value(tmpbuf, nx * BDIM * BDIM);
for (row = 0; row < nrows; row += BDIM) { int x, y;
thus it was the wrong track.
Testing the PR on a laptop with 2 real and 4 virtual cores, the parallelization with openmp actually made r.proj slower. It took nearly twice as long and used 4x more CPU time compared to the non-openmp version. Thus this PR would make r.proj for many users slower and would reintroduce an old known bug.
This was not the case for me (I was using more cores), but this is a known side effect of the 'ordered' keyword by essentially forcing sequence. I have since learned of better solutions, but I need to do a little analysis, first.
Thank you for confirming issues. I'll try this in a VM with various core setups in the future before submitting another PR.
There seems to be useful changes still here even considering the OpenMP part needs reworking. I suggest to split the work into multiple branches (and PRs).
@BadAssassin Do you plan to split your PR into several ones as suggested above?
@neteler close without merging because old, outdated and suggesting changes that should be two separate PRs?
@neteler close without merging because old, outdated and suggesting changes that should be two separate PRs?
Closing because of this. It can be reopened or reworked one day if needed