grass
grass copied to clipboard
v.surf.rst: Cross-validation OpenMP support
This finishes the implementation of OpenMP support for cross-validation (#2644). A critical block is added to segmen2d_parallel.c to make the writing to the database in serial. main.c is modified to allow parallel computing with nprocs>1. Two simple test scripts are used to run cross-validation with nprocs=1 and nprocs>1. The results of these tests are the same.
Compilation errors:
2024-04-15T17:35:03.7536227Z point2d.c: In function ‘IL_check_at_points_2d_cvdev’:
2024-04-15T17:35:03.7537488Z point2d.c:213:19: error: unused variable ‘mm’ [-Werror=unused-variable]
2024-04-15T17:35:03.7538241Z 213 | int /* n1, */ mm, m;
2024-04-15T17:35:03.7538696Z | ^~
2024-04-15T17:35:03.7539546Z point2d.c:210:12: error: unused variable ‘north’ [-Werror=unused-variable]
2024-04-15T17:35:03.7540477Z 210 | double north = data->ymax;
2024-04-15T17:35:03.7541011Z | ^~~~~
2024-04-15T17:35:03.7541851Z point2d.c:208:12: error: unused variable ‘east’ [-Werror=unused-variable]
2024-04-15T17:35:03.7542805Z 208 | double east = data->xmax;
2024-04-15T17:35:03.7543408Z | ^~~~
2024-04-15T17:35:03.7664436Z cc1: all warnings being treated as errors
2024-04-15T17:37:05.2813633Z main.c: In function ‘main’:
2024-04-15T17:37:05.2814492Z main.c:466:5: error: too few arguments to function ‘IL_init_func_2d’
2024-04-15T17:37:05.2815104Z 466 | IL_init_func_2d(¶ms, IL_grid_calc_2d, IL_matrix_create,
2024-04-15T17:37:05.2815541Z | ^~~~~~~~~~~~~~~
2024-04-15T17:37:05.2815830Z In file included from main.c:43:
2024-04-15T17:37:05.2816586Z /home/runner/work/grass/grass/dist.x86_64-pc-linux-gnu/include/grass/interpf.h:153:6: note: declared here
2024-04-15T17:37:05.2817450Z 153 | void IL_init_func_2d(struct interp_params *, grid_calc_fn *, matrix_create_fn *,
2024-04-15T17:37:05.2817964Z | ^~~~~~~~~~~~~~~
2024-04-15T17:37:05.2846707Z make[4]: *** [../../include/Make/Compile.make:32: OBJ.x86_64-pc-linux-gnu/main.o] Error 1
2024-04-15T17:38:28.9886495Z main.c:660:12: error: comparison between pointer and integer [-Werror]
2024-04-15T17:38:28.9887414Z 660 | if (cv != NULL || create_devi != NULL) {
2024-04-15T17:38:28.9888056Z | ^~
2024-04-15T17:38:28.9888873Z main.c:660:35: error: comparison between pointer and integer [-Werror]
2024-04-15T17:38:28.9889786Z 660 | if (cv != NULL || create_devi != NULL) {
2024-04-15T17:38:28.9890443Z | ^~
2024-04-15T17:38:29.0160707Z cc1: all warnings being treated as errors
2024-04-15T17:38:29.0177415Z make[3]: *** [../../include/Make/Compile.make:32: OBJ.x86_64-pc-linux-gnu/main.o] Error 1
@cyliang368 Good to see you working on this! May I recommend using pre-commit, which will take care of formatting and other useful checks locally.
You may also turn on -Wall -Wextra -Wpedantic -Wvla compiler flags as well (those are checked on some of the CI runners), thus saving some CI time.
@petrasovaa @nilason I didn't notice that warnings are treated as errors and that I should use pre-commit. Thanks for the advice!
I wouldn't consider a critical block parallel because it restricts execution of them to one thread at a time and performance won't be much different from serial. Have you checked speedup and efficiency? Any plots to prove performance improvement?
Ideally, it would be best to create a parallel version of
lib/rst/interp_float/point2d.cjust likelib/rst/interp_float/segment2d_parallel.c.
@HuidaeCho I split the IL_check_at_points_2d function into two functions. One is used for calculation, whereas the other is used to write values into the map. Now, when running cross-validation or creating deviation, the parallel for loop in lib/rst/interp_float/segment2d_parallel.c computes everything in parallel except for file writing.
With this implementation, we get higher speedup and efficiency (green lines in the figures below). I modified some functions, files, and the structure of loops for this implementation. Please review them. Thanks!
Before this can be merged, we need a test, at least just testing the cross-validation results are the same as before (reference based on results before this change) and that they are the same for different number of cores. The test can be added to v.surf.rst/testsuite/test_vsurfrst.py.
Before this can be merged, we need a test, at least just testing the cross-validation results are the same as before (reference based on results before this change) and that they are the same for different number of cores. The test can be added to v.surf.rst/testsuite/test_vsurfrst.py.
@petrasovaa I added tests for parallelized cross-validation and deviation-creation. I used the results from IL_interp_segments_2d, which I didn't change, to check the results from IL_interp_segments_2d_parallel. The results are consistent, no matter whether single thread or multi-threads is used. Please check https://github.com/OSGeo/grass/pull/3590/commits/d4b8da7122c31dda61f592a2166aecb608b0899e.
I wouldn't consider a critical block parallel because it restricts execution of them to one thread at a time and performance won't be much different from serial. Have you checked speedup and efficiency? Any plots to prove performance improvement? Ideally, it would be best to create a parallel version of
lib/rst/interp_float/point2d.cjust likelib/rst/interp_float/segment2d_parallel.c.@HuidaeCho I split the
IL_check_at_points_2dfunction into two functions. One is used for calculation, whereas the other is used to write values into the map. Now, when running cross-validation or creating deviation, the parallel for loop inlib/rst/interp_float/segment2d_parallel.ccomputes everything in parallel except for file writing.With this implementation, we get higher speedup and efficiency (green lines in the figures below). I modified some functions, files, and the structure of loops for this implementation. Please review them. Thanks!
Looks great! I would keep the original point2d.c and copy it to point2d_parallel.c just like segment2d_parallel.c for consistency and just in case. This PR now breaks backward compatibility: 1) adding new init functions (new arguments; IL_write_point_2d is new, but is it necessary?), 2) changing pass-by-value to pass-by-reference. Are these all necessary and can we avoid doing these?
The code for IL_write_point_2d repeats three times in point2d.c. Given the fact that it already repeats twice before this PR in point2d.c, I would say, we don't even need to add this new function and argument to IL_init_func_2d. Why and when would we pass another function than this code? Have you considered calling this function within the library from the three places without adding it as an argument to IL_init_func_2d?
I wouldn't consider a critical block parallel because it restricts execution of them to one thread at a time and performance won't be much different from serial. Have you checked speedup and efficiency? Any plots to prove performance improvement? Ideally, it would be best to create a parallel version of
lib/rst/interp_float/point2d.cjust likelib/rst/interp_float/segment2d_parallel.c.@HuidaeCho I split the
IL_check_at_points_2dfunction into two functions. One is used for calculation, whereas the other is used to write values into the map. Now, when running cross-validation or creating deviation, the parallel for loop inlib/rst/interp_float/segment2d_parallel.ccomputes everything in parallel except for file writing. With this implementation, we get higher speedup and efficiency (green lines in the figures below). I modified some functions, files, and the structure of loops for this implementation. Please review them. Thanks!![]()
Looks great! I would keep the original
point2d.cand copy it topoint2d_parallel.cjust likesegment2d_parallel.cfor consistency and just in case. This PR now breaks backward compatibility: 1) adding new init functions (new arguments;IL_write_point_2dis new, but is it necessary?), 2) changing pass-by-value to pass-by-reference. Are these all necessary and can we avoid doing these?The code for
IL_write_point_2drepeats three times inpoint2d.c. Given the fact that it already repeats twice before this PR inpoint2d.c, I would say, we don't even need to add this new function and argument toIL_init_func_2d. Why and when would we pass another function than this code? Have you considered calling this function within the library from the three places without adding it as an argument toIL_init_func_2d?
Thanks for reviewing my PR. These comments gave me some ideas on how to improve it. Based on these two questions, I pushed new commits, and I wrote down my thoughts here.
1. adding new init functions (new arguments; IL_write_point_2d is new, but is it necessary?
It is not necessary to add a new init function and argument. However, to make them clear, I had thought I shouldn't make a function do two tasks. IL_check_at_points_2d did two things: compute the deviations and write values to the database. Therefore, I split it into two functions.
In the new commit, I included IL_write_point_2d into IL_write_point_2d_parallel so that we don't need to change the definition and usage of IL_init_func_2d.
2. changing pass-by-value to pass-by-reference. Are these all necessary and can we avoid doing these?
This is necessary in my implementation. This question is related to the previous question. IL_check_at_points_2d function actually does two things. Computing should be parallelized while writing out should be in serial. I can only figure out three ways to save the outputs: (1) make this function return values, (2) pass-by-reference and save the values by this pointer, and (3) directly write them out to the database. Since we want to parallelize the function, we cannot use (3). (1) does not seem to be preferred, so I chose (2).
This is my first time contributing to open-source code, so I am unsure whether these are preferred practices. Please review them and give me advice. Thanks a lot.
In this case I am fine with breaking the backwards compatibility, the API is not widely used, so I don't expect any problems.
(2) pass-by-reference and save the values by this pointer
@cyliang368 Does this still apply? I cannot find anywhere in IL_check_at_points_2d() that updates skip_point.
xx = points[m - 1].x - skip_point->x;
yy = points[m - 1].y - skip_point->y;
...
zz = skip_point->z + zmin;
...
xmm = skip_point->x * dnorm + params->x_orig + west;
ymm = skip_point->y * dnorm + params->y_orig + south;
skip_point seems read-only now? Have things changed?
(2) pass-by-reference and save the values by this pointer
@cyliang368 Does this still apply? I cannot find anywhere in
IL_check_at_points_2d()that updatesskip_point.xx = points[m - 1].x - skip_point->x; yy = points[m - 1].y - skip_point->y; ... zz = skip_point->z + zmin; ... xmm = skip_point->x * dnorm + params->x_orig + west; ymm = skip_point->y * dnorm + params->y_orig + south;
skip_pointseems read-only now? Have things changed?
Never mind. You use it in IL_check_at_points_2d_cvdev(). ;-)
I'm running a new run of CI with the last 7 days of linter upgrades made, then merge

