grass icon indicating copy to clipboard operation
grass copied to clipboard

lib/gis: Add a helper function to determine the number of threads for OpenMP

Open cyliang368 opened this issue 1 year ago • 8 comments

Based on the discussion in https://github.com/OSGeo/grass/pull/3917, this PR adds a function to make the parser determine the number of threads, mainly for the standard option G_OPT_M_NPROCS.

  • nprocs > 0 for min(nprocs, max_threads_on_device)
  • nprocs <= 0 for max(max_threads_on_device + nprocs, 1) (e.g., nprocs=0 for max_threads_on_device, nprocs=-1 for max_threads_on_device - 1, and nprocs=-10000 with 24 threads for 1, etc.)

cyliang368 avatar Jun 25 '24 21:06 cyliang368

This PR failed on four tests that ran t.rast.what in parallel. Here is a part of its detail:

======================================================================
FAIL: test_row_stdout_where_parallel_cat (__main__.TestRasterWhat)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "temporal/t.rast.what/testsuite/test_what.py", line 273, in test_row_stdout_where_parallel_cat
    self.assertLooksLike(text, str(t_rast_what.outputs.stdout))
  File "etc/python/grass/gunittest/case.py", line 202, in assertLooksLike
    self.fail(self._formatMessage(msg, standardMsg))
AssertionError: "cat|x|y|start|end|value
1|[115](https://github.com/OSGeo/grass/actions/runs/9669690114/job/26676756189?pr=3929#step:14:116).0043586274|36.3593955783|2001-04-01 00:00:00|2001-07-01 00:00:00|200
2|79.6816763826|45.2391522853|2001-04-01 00:00:00|2001-07-01 00:00:00|200
3|97.4892579600|79.2347263950|2001-04-01 00:00:00|2001-07-01 00:00:00|200
1|115.0043586274|36.3593955783|2001-07-01 00:00:00|2001-10-01 00:00:00|300
2|79.6816763826|45.2391522853|2001-07-01 00:00:00|2001-10-01 00:00:00|300
3|97.4892579600|79.2347263950|2001-07-01 00:00:00|2001-10-01 00:00:00|300
1|115.0043586274|36.3593955783|2001-10-01 00:00:00|2002-01-01 00:00:00|400
2|79.6816763826|45.2391522853|2001-10-01 00:00:00|2002-01-01 00:00:00|400
3|97.4892579600|79.2347263950|2001-10-01 00:00:00|2002-01-01 00:00:00|400
" does not correspond with "cat|x|y|start|end|value
1|115.0043586274|36.3593955783|2001-04-01 00:00:00|2001-07-01 00:00:00|200
1|115.0043586274|36.3593955783|2001-07-01 00:00:00|2001-10-01 00:00:00|300
1|115.0043586274|36.3593955783|2001-10-01 00:00:00|2002-01-01 00:00:00|400
2|79.6816763826|45.2391522853|2001-04-01 00:00:00|2001-07-01 00:00:00|200
2|79.6816763826|45.2391522853|2001-07-01 00:00:00|2001-10-01 00:00:00|300
2|79.6816763826|45.2391522853|2001-10-01 00:00:00|2002-01-01 00:00:00|400
3|97.4892579600|79.2347263950|2001-04-01 00:00:00|2001-07-01 00:00:00|200
3|97.4892579600|79.2347263950|2001-07-01 00:00:00|2001-10-01 00:00:00|300
3|97.4892579600|79.2347263950|2001-10-01 00:00:00|2002-01-01 00:00:00|400
"

----------------------------------------------------------------------
Ran 17 tests in 38.755s
FAILED (failures=4)
========================================================================
FAILED ./temporal/t.rast.what/testsuite/test_what.py (4 tests failed)

In the test, the args in SimpleModule still go through the parser I modified in this PR. Although OpenMP is not supported, this Python module (t.rast.what) can still be parallelized by subprocess in Python. Without OpenMP, nprocs=4 is passed to this Python module before this PR, but nprocs is changed to 1 in this PR before it is passed to this Python module. That's why the tests failed.

Original: t.rast.what nprocs=4 -> parser (nothing happens) -> nprocs=4 main function in Python This PR: t.rast.what nprocs=4 -> parser (nprocs is set to 1) -> nprocs=1 main function in Python

https://github.com/OSGeo/grass/blob/093895168e8e67595eceb00a747d837157b0085a/temporal/t.rast.what/testsuite/test_what.py#L275-L301

My questions are:

  1. Should we let Python modules run in parallel by subprocess when OpenMP is not supported?
  2. If we want to do so, how can I avoid the situation I mentioned above?

@HuidaeCho @wenzeslaus @marisn Do you have any suggestions?

cyliang368 avatar Jun 27 '24 02:06 cyliang368

Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.

Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765

petrasovaa avatar Jun 27 '24 19:06 petrasovaa

Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.

I checked and found that every C module parallelized by OpenMP has the keyword "parallel", which I think it should. Thus, I use this keyword to distinguish whether a multithread/multiprocess is spawned by C or Python.

Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765

The parser runs after options are defined, so the function can also handle an environment variable.

cyliang368 avatar Jun 27 '24 21:06 cyliang368

Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.

I checked and found that every C module parallelized by OpenMP has the keyword "parallel", which I think it should. Thus, I use this keyword to distinguish whether a multithread/multiprocess is spawned by C or Python.

That's probably not a good strategy, the parallel keyword was meant for any parallelization, not just openmp, we would have to have a new keyword, but I am not sure using a keyword is good idea anyway.

Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765

The parser runs after options are defined, so the function can also handle an environment variable.

petrasovaa avatar Jun 28 '24 06:06 petrasovaa

Perhaps the function should be called in the particular C tools...

That I think is a good approach. Other values require this approach too. RGB colors are one example, but even the current string provided to nprocs needs to be converted to integer.

wenzeslaus avatar Jun 28 '24 06:06 wenzeslaus

I left the parser unchanged and just added a helper function instead. A C module can call this function if it is needed.

That's probably not a good strategy, the parallel keyword was meant for any parallelization, not just openmp, we would have to have a new keyword, but I am not sure using a keyword is good idea anyway.

Though this PR does not use the keyword now, I think new keywords should be considered. As you said, parallelizations could be done by different libraries. New keywords to indicate which libraries/methods are used can help with documentation and maintenance.

cyliang368 avatar Jun 28 '24 19:06 cyliang368

New keywords to indicate which libraries/methods are used can help with documentation and maintenance.

Agreed. I think this will work well when we add description for each keyword, an extended version of what we have for topics (topic keywords).

wenzeslaus avatar Jun 29 '24 15:06 wenzeslaus

Should the function just return number of threads as an integer?

You are right. I make it return the value now, and it won't change the answer of the nprocs option.

Can you please provide a breakdown how this will behave with the default value for nprocs and the possibility to change that using g.gisenv or settings in GUI? (Previously mentioned by @petrasovaa)

Let's take r.texture as an example.

  1. The option objects are defined here in main.c under r.texture. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/raster/r.texture/main.c#L107-L113

  2. The G_define_standard_option function is called. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/lib/gis/parser_standard_options.c#L139

  3. The environment variable is fetched here by G_getenv_nofatal. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/lib/gis/parser_standard_options.c#L757-L770

  4. Then, the G_parser is called after all option and flag are defined. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/raster/r.texture/main.c#L166-L167

I put the helper function in G_parser in the previous commits. Even if the environment variable changes, the parser will get it and handle it. However, we want particular C modules instead of the parser to call it, so it does not matter now. That's my understanding. I hope this is what you need.

cyliang368 avatar Jun 29 '24 18:06 cyliang368

I'm totally fine without over-subscription checking. I removed it in the new commit.

cyliang368 avatar Aug 02 '24 00:08 cyliang368

@HuidaeCho Given how you were involved in this PR, please review or remove yourself from the review.

Otherwise, this is ready to merge.

wenzeslaus avatar Aug 05 '24 18:08 wenzeslaus

@HuidaeCho Given how you were involved in this PR, please review or remove yourself from the review.

Otherwise, this is ready to merge.

He has addressed all my comments and the PR looks good. Sorry for the delay!

HuidaeCho avatar Aug 05 '24 18:08 HuidaeCho