grass icon indicating copy to clipboard operation
grass copied to clipboard

[Bug] [help-needed] gs.create_project doesn't seem enough to get a working session on Windows

Open echoix opened this issue 1 year ago • 2 comments

Describe the bug

I've been working in the background to have pytest collect code coverage on Windows too, and since the time impact is a bit steep, I worked to have tests running with multiple workers (pytest-xdist) like our other platforms.

I found some unexpected behaviour once I got this working (expect for failures), that seems to have been going unnoticed. A test fixture to create a temporary project and passing env variables as a dict doesn't seem to work correctly.

The pattern I observed is that the first call/test to compiled module, per xdist runner (each have a separate process, and temporary directory( ), will fail with ERROR: No active GRASS session: GISRC environment variable not set but subsequent tests on the same worker will be fine.

Originally, I found that on every run, general/g.version/tests/g_version_test.py::test_c_flag would fail with the error above, but then general/g.version/tests/g_version_test.py::test_e_flag and the others wouldn't fail. To be clear, there's no reason why printing out a static copyright text would ever need to fail because of missing env vars or config files. And there's no reason that only the c flag should fail at all for g.version, and not others at the same time. So it's a good example for the issue here. It highlights an isolation problem, or where one test impacts other subsequent ones. In these first runs, I never saw a problem with the first worker (gw0), as the first tests to run (alphabetically) are expected to fail on non-Linux.

I first tried to see if initializing the fixture at each test instead of once per module. Then looking inside the gs.create_project(tmp_path, project) and with gs.setup.init(tmp_path / project, env=os.environ.copy()) as session: to see if a env copy was missing or something was permanently changing an env var when running a first time. I also tried using monkey patch to delete and replace back the GISBASE env var to see if I get another message, as this one too is used early on. It didn't seem to change anything.

So following that intuition, I added a plugin to shuffle the order of tests. If it was the first call that even if failing, would "fix" the setup for subsequent tests, I should see failures at the beginning for each worker, and then be passing after that. And it is what is happening. I also get different tests failing between each rerun of the same workflow.

For now, I didn't try adding the random plugin to other OSes to see if there's an impact. I also didn't explore to see if the differences between our Windows builds on CI and installing from the OSGeo4W grass-dev packages also impacted this, as the initial goal was to make our CI work. (I'm waiting for #4121 to cross the finish line before sending in my work that independently did the same and changed the CI to use the same "official" script, and the same used for releases. Work is done, can use the ccache already scripted, and uploads the compiled package as an artifact if we wanted to test it locally or split the test execution in a separate job).

To reproduce

  1. Change the Windows CI workflow to install pytest-xdist and pytest-randomly with pip in the OSGeo4W shell.
  2. Similar to our other workflows, run pytest with multiple workers and for tests not having the needs_solor_run marker by using the --numprocesses auto and -m "not needs_solo_run" (notice the double quotes for cmd.exe)
  3. See failures.
  4. Try without the pytest-randomly plugin to see the c_flag (copyright) test fail, the original problem.

Expected behavior

Screenshots

  • Original run:

    • https://github.com/echoix/grass/actions/runs/11194443183/job/31120924334?pr=240#step:12:36
    • https://github.com/echoix/grass/actions/runs/11194443183/job/31120924334?pr=240#step:12:378
    • image image image
  • Three reruns with random plugin installed, the third one is the most interesting at it has 16 failures instead of one or two:

    • (1) https://github.com/echoix/grass/actions/runs/11201388093/job/31136249290#step:12:37
    • (1) image image image
    • (2) https://github.com/echoix/grass/actions/runs/11201388093/job/31138753695#step:12:35
    • (2) image image image
    • (3) https://github.com/echoix/grass/actions/runs/11201388093/job/31139004569#step:12:67
    • (3) image image image image image image

System description

  • Operating System: Windows server 2019 on GitHub actions
  • GRASS GIS version: grass dev from a PR from my fork, https://github.com/echoix/grass/pull/240, last merge from main into that fork was up to day on the day of these runs, https://github.com/OSGeo/grass/commit/a16a02f7eb1aec90bcf7055858ee729e45905b67

Pytest version and plugins shown in the screenshots and action logs

Additional context

  • All of the same pytest tests are passing on Windows when running serially.

echoix avatar Oct 09 '24 16:10 echoix

Is the basic session setup working independently from pytest? I'm wondering how general versus pytest-specific this is?

wenzeslaus avatar Oct 09 '24 18:10 wenzeslaus

Of course there's pytest related, but I'm not sure how to find out what you're asking for. Do you have an idea of how to trigger something similar, and see that the cleanup isn't done?

echoix avatar Oct 09 '24 19:10 echoix

I'm not sure if this is helpful, but I also cannot get gs.create_project (and, therefore, gs.create_location) to run on my Windows laptop. When I run:

import subprocess
import sys

# Ask GRASS GIS where its Python packages are.
sys.path.append(
    subprocess.check_output([grass84, "--config", "python_path"], text=True, shell=True).strip()
)

# Import GRASS packages
import grass.script as gs
import grass.jupyter as gj

# create project
gs.create_project("test", epsg="32619")

I get the error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[7], line 1
----> 1 gs.create_project("test", epsg="32619")

File [C:\OSGeo4W\apps\grass\grass84\etc\python\grass\script\core.py:1817](file:///C:/OSGeo4W/apps/grass/grass84/etc/python/grass/script/core.py#line=1816), in create_project(path, name, epsg, proj4, filename, wkt, datum, datum_trans, desc, overwrite)
   1814     kwargs["datum_trans"] = datum_trans
   1816 if epsg:
-> 1817     ps = pipe_command(
   1818         "g.proj",
   1819         quiet=True,
   1820         flags="t",
   1821         epsg=epsg,
   1822         project=mapset_path.location,
   1823         stderr=PIPE,
   1824         env=env,
   1825         **kwargs,
   1826     )
   1827 elif proj4:
   1828     ps = pipe_command(
   1829         "g.proj",
   1830         quiet=True,
   (...)
   1836         **kwargs,
   1837     )

File [C:\OSGeo4W\apps\grass\grass84\etc\python\grass\script\core.py:510](file:///C:/OSGeo4W/apps/grass/grass84/etc/python/grass/script/core.py#line=509), in pipe_command(*args, **kwargs)
    491 """Passes all arguments to start_command(), but also adds
    492 "stdout = PIPE". Returns the Popen object.
    493 
   (...)
    507 :return: Popen object
    508 """
    509 kwargs["stdout"] = PIPE
--> 510 return start_command(*args, **kwargs)

File [C:\OSGeo4W\apps\grass\grass84\etc\python\grass\script\core.py:431](file:///C:/OSGeo4W/apps/grass/grass84/etc/python/grass/script/core.py#line=430), in start_command(prog, flags, overwrite, quiet, verbose, superquiet, **kwargs)
    425     sys.stderr.write(
    426         "D1/{}: {}.start_command(): {}\n".format(
    427             debug_level(), __name__, " ".join(args)
    428         )
    429     )
    430     sys.stderr.flush()
--> 431 return Popen(args, **popts)

File [C:\OSGeo4W\apps\grass\grass84\etc\python\grass\script\core.py:59](file:///C:/OSGeo4W/apps/grass/grass84/etc/python/grass/script/core.py#line=58), in Popen.__init__(self, args, **kwargs)
     57 cmd = shutil.which(args[0])
     58 if cmd is None:
---> 59     raise OSError(_("Cannot find the executable {0}").format(args[0]))
     60 args = [cmd] + args[1:]
     61 name, ext = os.path.splitext(cmd)

OSError: Cannot find the executable g.proj

The same error occurs for gs.create_location.

chaedri avatar Jan 10 '25 18:01 chaedri

In that setup, are any C-based modules able to run at all? Not finding a grass executable seems like the same thing as when the env vars/gisrc and all these things aren't set up. Did we have some big changes that weren't in grass84?

echoix avatar Jan 10 '25 22:01 echoix

I don't think any C-based models will run on Windows before gs.setup.init() is called by the gj.init function in the example above. I don't know about big changes that weren't in grass84 but @wenzeslaus and I added gs.create_project() in 8.4 and also wrote the test so perhaps that's why this is a new issue?

chaedri avatar Jan 13 '25 15:01 chaedri

In the same notebook, does gj.init work?

petrasovaa avatar Jan 15 '25 16:01 petrasovaa

Yes, gj.init works on an existing project.

chaedri avatar Jan 15 '25 16:01 chaedri

Can you try:

os.environ.get("GISBASE")

petrasovaa avatar Jan 15 '25 22:01 petrasovaa

A long summary (oxymoron) of what I learned about this problem/behaviour throughout the time, without opening code.

The first call to a C-based module from Python on windows will fail the test, but will have done something special somewhere that makes the following work. When doing with pytest-xdist (multiple workers), it's the first usage of a c library tool per worker that will fail.

It is often hidden because of xfail for another part of the test, and that we always run in the same order. Shuffling the order with a plugin makes this very obvious. For this, use the pytest-randomly plugin.

Later on, I've precised that the problem probably have to do with env vars. Using a c tool to set env vars won't make them set in Python. It's somewhat possible to define env vars from a parent process to a child one, but not from a child one back to the parent one. It's not really possible to "reload" env vars if they got changed externally (of python). So, using g.gisenv, or having a C tool change env vars won't have an impact back to the parent.

From another exploration, I learned that the new feature of passing the optional env vars is not always done consistently, or some subproccess calls (or what will end up in a subprocess call) doesn't always accept the optional env parameter, and having it would/might have solved some issues.

On another tangent, to help work around this, I've learned that using monkeypatch and setting again the env vars inside a gs.setup.init() context manager helps multiple issues in tests. Using monkeypatch will revert when leaving that fixture. See #4784 for an example of it used, and #5349 for having a fixture using different scope. Monkeypatching is not really solving the problem, as it's applicable in pytest.

echoix avatar Jun 19 '25 22:06 echoix

A long summary (oxymoron) of what I learned...

:-)

The first call to a C-based module from Python on windows will fail the test, but will have done something special somewhere that makes the following work. When doing with pytest-xdist (multiple workers), it's the first usage of a c library tool per worker that will fail. ... Shuffling the order with a plugin makes this very obvious. ...

What you are saying sounds convincing, although strange, but it's Windows, so why not? To address this, I created #5914.

Later on, I've precised that the problem probably have to do with env vars. Using a c tool to set env vars won't make them set in Python. ... So, using g.gisenv, or having a C tool change env vars won't have an impact back to the parent.

This is not what is happening. g.gisenv variables are different from your system variables. In the distance past, they may have used only the system ones, but exactly for the reasons you describe, someone came up with g.gisenv, GISRC system variable, and what I call session file (aka GISRC file). The documentation makes the distinction:

https://grass.osgeo.org/grass-stable/manuals/variables.html https://grass.osgeo.org/grass-devel/manuals/variables.html

Some variables probably should be moved back to be system variables, DEBUG seems like a candidate, but that's unrelated here.

From another exploration, I learned that the new feature of passing the optional env vars is not always done consistently, or some subproccess calls (or what will end up in a subprocess call) doesn't always accept the optional env parameter, and having it would/might have solved some issues.

The issue is that the places are not obvious in the code, so someone really needs to find those...even a trivial message call needs that. I was doing it so far on as needed basis. If you have a example or test where this is missing, adding it is somewhat straightforward. I think the recent use of shutil.which with path (#5717), env for debug_level (#5920), and partially also message changes (#5849) should now cover a lot. I don't know how many tests are still using os.environ even with grass.script.setup.init, but I guess that still common.

...I've learned that using monkeypatch and setting again the env vars inside a gs.setup.init() context manager helps multiple issues in tests. Using monkeypatch will revert when leaving that fixture...

That's a good workaround. Perhaps we should always have a test like this even in the future to make sure os.environ still works when we apply env=os.environ.copy() everywhere for tests.

wenzeslaus avatar Jun 20 '25 15:06 wenzeslaus

Now, we have #5717, #5849, #5854, #5875, and #5920 merged. Can you please try again @chaedri?

Also, please see what GISBASE and PATH have before calling create_project:

print(os.environ.get("GISBASE"))
print(os.environ.get("PATH"))

wenzeslaus avatar Jun 24 '25 14:06 wenzeslaus

I installed the dev version (8.5.0dev) through OSGeo4W (the daily build). I used the recommended Windows instructions for running GRASS in Jupyter Notebooks where you launch Jupyter from the OSGeo4W shell. Before calling create_project, GISBASE and PATH are, respectively:

None

C:\OSGeo4W\apps\qt5\bin;C:\OSGeo4W\apps\Python312\Scripts;C:\OSGeo4W\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\system32\WBem

create_project now works great for me though!

chaedri avatar Jun 25 '25 14:06 chaedri

Yesterday it was a holiday, so I had time, but the PRs were just merged for the last ones, so the daily build was 17h old. So I didn't have the chance yet

echoix avatar Jun 25 '25 14:06 echoix