scikit-build-core icon indicating copy to clipboard operation
scikit-build-core copied to clipboard

Support semicolon-separated lists in SKBUILD_CMAKE_ARGS

Open vyasr opened this issue 1 year ago • 13 comments

This is essentially the converse of https://github.com/scikit-build/scikit-build-core/pull/727. The logic for splitting up items in SKBUILD_CMAKE_ARGS does not account for the possibility of entries that are themselves semicolon-separated lists. Consider

SKBUILD_CMAKE_ARGS="-DCMAKE_PREFIX_PATH='/path/to/a;/path/to/b';-DCMAKE_MODULE_PATH='/path/to/c;/path/to/d'" python -m pip wheel /path/to/project

vyasr avatar Oct 08 '24 16:10 vyasr

This was recently implemented: https://github.com/scikit-build/scikit-build-core/pull/921. How it works for environment variable definition, that I am not so sure though, can you test it against the main branch?

LecrisUT avatar Oct 08 '24 16:10 LecrisUT

Sure I can test that, although I wouldn't expect that to work. That is implementing the behavior of CMake defines, not the more general case of arbitrary CMake arguments. That being said, if that works it would address the example I showed above since both CMAKE_PREFIX_PATH and CMAKE_MODULE_PATH could be handled via SKBUILD_CMAKE_DEFINE, which #921 would fix. It wouldn't work if I wanted to pass something like --debug-find as a CMake argument, though.

vyasr avatar Oct 08 '24 19:10 vyasr

It wouldn't work if I wanted to pass something like --debug-find as a CMake argument, though.

Why wouldn't it? Like passing multiple args doesn't work or passing a value with semicolon doesn't work? If the former that would seem like a bug, at least through my reading of the documentation, didn't check the implementation. If the latter, we should check what usecases other than SKBUILD_CMAKE_DEFINE it applies to.

Btw CMAKE_PREFIX_PATH and CMAKE_MODULE_PATH could be problematic. I don't think we properly forward those options to append/prepend with the scikit-build-core defined ones, we should discuss how these should be handled, probably in conjunction with #880

LecrisUT avatar Oct 08 '24 19:10 LecrisUT

Apologies I thought I included information on this in the original issue. The problem is on this line for parsing out the environment variable. The split on the semicolon does not account for the possibility of the semicolon being inside quotes. For example, if I take the C API example from the getting started page and add

message("CMAKE_PREFIX_PATH=${CMAKE_PREFIX_PATH}")                                                                                                                                     
message("CMAKE_MODULE_PATH=${CMAKE_MODULE_PATH}")     

right after the python_add_library call in CMakeLists.txt, here's what I get (eliding unnecessary details with ...):

(scikit_build_core_dev) vyasr-dt% SKBUILD_CMAKE_ARGS="-DCMAKE_PREFIX_PATH='/path/to/a;/path/to/b';-DCMAKE_MODULE_PATH='/path/to/c;/path/to/d'"  python -m pip wheel --disable-pip-version-check . -v --no-build-isolation
Processing /home/vyasr/local/testing/skbc/skbuild_cmake_args
...
*** Configuring CMake...
  CMake Warning:
    Ignoring extra path from command line:

     "/path/to/d'"
...
  -- Found Python: /home/vyasr/miniconda3/envs/scikit_build_core_dev/bin/python (found version "3.10.13") found components: Interpreter Development.Module
  CMAKE_PREFIX_PATH='/path/to/a
  CMAKE_MODULE_PATH='/path/to/c
  -- Configuring done (0.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/tmp0z7d375p/build
...
Successfully built example

vyasr avatar Oct 18 '24 20:10 vyasr

I think you're probably right that CMAKE_PREFIX_PATH and CMAKE_MODULE_PATH aren't good examples here though because we may not support setting them in this way without conflicting with scikit-build-core's methods of setting them. The more general question is, how do you set any variable via SKBUILD_CMAKE_ARGS that could be a list? It doesn't have to be a built-in CMake variable or property, it could be something that is specific to your build e.g. SKBUILD_CMAKE_ARGS="-DMY_VAR='a;b;c;'".

vyasr avatar Oct 18 '24 21:10 vyasr

But for the case of SKBUILD_CMAKE_ARGS we want to encourage the usage of SKBUILD_CMAKE_DEFINE and we have documented that an escape \ is needed ^1 although indeed an example for passing a list and list-of-list is missing. Would be pretty straightforward to expand the documentation there. Defining a specific environment variable to listen to should be supported though:

[tool.scikit-build.cmake.define]
SOME_DEFINE = {env="SOME_DEFINE", default="EMPTY"}

But I think what you really want there is to parse the quotation marks in -D<Var>="<Value>" and pass the contents as-is. Could be quite a complication to do the escaping, wonder if there is some more clever way of doing it. Maybe using argparse and white space as separators instead?

LecrisUT avatar Oct 21 '24 09:10 LecrisUT

Before @henryiii updated #727 to use shlex (which was a great solution in that case) I was using a regex to capture this. In this instance maybe that's the only way forward? Regular expressions do seem like the right tool in this case; I am skeptical that any parsing engine we tried to build would be any more foolproof or less complicated.

vyasr avatar Oct 30 '24 23:10 vyasr

What about deprecating the ; split syntax? Internally it should work fine if we change the splitting there to also be shlex. Or we could do both, try a regex for the ; split and then do shlex on top.

Do you have a regex101 link of the regex you think of using to test some edge cases?

LecrisUT avatar Oct 31 '24 04:10 LecrisUT

How about something like this?

In [25]: import re
    ...: cmake_args = "-DCMAKE_PREFIX_PATH='/path/to/a;/path/to/b';-DCMAKE_MODULE_PATH='/path/to/c;/path/to/d'"
    ...: [x for x in re.split('''((?:[^;"']|"[^"]*"|'[^']*')+)''', cmake_args) if (stripped := x.strip()) and not re.match(";+", stripped)]
Out[25]: 
["-DCMAKE_PREFIX_PATH='/path/to/a;/path/to/b'",
 "-DCMAKE_MODULE_PATH='/path/to/c;/path/to/d'"]

https://regex101.com/r/auHBbH/1

vyasr avatar Oct 31 '24 21:10 vyasr

I'm not sure what syntax you're suggesting to deprecate since the core syntax is CMake's and not SKBC's.

vyasr avatar Oct 31 '24 21:10 vyasr

The deprecation would be in your example the ; between -DCMAKE_PREFIX_PATH and -DCMAKE_MODULE_PATH

LecrisUT avatar Oct 31 '24 22:10 LecrisUT

Since I just opened and closed #944, let me add to this:

  1. Use of CMAKE_PREFIX_PATH is required in certain package managers like Nix, Guix, Spack where every package (python or not) lives in its own prefix, and pip is not used to setup the build dependencies, but Nix/Guix/Spack is.
  2. The good old way of shlex.split(os.environ.get("SKBUILD_CMAKE_DEFINE")) is both simpler and more robust, this is what scikit-build did.

I guess a better solution would be in the style of #944 where parsing is avoided by passing -Ccmake.define.FOO=BAR. For --debug-find (which I would really like to be able to use...) that does not suffice.

Can I instead have -Ccmake.arg=--debug-find and -Ccmake.arg=-DFOO:BOOL=BAR etc?

haampie avatar Nov 12 '24 21:11 haampie

The last question on here was answered in https://github.com/scikit-build/scikit-build-core/issues/944#issuecomment-2471594487 and below.

vyasr avatar Apr 01 '25 21:04 vyasr