Python bindings build failure on Windows: Generating Lazy Tensor Core IR Nodes
Generating Lazy Tensor Core IR Nodes fails on Windows, specifically in `build_tools\autogen_ltc_backend.py`.
It fails at multiple points, so I list the issues and fixes step by step:
1. `subprocess.check_output` behavior on Windows

`subprocess.check_output` requires `shell=True` on Windows when invoking a shell built-in command:
```
[101/142] Generating Lazy Tensor Core IR Nodes
FAILED: tools/torch-mlir/generated_backend.hash tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/RegisterLazy.cpp tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/shape_inference.cpp D:/packages/torch-mlir/build/tools/torch-mlir/generated_backend.hash D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/RegisterLazy.cpp D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/shape_inference.cpp
cmd.exe /C "cd /D D:\packages\torch-mlir\build\tools\torch-mlir\python\torch_mlir\csrc\base_lazy_backend && C:\Users\ryuta\AppData\Local\Programs\Python\Python39\python.exe D:/packages/torch-mlir/torch-mlir/build_tools/autogen_ltc_backend.py -b D:/packages/torch-mlir/build/tools/torch-mlir"
Traceback (most recent call last):
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 537, in <module>
    main(args)
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 498, in main
    generator()
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 481, in __call__
    self.generate_native_functions()
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 230, in generate_native_functions
    for op in subprocess.check_output(
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
```
https://github.com/llvm/torch-mlir/blob/c0630da678ea015ca8757340633c654d5539adac/build_tools/autogen_ltc_backend.py#L221-L225
This can be fixed by adding `shell=True`:

```python
op[6:]
for op in subprocess.check_output(
    cmd,
    encoding="utf-8",
    shell=True,
)
```
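For context, a minimal sketch of the underlying behavior (the command here is illustrative, not taken from the build script): on Windows, `CreateProcess` can only launch real executable files, so shell built-ins fail with `FileNotFoundError` unless the command is routed through the shell.

```python
import subprocess

# Without shell=True, the arguments are passed straight to CreateProcess on
# Windows, which cannot resolve shell built-ins and raises FileNotFoundError.
# With shell=True the command string is handed to cmd.exe (or /bin/sh on POSIX).
out = subprocess.check_output("echo hello", shell=True, encoding="utf-8")
print(out.strip())  # → hello
```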
2. `grep` on Windows (or lack thereof)

This one is pretty simple: Windows does not have `grep`:
```
[103/142] Generating Lazy Tensor Core IR Nodes
FAILED: tools/torch-mlir/generated_backend.hash tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/RegisterLazy.cpp tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/shape_inference.cpp D:/packages/torch-mlir/build/tools/torch-mlir/generated_backend.hash D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/RegisterLazy.cpp D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/shape_inference.cpp
cmd.exe /C "cd /D D:\packages\torch-mlir\build\tools\torch-mlir\python\torch_mlir\csrc\base_lazy_backend && C:\Users\ryuta\AppData\Local\Programs\Python\Python39\python.exe D:/packages/torch-mlir/torch-mlir/build_tools/autogen_ltc_backend.py -b D:/packages/torch-mlir/build/tools/torch-mlir"
'grep' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 537, in <module>
    main(args)
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 498, in main
    generator()
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 481, in __call__
    self.generate_native_functions()
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 230, in generate_native_functions
    for op in subprocess.check_output(
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['grep', '-o', 'aten::[0-9a-zA-Z_\\.]\\+', 'D:\\packages\\torch-mlir\\torch-mlir\\include\\torch-mlir\\Dialect\\Torch\\IR\\GeneratedTorchOps.td']' returned non-zero exit status 1.
```
https://github.com/llvm/torch-mlir/blob/c0630da678ea015ca8757340633c654d5539adac/build_tools/autogen_ltc_backend.py#L215-L218
It is possible to define an equivalent command for both `powershell.exe` and `cmd.exe`:
```python
if psutil.Process(os.getpid()).name() == "powershell.exe":
    cmd = ["powershell", "-Command", "& {"
        + f"(Get-Content {self.torch_ops_file} | Select-String -Pattern \"aten::[0-9a-zA-Z_\.]+\").Matches.Value"
        + "}"]
else:
    cmd = f"for /f \"tokens=7 delims=` \" %a in ('findstr /R /C:\"aten::\" {self.torch_ops_file}') do @echo %a"
```
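Alternatively, a shell-free approach would sidestep the platform differences entirely by doing the matching in Python with the `re` module. This is a hedged sketch; `find_aten_ops` is a hypothetical helper, not part of the build script:

```python
import re

# Hypothetical helper: extract the "aten::..." op names that the grep/findstr
# invocation looks for, using a portable regex instead of a shell tool.
def find_aten_ops(text: str) -> list:
    return re.findall(r"aten::[0-9a-zA-Z_.]+", text)

sample = '"aten::tanh" and "aten::add.Tensor" appear in GeneratedTorchOps.td'
print(find_aten_ops(sample))  # → ['aten::tanh', 'aten::add.Tensor']
```

Reading the `.td` file with `open(...).read()` and passing it to this function would remove the `subprocess` call and the shell detection altogether.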
3. `yaml.scanner.ScannerError`

This one was a bit confusing at first glance.
```
[104/142] Generating Lazy Tensor Core IR Nodes
FAILED: tools/torch-mlir/generated_backend.hash tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/RegisterLazy.cpp tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/shape_inference.cpp D:/packages/torch-mlir/build/tools/torch-mlir/generated_backend.hash D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/RegisterLazy.cpp D:/packages/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/shape_inference.cpp
cmd.exe /C "cd /D D:\packages\torch-mlir\build\tools\torch-mlir\python\torch_mlir\csrc\base_lazy_backend && C:\Users\ryuta\AppData\Local\Programs\Python\Python39\python.exe D:/packages/torch-mlir/torch-mlir/build_tools/autogen_ltc_backend.py -b D:/packages/torch-mlir/build/tools/torch-mlir"
Traceback (most recent call last):
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 536, in <module>
    main(args)
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 497, in main
    generator()
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 481, in __call__
    self.generate_shape_inference()
  File "D:\packages\torch-mlir\torch-mlir\build_tools\autogen_ltc_backend.py", line 341, in generate_shape_inference
    parsed_backend_yaml = parse_backend_yaml(
  File "d:\packages\pytorch\pytorch\torchgen\gen_backend_stubs.py", line 58, in parse_backend_yaml
    yaml_values = yaml.load(f, Loader=YamlLoader)
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\site-packages\yaml\__init__.py", line 81, in load
    return loader.get_single_data()
  File "C:\Users\ryuta\AppData\Local\Programs\Python\Python39\lib\site-packages\yaml\constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "yaml\_yaml.pyx", line 673, in yaml._yaml.CParser.get_single_node
  File "yaml\_yaml.pyx", line 687, in yaml._yaml.CParser._compose_document
  File "yaml\_yaml.pyx", line 731, in yaml._yaml.CParser._compose_node
  File "yaml\_yaml.pyx", line 845, in yaml._yaml.CParser._compose_mapping_node
  File "yaml\_yaml.pyx", line 729, in yaml._yaml.CParser._compose_node
  File "yaml\_yaml.pyx", line 808, in yaml._yaml.CParser._compose_sequence_node
  File "yaml\_yaml.pyx", line 860, in yaml._yaml.CParser._parse_next_event
yaml.scanner.ScannerError: while scanning a simple key
  in "D:\packages\torch-mlir\build\tools\torch-mlir\generated_native_functions.yaml", line 41, column 1
could not find expected ':'
  in "D:\packages\torch-mlir\build\tools\torch-mlir\generated_native_functions.yaml", line 42, column 1
```
It turns out `os.linesep` on Windows is `\r\n`,

https://github.com/llvm/torch-mlir/blob/c0630da678ea015ca8757340633c654d5539adac/build_tools/autogen_ltc_backend.py#L221-L227

so `split(os.linesep)` fails to split the string returned by the command, which is delimited by `\n`. Removing `os.linesep` (splitting on `\n` instead) fixes the problem.
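The mismatch can be illustrated with a small sketch (the sample output string is made up for clarity; on Windows `os.linesep` is `"\r\n"` while the captured command output uses `"\n"`):

```python
# The command output is "\n"-delimited, but os.linesep on Windows is "\r\n",
# so splitting on os.linesep leaves the whole output as a single element.
output = "aten::tanh\naten::add\n"

broken = output.split("\r\n")  # what split(os.linesep) does on Windows
fixed = output.split("\n")     # split on "\n" (or use output.splitlines())

print(len(broken))   # → 1 (no split happened)
print(fixed[:2])     # → ['aten::tanh', 'aten::add']
```

`str.splitlines()` is a newline-agnostic alternative that handles both `\n` and `\r\n`.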
Please feel free to submit PRs for these. Thanks for getting the Windows build going.

I unfortunately do not have any Windows machines to help debug or test this, but like @powderluv said, please feel free to submit a PR with the fixes and I'll be sure to review it.