mach-nix
pytorchWithCuda possible?
In the following code, `torch` resolves to the package `pytorch`. Is there a way to make it resolve to `pytorchWithCuda` instead?
devShell = mach-nix.mkPythonShell rec {
requirements = ''
torch
jupyterlab
torchvision
matplotlib
'';
providers = {
torch = "nixpkgs";
};
};
A complete example is given here
It uses `pytorchWithCuda` but yields the error:
Automatic extraction of 'pname' from python package source /nix/store/hfgww019flw6byvczdiwcz2qm13fgg7zpython3.7-pytorch-1.7.1 failed.
Please manually specify 'pname'
I succeeded in getting an environment with a GPU-enabled pytorch with something like this:
myPython = (pkgs.python37.withPackages (p: with p; [
pytorchWithCuda
jupyterlab
torchvision
matplotlib
])).override (_ : { ignoreCollisions = true; });
myShell = pkgs.mkShell rec {
buildInputs = [
myPython
pkgs.conda
];
shellHook = ''
jupyter lab --notebook-dir=~/
'';
};
However, it doesn't use mach-nix.
The complete code is here (same repo, different branch)
I'm not sure how to supply the desired `pname` or why this issue pops up.
torch seems to be one of the packages that need more effort than other PyPI packages.
You could just use `overridesPost` in mach-nix to enable CUDA for pytorch:
let
mach-nix = import (
builtins.fetchGit {
url = "https://github.com/DavHau/mach-nix/";
ref = "refs/tags/3.1.1";
}
) {
python = "python37";
};
in
mach-nix.mkPython {
requirements = ''
torch
'';
providers.torch = "nixpkgs";
overridesPost = [(curr: prev: {
torch = prev.torch.override {
cudaSupport = true;
};
})];
}
I cannot verify, since I don't have a GPU available right now.
Should we add this to the examples.md?
With mach-nix 3.1.1 / 3.2.0, this results in:
-- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT - Success
Traceback (most recent call last):
File "/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/build/source/tools/codegen/gen.py", line 14, in <module>
from tools.codegen.model import *
File "/build/source/tools/codegen/model.py", line 39, in <module>
@dataclass(frozen=True)
File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 950, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 801, in _process_class
for name, type in cls_annotations.items()]
File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 801, in <listcomp>
for name, type in cls_annotations.items()]
File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 659, in _get_field
if (_is_classvar(a_type, typing)
File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 550, in _is_classvar
return type(a_type) is typing._ClassVar
AttributeError: module 'typing' has no attribute '_ClassVar'
--
CMake Error at cmake/Codegen.cmake:202 (message):
Failed to get generated_cpp list
Call Stack (most recent call first):
caffe2/CMakeLists.txt:2 (include)
-- Configuring incomplete, errors occurred!
See also "/build/source/build/CMakeFiles/CMakeOutput.log".
See also "/build/source/build/CMakeFiles/CMakeError.log".
Traceback (most recent call last):
File "setup.py", line 717, in <module>
build_deps()
File "setup.py", line 313, in build_deps
cmake=cmake)
File "/build/source/tools/build_pytorch_libs.py", line 59, in build_caffe2
rerun_cmake)
File "/build/source/tools/setup_helpers/cmake.py", line 329, in generate
self.run(args, env=my_env)
File "/build/source/tools/setup_helpers/cmake.py", line 141, in run
check_call(command, cwd=self.build_dir, env=env)
File "/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '-GNinja', '-DBUILD_DOCS=', '-DBUILD_NAMEDTENSOR=1', '-DBUILD_PYTHON=True', '-DBUILD_TEST=True', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INCLUDE_PATH=/nix/store/x6fbphhsaj71ssb4dhy3xpr0v69sxafq-blas-3-dev/include:/nix/store/7y59s8b8mi42w1v7p2qmwcd5jbj57sqz-openblas-0.3.12-dev/include:/nix/store/cvalgpm8km90bc2rxd1xzbjz4a6srvky-numactl-2.0.14/include:/nix/store/i7nss4wvb4i3458zccbpq9lf478mhz49-libffi-3.3-dev/include:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/include:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/include', '-DCMAKE_INSTALL_PREFIX=/build/source/torch', '-DCMAKE_LIBRARY_PATH=/nix/store/fm4y1vb1kjf8z1zy82mmhz88dj7dff36-blas-3/lib:/nix/store/njh6a13rlg0nq4hhvarw8smzzjr6jjq5-openblas-0.3.12/lib:/nix/store/cvalgpm8km90bc2rxd1xzbjz4a6srvky-numactl-2.0.14/lib:/nix/store/dzwq4mnbaj5f30gkrc80618l6xlbzwdj-libffi-3.3/lib:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib:/nix/store/cnzvlisa06k5rl9a9z8zmjkn7ba6bp2n-libGL-1.3.2/lib:/nix/store/4s6bvh4c39fb1ihd0966j6f72zirf576-libICE-1.0.10/lib:/nix/store/y20lx0kjwf3slxknrc40zdifjlfh4ijh-libSM-1.2.3/lib:/nix/store/cmzbw5bk1yva5zk4y61jjz9l3190q7a5-libX11-1.6.12/lib:/nix/store/622b1nj4bqhx8vl56215vp7b7apxn5px-libXext-1.3.4/lib:/nix/store/0k979a89ix8xz02jid2g475pwwclzp0c-libXrender-0.9.10/lib:/nix/store/a6rnjp15qgp8a699dlffqj94hzy1nldg-glibc-2.32/lib:/nix/store/0ds5gvys9awz8ab2mybyfhy7532yrhxa-glib-2.66.2/lib:/nix/store/7vaig2i7pna01zygjx4ij3ci5phyhlan-ncurses-6.2-abi5-compat/lib:/nix/store/51hq0xxp9nng3xxfz7dpkhb9lzy7sz84-gcc-9.3.0-lib/lib', 
'-DCMAKE_PREFIX_PATH=/nix/store/x6fbphhsaj71ssb4dhy3xpr0v69sxafq-blas-3-dev:/nix/store/fm4y1vb1kjf8z1zy82mmhz88dj7dff36-blas-3:/nix/store/7y59s8b8mi42w1v7p2qmwcd5jbj57sqz-openblas-0.3.12-dev:/nix/store/njh6a13rlg0nq4hhvarw8smzzjr6jjq5-openblas-0.3.12:/nix/store/cvalgpm8km90bc2rxd1xzbjz4a6srvky-numactl-2.0.14:/nix/store/1s1jrg4c78psbv2jzwz7s168z1sbk9bf-python3.7-cffi-1.14.3-dev:/nix/store/i7nss4wvb4i3458zccbpq9lf478mhz49-libffi-3.3-dev:/nix/store/dzwq4mnbaj5f30gkrc80618l6xlbzwdj-libffi-3.3:/nix/store/6giaa06r2dj7hnmlrdp8i3707m02ypx0-python3.7-pycparser-2.20:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9:/nix/store/39pg386bn9v8s7k8vxilg6qw1sbyg05n-python3.7-cffi-1.14.3:/nix/store/d7pmiypga59psnv6g3rax0yw2f2yy6f7-python3.7-click-7.1.2:/nix/store/w6fpl4v7s0ychkg9vrzr3lal5nqmzm8b-python3.7-numpy-1.19.4:/nix/store/cnzvlisa06k5rl9a9z8zmjkn7ba6bp2n-libGL-1.3.2:/nix/store/4s6bvh4c39fb1ihd0966j6f72zirf576-libICE-1.0.10:/nix/store/y20lx0kjwf3slxknrc40zdifjlfh4ijh-libSM-1.2.3:/nix/store/cmzbw5bk1yva5zk4y61jjz9l3190q7a5-libX11-1.6.12:/nix/store/622b1nj4bqhx8vl56215vp7b7apxn5px-libXext-1.3.4:/nix/store/0k979a89ix8xz02jid2g475pwwclzp0c-libXrender-0.9.10:/nix/store/a6rnjp15qgp8a699dlffqj94hzy1nldg-glibc-2.32:/nix/store/0ds5gvys9awz8ab2mybyfhy7532yrhxa-glib-2.66.2:/nix/store/7vaig2i7pna01zygjx4ij3ci5phyhlan-ncurses-6.2-abi5-compat:/nix/store/51hq0xxp9nng3xxfz7dpkhb9lzy7sz84-gcc-9.3.0-lib:/nix/store/ys476m00hh1c6maaq3zpqpfqahacwyp3-python3.7-PyYAML-5.3.1:/nix/store/c3620zk6b3qy42yxv070jngarif0lr9c-python3.7-typing-extensions-3.7.4.3:/nix/store/s434dgl2vx5w1nfjvhypgmrnv2di0zjp-python3.7-Pillow-7.2.0:/nix/store/cxj8cvc7rf2jh792cjfsdjps35lprjgs-python3.7-olefile-0.46:/nix/store/8fpdmdk0sm5alpigxmf0smig0ca6bljk-python3.7-six-1.15.0:/nix/store/axj1zmza6m6w3kdj2nnnjwmklklzz3x7-python3.7-future-0.18.2:/nix/store/jprkahny25dzapj0z7i4kj3cv0rm3hj3-python3.7-tensorflow-tensorboard-1.15.0:/nix/store/xy2kprv7l768jqsazpbyx86sv8nw
hjdm-python3.7-Werkzeug-1.0.1:/nix/store/9b1dd71knik83rjjvzgxgf41syi7wf1w-python3.7-itsdangerous-1.1.0:/nix/store/0j51lpyxyyjjqz43najzcaqic08na9ck-python3.7-protobuf-3.14.0-dev:/nix/store/fa5bycnsfkdlms8vv6z594ydvnyc6p4l-python3.7-google-apputils-0.4.2:/nix/store/d5b4mls4fjh1i5yc9851cl84hbyljcpq-python3.7-pytz-2020.1:/nix/store/y490dbqkxh8rxxnp5v56kpic25fc9n5n-python3.7-python-gflags-3.1.2:/nix/store/ky0dgzjxa5d03wxapb4w47g07rfnh27i-python3.7-python-dateutil-2.8.1:/nix/store/ngk96p2yjwf6l5221vh4665hk2d357qm-python3.7-setuptools_scm-4.1.2:/nix/store/6pl98g9cxlwnxh9lzhs88bsznlvrxgzg-python3.7-mox-0.5.3:/nix/store/gkbd359wym4dzxs5jcpb30ma3rdb042s-python3.7-protobuf-3.14.0:/nix/store/c88vj9z0wgzpswi9jwakfawcjkbd236i-python3.7-Markdown-3.2.2:/nix/store/qd0xh9kqz3srdbr53pgl9s7xbdkiyc18-python3.7-setuptools-47.3.1:/nix/store/lj9iscpl3ppj2x8a0i2dl3hll03mxzyj-python3.7-importlib-metadata-1.7.0:/nix/store/zbxm6aqvsdc2vxzg766hfbq0dr71m8db-python3.7-zipp-3.1.0:/nix/store/xr55w2kx636zm7s93q8h77lv4zmcawwb-python3.7-more-itertools-8.4.0:/nix/store/myrwv9hvw93m0421krkkrx7mss394a3f-python3.7-grpcio-1.33.2:/nix/store/6v946n068m8xbrpil2a1k0k1vpbgw3sd-python3.7-absl-py-0.9.0:/nix/store/dxdd6wjbcipfzifw3ik330jym9k3y4dl-python3.7-wheel-0.34.2:/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6', '-DNUMPY_INCLUDE_DIR=/nix/store/w6fpl4v7s0ychkg9vrzr3lal5nqmzm8b-python3.7-numpy-1.19.4/lib/python3.7/site-packages/numpy/core/include', '-DPYTHON_EXECUTABLE=/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/bin/python3.7', '-DPYTHON_INCLUDE_DIR=/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/include/python3.7m', '-DPYTHON_LIBRARY=/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/libpython3.7m.so.1.0', '-DTORCH_BUILD_VERSION=1.7.0', '-DUSE_MKL=', '-DUSE_MKLDNN=1', '-DUSE_MKLDNN_CBLAS=1', '-DUSE_NUMPY=True', '-DUSE_SYSTEM_NCCL=1', '/build/source']' returned non-zero exit status 1.
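The traceback above shows `dataclasses.py` being loaded from the `python3.7-dataclasses-0.6` store path. That package is the PyPI backport intended for Python 3.6; on Python 3.7 it shadows the standard-library module and fails with exactly this `typing._ClassVar` error. Below is a minimal, untested sketch of one possible workaround. It assumes the backport reaches the build through torch's `buildInputs` or `propagatedBuildInputs`, which has not been verified here, and the helper name `dropDataclasses` is made up for illustration.

```nix
let
  mach-nix = import (
    builtins.fetchGit {
      url = "https://github.com/DavHau/mach-nix/";
      ref = "refs/tags/3.1.1";
    }
  ) {
    python = "python37";
  };

  # Hypothetical helper: remove the `dataclasses` backport from a dependency list.
  # `p.pname or ""` falls back to "" for entries that have no pname.
  dropDataclasses = deps:
    builtins.filter (p: (p.pname or "") != "dataclasses") deps;
in
mach-nix.mkPython {
  requirements = ''
    torch
  '';
  providers.torch = "nixpkgs";
  overridesPost = [(curr: prev: {
    torch = (prev.torch.override {
      cudaSupport = true;
    }).overridePythonAttrs (old: {
      # Assumption: the backport is pulled in via one of these two lists.
      buildInputs = dropDataclasses (old.buildInputs or []);
      propagatedBuildInputs = dropDataclasses (old.propagatedBuildInputs or []);
    });
  })];
}
```

If the backport enters the build some other way (for example as a dependency of another package on the path), this filter will not be enough and the offending dependency would have to be removed there instead.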
So I have torch with cuda kind of working, using the following flake:
{
inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
inputs.flake-utils.url = "github:numtide/flake-utils";
inputs.mach-nix.url = "github:DavHau/mach-nix";
inputs.yolov7.url = "github:WongKinYiu/yolov7";
inputs.yolov7.flake = false;
outputs = { self, nixpkgs, flake-utils, mach-nix, ... } @ inputs:
flake-utils.lib.eachDefaultSystem (system:
let
pkgs = import nixpkgs {
inherit system;
config = {
allowUnfree = true;
cudaSupport = true;
};
};
python = "python310";
machNix = import mach-nix {
inherit python;
inherit pkgs;
};
python-yolo-env = machNix.mkPython rec {
requirements = builtins.readFile (inputs.yolov7 + "/requirements.txt"); #builtins.readFile ./requirements.txt;
providers = {
_default = "nixpkgs";
opencv-python = "wheel";
thop = "sdist";
};
};
in
rec {
packages = {
inherit python-yolo-env;
};
devShells.default = pkgs.mkShell {
nativeBuildInputs = [ python-yolo-env ];
};
}
);
}
However, there are a few quirks/bugs:
- `torch.__version__` yields `'1.11.0'`, indicating that CUDA is not supported. However, `torch.cuda.is_available() == True`, and `torch.cuda.device_count() > 0` is true as well, i.e. despite the odd version name torch recognises CUDA.
- `torchvision`, on the contrary, does report CUDA support in its version (`torchvision.__version__ == '0.12.0+cu113'`), but it emits a warning upon the first `import torchvision`:
/nix/store/m05fwxipvjc51b411p2gj5djqz0c7apb-python3-3.10.5-env/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
warn(f"Failed to load image Python extension: {e}")
This is particularly wild, since
- `torch.__file__` reports `'/nix/store/m05fwxipvjc51b411p2gj5djqz0c7apb-python3-3.10.5-env/lib/python3.10/site-packages/torch/__init__.py'`, and
- `'/nix/store/m05fwxipvjc51b411p2gj5djqz0c7apb-python3-3.10.5-env/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so'` exists (note the difference, the `libtorch_cuda_cu.so` is **not** in the `torch/lib` directory).
Those files are generated under `torch` when it is built with `BUILD_SPLIT_CUDA=1`, similar to `BUILD_NAMEDTENSOR = setBool true;` in `torch`. (https://discuss.pytorch.org/t/no-libtorch-cuda-cpp-so-available-when-build-pytorch-from-source/159864)
So you have to set `BUILD_SPLIT_CUDA=1` using `overridePythonAttrs`.
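A rough, untested sketch of what that could look like with the flake above. It is written as a standalone function so it can be read on its own; the function arguments are illustrative, and it assumes the nixpkgs revision in use exposes the package as `torch` (older revisions call it `pytorch`).

```nix
# Hypothetical wrapper: `machNix` and `requirements` are the values
# already defined in the flake above.
{ machNix, requirements }:

machNix.mkPython {
  inherit requirements;
  providers = {
    _default = "nixpkgs";
    opencv-python = "wheel";
    thop = "sdist";
  };
  overridesPost = [
    (curr: prev: {
      # Assumption: the attribute is named `torch` in this nixpkgs revision.
      torch = prev.torch.overridePythonAttrs (old: {
        # Exported as an environment variable to the torch build.
        BUILD_SPLIT_CUDA = "1";
      });
    })
  ];
}
```

Changing this attribute forces torch to be rebuilt from source, so expect a long build. Whether `torchvision` then finds `libtorch_cuda_cu.so` still needs to be checked by importing it in the resulting environment.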