
GPU feature does not work on native Windows (v0.20.0 with TF v2.11.0)

Open dskkato opened this issue 2 years ago • 4 comments

> # clone this repository
> git checkout v0.20.0
> cargo b --features=tensorflow_gpu
    Blocking waiting for file lock on build directory
   Compiling tensorflow-sys v0.23.0 (D:\workspace\tensorflow\rust\tensorflow-sys)
error: failed to run custom build command for `tensorflow-sys v0.23.0 (D:\workspace\tensorflow\rust\tensorflow-sys)`

Caused by:
  process didn't exit successfully: `D:\workspace\tensorflow\rust\target\debug\build\tensorflow-sys-21babaa47e976cd4\build-script-build` (exit code: 101)
  --- stdout
  cargo:rerun-if-env-changed=TENSORFLOW_NO_PKG_CONFIG
  cargo:rerun-if-env-changed=PKG_CONFIG_x86_64-pc-windows-msvc
  cargo:rerun-if-env-changed=PKG_CONFIG_x86_64_pc_windows_msvc
  cargo:rerun-if-env-changed=HOST_PKG_CONFIG
  cargo:rerun-if-env-changed=PKG_CONFIG
  cargo:rerun-if-env-changed=TENSORFLOW_STATIC
  cargo:rerun-if-env-changed=TENSORFLOW_DYNAMIC
  cargo:rerun-if-env-changed=PKG_CONFIG_ALL_STATIC
  cargo:rerun-if-env-changed=PKG_CONFIG_ALL_DYNAMIC
  cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64-pc-windows-msvc
  cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64_pc_windows_msvc
  cargo:rerun-if-env-changed=HOST_PKG_CONFIG_PATH
  cargo:rerun-if-env-changed=PKG_CONFIG_PATH
  cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64-pc-windows-msvc
  cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64_pc_windows_msvc
  cargo:rerun-if-env-changed=HOST_PKG_CONFIG_LIBDIR
  cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR
  cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64-pc-windows-msvc
  cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64_pc_windows_msvc
  cargo:rerun-if-env-changed=HOST_PKG_CONFIG_SYSROOT_DIR
  cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR
  tensorflow-sys/build.rs:208: binary_url = "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-gpu-windows-x86_64-2.11.0.zip"
  tensorflow-sys/build.rs:212: base_name = "libtensorflow-gpu-windows-x86_64-2.11.0"
  tensorflow-sys/build.rs:221: file_name = "D:\\workspace\\tensorflow\\rust\\target\\debug\\build\\tensorflow-sys-8a83f4e9e3fa6695\\out\\libtensorflow-gpu-windows-x86_64-2.11.0.zip"

  --- stderr
  thread 'main' panicked at 'Unexpected response code 404 for https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-gpu-windows-x86_64-2.11.0.zip', tensorflow-sys\build.rs:235:13
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

As the build log shows, the build fails because there is no prebuilt TensorFlow binary with GPU support for Windows at v2.11.0: the download URL returns 404.
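
To make the failing request concrete, here is a minimal sketch of how such a download URL can be assembled from the version and the GPU flag. The helper `windows_prebuilt_url` is hypothetical and is not the actual code in `tensorflow-sys/build.rs`; it only reproduces the naming pattern visible in the log above, and the `cpu` variant name is an assumption.

```rust
/// Hypothetical helper, not taken from tensorflow-sys/build.rs.
/// It assembles a prebuilt libtensorflow archive URL for 64-bit Windows,
/// following the pattern printed in the build log.
fn windows_prebuilt_url(gpu: bool, version: &str) -> String {
    // Assumption: the non-GPU archive uses "cpu" in place of "gpu".
    let proc_type = if gpu { "gpu" } else { "cpu" };
    format!(
        "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-{}-windows-x86_64-{}.zip",
        proc_type, version
    )
}

fn main() {
    // The URL requested in the log above, which answered with 404.
    println!("{}", windows_prebuilt_url(true, "2.11.0"));
    // The URL that a downgrade to 2.10.0 would request instead.
    println!("{}", windows_prebuilt_url(true, "2.10.0"));
}
```

Since the archive name is derived from the version, changing `VERSION` is enough to change which file is requested.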

One possible solution is to downgrade TensorFlow to v2.10.0 as shown below, but this is usually difficult for Rust users.

diff --git a/tensorflow-sys/build.rs b/tensorflow-sys/build.rs
index 9779b2ee1..7c60916df 100644
--- a/tensorflow-sys/build.rs
+++ b/tensorflow-sys/build.rs
@@ -24,8 +24,8 @@ const REPOSITORY: &str = "https://github.com/tensorflow/tensorflow.git";
 const FRAMEWORK_TARGET: &str = "tensorflow:libtensorflow_framework";
 const TARGET: &str = "tensorflow:libtensorflow";
 // `VERSION` and `TAG` are separate because the tag is not always `'v' + VERSION`.
-const VERSION: &str = "2.11.0";
-const TAG: &str = "v2.11.0";
+const VERSION: &str = "2.10.0";
+const TAG: &str = "v2.10.0";
 const MIN_BAZEL: &str = "3.7.2";
 
 macro_rules! get(($name:expr) => (ok!(env::var($name))));

dskkato avatar Feb 25 '23 05:02 dskkato

I just asked the TensorFlow folks to fix the missing download: https://github.com/tensorflow/tensorflow/issues/59828

adamcrume avatar Feb 28 '23 03:02 adamcrume

Thank you for following up on this issue. As far as I can tell from the forum thread below, this happens because 2.10 was the last release to support CUDA on native Windows; from 2.11 onward, GPU support on Windows moved to the plugin method.

https://discuss.tensorflow.org/t/2-10-last-version-to-support-native-windows-gpu/12404

The corresponding C API is probably PluggableDevice, and I think we need to resolve the following issue first.

https://github.com/tensorflow/rust/issues/381

dskkato avatar Mar 01 '23 14:03 dskkato

Addressing #381 seems reasonable, as long as we guard it behind a feature flag that makes it clear it's experimental.

It looks like building from source or running under WSL are also options.

adamcrume avatar Mar 03 '23 04:03 adamcrume

Alternatively, Windows users can use TensorFlow 2.10 by putting both tensorflow.lib and tensorflow.dll in a directory that is on the PATH. The tensorflow-sys build script will pick them up.
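
Before running the build, it may help to confirm that both files are actually visible through PATH. The following is a minimal standalone check, not the logic used by the tensorflow-sys build script; the function name `find_tensorflow_dir` is made up for this sketch, and it assumes both files should sit in the same directory.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical diagnostic helper: returns the first directory that
/// contains both tensorflow.lib and tensorflow.dll.
fn find_tensorflow_dir<I>(dirs: I) -> Option<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
{
    dirs.into_iter().find(|dir| {
        dir.join("tensorflow.lib").is_file() && dir.join("tensorflow.dll").is_file()
    })
}

fn main() {
    // Split PATH using the platform's separator (';' on Windows).
    let path = env::var_os("PATH").unwrap_or_default();
    match find_tensorflow_dir(env::split_paths(&path)) {
        Some(dir) => println!("tensorflow.lib and tensorflow.dll found in {}", dir.display()),
        None => println!("no PATH entry contains both tensorflow.lib and tensorflow.dll"),
    }
}
```

If it reports that nothing was found, the PATH entry is worth rechecking before rebuilding.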

dskkato avatar Mar 10 '23 06:03 dskkato