tch-rs icon indicating copy to clipboard operation
tch-rs copied to clipboard

Provide a way to use libtorch from a system-wide location

Open nagisa opened this issue 3 years ago • 7 comments

So I have a system-wide installation of libtorch that is built exactly according to my needs. However there isn't a way to have torch-sys to "just" link from a system-wide location, as far as I can tell.

It also appears to attempt to infer availability of features such as cuda from availability of .so files, which can lead to obscure errors at runtime. Ideally the code would be able to say that it requires cuda to operate (via Cargo features?) and torch-sys would then unconditionally link to everything required to get cuda working. In that situation if the libraries are not available, build would fail at linking stage – much earlier than it otherwise might do today.

nagisa avatar Apr 20 '21 11:04 nagisa

Did you try using the LIBTORCH environment variable mentioned in the libtorch manual install part of the readme? The goal of this environment variable is to accomodate for such system-wide install but maybe it didn't work in your use case?

LaurentMazare avatar Apr 20 '21 12:04 LaurentMazare

I would like to avoid having to set up an environment variable. Primarily because it needs to be done every time a development environment is set-up (which can be as often as multiple times every day), and because the failure mode for failing to set the environment variable is torch-sys downloading its own copy of the libraries behind the developers' back. I understand that transparently downloading the library is for the sake of seamless first-time experience, but I would like an option to unconditionally disable this behaviour (preferably in a way that's fail-proof, which an environment variable isn't).

I also would like to avoid having to set the environment variable system-wide permanently. These are easy to forget and can be a source of nasty surprises later down the line.

As far as ideal solutions go, I would prefer a pkg-config based solution, but that's complicated by the upstream project not installing any. Second best option IME is to just blindly output -ls hoping that the libraries are there.

IMHO today there's just one way to achieve this – cargo features. I would imagine it looking a lot like this:

[features]
default = ["download-as-a-last-resort", "cuda"]
download-as-a-last-resort = []
cuda = []
...

and then people can rely somewhat more on build.rs not downloading libtorch if the crate dependencies are specified as such:

torch-sys = { version = "...", default-features=false, features = ["cuda"] }
tch = { version = "...", default-features=false, features = ["cuda"] }

etc.

nagisa avatar Apr 20 '21 12:04 nagisa

I'm also using a system-wide pytorch installation, and had to set LIBTORCH=/usr.

Perhaps just checking for libtorch.so in /usr/lib/ would be enough.

rmsc avatar Apr 21 '21 10:04 rmsc

System wide installation location doesn't necessarily have to be directly under /usr/lib, so checking for files there is probably not at all sufficient. The libraries may appear under /usr/local/lib for example. Or any other directory configured to contain system-wide libraries. Many linux systems, and in turn docker containers, utilize ldconfig to specify where to find certain libraries when loading the application. Some others may rely on DT_RPATH.

nagisa avatar Apr 21 '21 11:04 nagisa

I've submitted a PR (#344) which should allow using a system-wide installation. I've only added support for Linux, as I don't have access to the other common OS's.

@nagisa I'm not a fan of using package features for specifying what is in the end a system architecture. That IMO should be detected at build time and not hardcoded. As an example, this probably wouldn't build in my laptop (or perhaps worse, download MBs worth of unneeded stuff):

tch = { ... features = ["cuda"] }

rmsc avatar Apr 21 '21 11:04 rmsc

@nagisa that's a good point. It's fairly easy to add additional paths to the search, but perhaps ldconfig is a better alternative.

rmsc avatar Apr 21 '21 11:04 rmsc

Got it. I think the PR is an improvement either way. As far as cuda support is concerned, after I filed this issue I had a person reach out to me saying that the current rules are indeed very opaque and that they are having trouble getting cuda support to work too.

But that perhaps should be a separate issue anyway.

nagisa avatar Apr 21 '21 11:04 nagisa

Closing this as #344 has been merged for quite some time now.

LaurentMazare avatar Apr 25 '23 15:04 LaurentMazare