Can I request tensorflow-gpu without AVX but with support for Compute Capability 3.0, using CUDA 9.0 and cuDNN 7?
The newer TensorFlow releases are built with AVX but support Compute Capability 3.0. TensorFlow 1.5.0 does not use AVX but does not support Compute Capability 3.0 (it requires 3.5).
I am in a strange place where my CPU cannot use AVX but I have a compute capability 3.0 card. If I pip install tensorflow-gpu, it crashes on import, apparently because the import itself executes an AVX instruction, even though I don't need my CPU since I will be using the GPU.
I tried installing from source using a devel Docker image, but it was still compiling over 24 hours later. It didn't give me the option to choose a compute capability, so I think it was attempting them all.
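For anyone unsure whether their CPU is in the same situation, here is a minimal sketch that reads the instruction-set flags Linux reports in /proc/cpuinfo. It assumes an x86 Linux machine; the helper names `parse_cpu_flags` and `missing_flags` are made up for this example and are not part of TensorFlow.

```python
from pathlib import Path

# Instruction-set flags discussed in this thread, as spelled in /proc/cpuinfo.
FLAGS_OF_INTEREST = ("avx", "avx2", "sse4_1", "sse4_2")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=FLAGS_OF_INTEREST):
    """Return the wanted flags the CPU does not report, in the order given."""
    present = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # Linux only
        print("missing:", missing_flags(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this sketch only covers Linux")
```

If `avx` shows up in the missing list, a wheel compiled with AVX would be expected to fail on that machine.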
I'm in a similar predicament: I have a decent graphics card (NVIDIA GTX 1080) but an older Xeon without the AVX instructions.
You don't need to use a docker image. You can directly follow instructions under https://www.tensorflow.org/install/install_sources on your machine. The build took a couple of hours in my case.
I'm afraid that even TensorFlow 1.10 built from source with CUDA/cuDNN support still crashes (core dumped). Here's the offending line:
predictions = model_reloaded.predict(xdata[8100:8101,...],batch_size=1, verbose = 0)
Xeon E5520, GTX960+4Gb, cuda 9.2/cudnn, ubuntu 16.04
@evdcush: You seem to be able to build these wheels. Could you possibly take a look at this? TensorFlow 1.12.0 GPU (CUDA 9.0, cuDNN 7.1), Python 3.6.*, Ubuntu 18.04 would be really cool. No AVX, SSE4 or MKL. I'd even throw in $5, if it helps :P
I'm using an Intel Pentium G4560 CPU and GTX 650 graphics card.
Feel free to download and test https://github.com/schrepfler/tensorflow-community-wheels/releases/tag/v1.12.0. It was built by my friend @pasikon.
Thanks for the heads up, but it seems the wheel doesn't support Cuda compute capability 3.0. I get the following warning:
Ignoring visible gpu device (device: 0, name: GeForce GTX 650, pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 6.1.
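To pull the two numbers out of a warning like this (for example when scanning a long log), a small parser can help. This is only a sketch: the regular expression is written against the exact wording quoted above, other TensorFlow versions may phrase the message differently, and `parse_ignored_gpu` is a hypothetical helper name.

```python
import re

# Pattern matched against the wording of the warning quoted in this thread.
_IGNORED_GPU = re.compile(
    r"name: (?P<name>[^,]+),.*?"
    r"with Cuda compute capability (?P<have>\d+\.\d+)\. "
    r"The minimum required Cuda capability is (?P<need>\d+\.\d+)\."
)


def _as_tuple(version):
    """Turn '3.0' into (3, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def parse_ignored_gpu(log_line):
    """Return the GPU name, its capability and the required minimum, or None."""
    match = _IGNORED_GPU.search(log_line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "have": _as_tuple(match.group("have")),
        "need": _as_tuple(match.group("need")),
    }


if __name__ == "__main__":
    line = (
        "Ignoring visible gpu device (device: 0, name: GeForce GTX 650, "
        "pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute "
        "capability 3.0. The minimum required Cuda capability is 6.1."
    )
    info = parse_ignored_gpu(line)
    print(info["name"], "has", info["have"], "but the build needs", info["need"])
```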
I see, not sure about that as I have a 1080, sorry.
@codesoap For 1.12 the minimum compute capability is 3.5. You need to use an older TF version.
@pasikon Do you have a source on that? Upon research I found wheels of tensorflow 1.12.0 for Windows with Cuda compute capability 3.0 support here.
I guess anyone who is affected by this issue could try to get tensorflow-gpu working with these builds, if they had a Windows machine...
@codesoap I'm not 100% sure, but the compiler configuration defaults to CC 3.5-7 and compilation for 3.0 is failing. Have you tried TF 1.4.0? It has no AVX, and the default tensorflow-gpu wheel might work for your card. I mean, do you really need 1.12?
Sorry for the delayed response. I have now tried older tensorflow releases and finally got something to work:
- tensorflow-gpu 1.3.0
- keras 2.0.9
- cuda-8.0
- cuDNN 6.0
Got a ~7x speedup over my CPU (with tensorflow 1.3.0). Quite nice. Thanks for your help, @pasikon! It would still be nice to have a more recent build, that utilizes the GPU on my machine, but the current solution is much better than using CPU only.
Hello, my configuration is:
- CPU: i7 920 (no AVX)
- GPU: GTX 670 (compute capability 3.0)
- CUDA: 8.0
Until a few days ago I used tensorflow-gpu 1.4, which worked with no issues, but now I need some features of the latest versions (namely new gradients that have been defined). I set up a build environment inside an nvidia/cuda Docker container and successfully built a TensorFlow 1.12 wheel (CUDA 9.0, compute capability 3.0, no AVX), but even though I built it with 3.0 support I still get the "minimum 3.5 required" runtime error. Shouldn't the build warn you if you are trying to compile for an architecture that is not supported? It gave me no errors during configure and build. Is that something I can address in some way?
I'm in a similar situation. Setup:
- Intel Core i5-4670K 3.4 GHz Quad-Core
- Gigabyte GeForce GTX 770 (compute capability 3.0)
- Cuda 9.0.176
- cuDNN 7.0
- nccl 2.3.x
- Ubuntu 18.04
Most recently, I've tried the Docker GPU build: https://www.tensorflow.org/install/source#gpu_support_2, but that doesn't compile for compute capability 3.0. I've also tried building from source according to the official instructions with:
- bazel 0.13.x
- tensorflow r1.8
- Cuda 9.0.176
- cuDNN 7.4.x
- NCCL 2.3.x
I also made another couple of tries with this guide: https://stackoverflow.com/questions/39023581/tensorflow-cuda-compute-capability-3-0-the-minimum-required-cuda-capability-is/50592978. Note that the versions of cuDNN and NCCL that I used were slightly more recent. Would that really make a difference?
It seems like a lot of people are in the same situation as I am: they probably had some old hardware lying around plus a GPU and wanted to get TensorFlow up and running. Is there any reason AVX needs to be enabled if you're doing GPU only? Does it offload some work that is needed? It would be great if the powers that be would provide a build that we can install with pip, something like tensorflow-gpu-no-avx. I know there are some people compiling it, and I have tried without any luck.
Thoughts?
Good idea
It's not only a matter of AVX. I have compiled and run TF 1.12 with no AVX without any problem. The only problem here is the compute capability (CC) of the old GPUs people have; it seems that for the new TF versions the minimum CC is 6.0. Why don't you just try an older one like 1.4, which doesn't require AVX (and maybe requires a lower CC)? For those who want 1.12 with no AVX and a GPU with CC 6.0 or above, there is this one that I compiled, and it's working: https://github.com/schrepfler/tensorflow-community-wheels/releases/tag/v1.12.0
Thanks, I did take the time to look around, and eventually I just ordered a Core i5 for my motherboard (I needed some extra CPU juice for deepflow2 and deepmatching, since some of the stuff I needed was CPU intensive). Cheers, and hopefully people find your link and are able to use the latest 1.12.
You can specify the CUDA compute capabilities at build time. I built no-AVX TF 1.12 with CUDA compute capabilities 3.0, 3.5 and 6.1 on Linux, Mac and Windows in the past 5 days using CUDA 10 and cuDNN 7.4.
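As a rough illustration of setting the capabilities at build time: to my knowledge the TensorFlow 1.x `./configure` script reads its answers from environment variables such as `TF_NEED_CUDA`, `TF_CUDA_COMPUTE_CAPABILITIES` and `CC_OPT_FLAGS`, but please verify the names against the `configure.py` in your own checkout. The sketch below only assembles and validates those values; `configure_env` is a hypothetical helper, and the `-march=native` default assumes you build on the non-AVX machine itself.

```python
import re

_CAPABILITY = re.compile(r"^\d+\.\d+$")


def configure_env(capabilities, cc_opt_flags="-march=native"):
    """Build environment overrides for a CUDA-enabled TensorFlow 1.x configure run.

    The variable names are assumptions based on TF 1.x configure.py;
    check them against the source tree you are building.
    """
    caps = [cap.strip() for cap in capabilities]
    bad = [cap for cap in caps if not _CAPABILITY.match(cap)]
    if bad:
        raise ValueError("not a compute capability: %s" % ", ".join(bad))
    return {
        "TF_NEED_CUDA": "1",
        "TF_CUDA_COMPUTE_CAPABILITIES": ",".join(caps),
        # With -march=native the compiler targets the build machine's CPU,
        # so a CPU without AVX should yield a build without AVX.
        "CC_OPT_FLAGS": cc_opt_flags,
    }


if __name__ == "__main__":
    for key, value in sorted(configure_env(["3.0", "3.5", "6.1"]).items()):
        print("%s=%s" % (key, value))
```

One way to use it would be to merge the result into `os.environ` and pass it as `env=` to `subprocess.run(["./configure"], ...)` from the TensorFlow source directory.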
Can you provide binaries for Windows and Mac, please?
If I get time I can do that, where can I upload?
GitHub
OK, I will see what I can do in the morning; it's 8:52 pm in Adelaide. I hope to make the upload by 1 pm tomorrow.
thanks
I'm still very interested in this binary for Linux, with no AVX instructions and compute capability 3.0.
Sorry, I didn't get time to do this today; it was a very busy day. I will check the packages now.
Uploading through the web interface at https://github.com/samhodge/tensorflow-community-wheels/upload fails with: "Yowza, that's a big file. Try again with a file smaller than 25MB."
Any ideas?
Try git tools or the command line instead of the web interface.
OK
I am uploading to my private Google Drive.
Nah, forget that, I am uploading to GitHub via the CLI.
This will take some time; the rate is 24Kb/s.
Sorry, this is CUDA 10, but that upgrade doesn't depend on CUDA compute capability or AVX/AVX2, so it should be good for older hardware.
Bless you for your efforts!
this exceeds GitHub's file size limit of 100.00 MB
I will go back to the original plan of my Google Drive.
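The two limits hit above (25MB through the web upload form, 100.00 MB when pushing with git) can be checked before starting a slow upload. A minimal sketch follows; it assumes the limits are counted in units of 1024 * 1024 bytes, which the error messages do not confirm, and `upload_route` is a made-up helper name.

```python
import os

# Limits quoted in the error messages in this thread.
# Assumption: "MB" means 1024 * 1024 bytes.
WEB_UPLOAD_LIMIT = 25 * 1024 * 1024
GIT_PUSH_LIMIT = 100 * 1024 * 1024


def upload_route(size_bytes):
    """Suggest where a file of the given size can go: 'web', 'git' or 'external'."""
    if size_bytes < WEB_UPLOAD_LIMIT:  # the message says "smaller than 25MB"
        return "web"
    if size_bytes <= GIT_PUSH_LIMIT:  # the message says "exceeds ... 100.00 MB"
        return "git"
    return "external"  # e.g. a file-sharing service, as was done here


def upload_route_for_file(path):
    """Same as upload_route, but looks up the size of an existing file."""
    return upload_route(os.path.getsize(path))
```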
mac is done https://drive.google.com/open?id=1GT0TUH_q96vDEU0cjC6bccqDdgkMjFdf Python 3.6 CUDA 10.0 CUDNN 7.4
Thank you very much
@samhodge So do you also have the Linux version?
I believe I do
here is the Win64 version, tested on Windows 10 with CUDA 10 and CUDNN 7.4.22 https://drive.google.com/open?id=1dWlAqVqcCZmH3q3vefo8ohVhDxBTRp0H
I need to build it out; it shouldn't be too long for the Linux version.
I just forgot about the extra layers to the Linux build, missing ASM for BoringSSL being one of them. I'm working on that now; not sure what the outcome will be regarding NCCL.
It seems like I have made it work. Don't use this for anything mission critical: not having the assembly optimisations of BoringSSL seems to be a bit of a security risk, but with no AVX instructions it is the only thing possible. I hit another snag with SWIG, but I continue on.
I have run it about five times. It seems like it cannot find libcublas.so.10.0 even though it is right there in the LD_LIBRARY_PATH. I adjusted the string and will try again.
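When a library seems to be "right there" but is still not found, it can help to list which LD_LIBRARY_PATH directories actually contain a file with that exact name. The sketch below does only that; it does not replicate the dynamic loader (it ignores ldconfig caches and rpaths), and `find_library` is a hypothetical helper written for this example.

```python
import os


def find_library(name, search_path=None):
    """Return every path in the search path that holds a file called `name`.

    `search_path` defaults to the LD_LIBRARY_PATH environment variable.
    Symlinks are followed, so a dangling symlink does not count as a hit.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    print(find_library("libcublas.so.10.0") or "not found on LD_LIBRARY_PATH")
```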
Got it uploading now.
https://drive.google.com/file/d/18uLlegoLagk3PDUJkXJZpfs0cBfxuBrX/view?usp=sharing Linux Python 3.6 CUDA 10.0 CUDNN 7.4.22
You are a god! Can't wait to try this install this weekend
I tried to build on an i5 760; it took me 5 hours and still failed. Everything works fine after using your build. Thank you very much!
Thanks for the feedback, to be honest I don’t think NCCL is sorted in the MacOS version.
samhodge, you are my hero. It works perfectly on a Pentium g4400 that doesn't have avx.
@markldn
I am only too glad it helped someone out.
Sam Many thanks... 2013 GTX 670/4 (CC3) and an old 2008 Dell T5400 Dual Xeon Workstation E5430 with 16GB RAM 18.04 Bionic... still quite capable )) Nice work!
Awesome news @edaustin
Oh problem ))))
2019-04-22 13:42:12.573196: F tensorflow/core/platform/cpu_feature_guard.cc:37] The TensorFlow library was compiled to use SSE4.2 instructions, but these aren't available on your machine.
Followed by a crash!
Could you modify the file tensorflow/core/platform/cpu_feature_guard.cc and rebuild,
so that the FATAL is changed to a WARNING, or better still nothing ))))
Many thanks!
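Because a failure like this kills the whole Python process, one way to test a wheel safely is to try the import in a child process and inspect how it ended. This is a sketch for POSIX systems, not something from the TensorFlow docs: `run_probe` and `probe_import` are made-up names, and which signal you see (for example SIGABRT after a fatal log, or SIGILL after an unsupported instruction) depends on how the process dies.

```python
import signal
import subprocess
import sys


def run_probe(code, timeout=120):
    """Run `code` in a fresh interpreter and describe how the process ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = "signal %d" % -proc.returncode
        return "killed by " + name
    return "exited with code %d" % proc.returncode


def probe_import(module):
    """Check whether importing `module` survives in a child process."""
    return run_probe("import " + module)


if __name__ == "__main__":
    print("tensorflow:", probe_import("tensorflow"))
```

The parent process stays alive either way, so the same script can go on to try another wheel or fall back to a CPU-only setup.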
Tensorflow-GPU 1.13.2 for Python 3.6 Linux_X86_64 (built in Ubuntu 18.04)
- No AVX. No SSE4.1. No SSE4.2.
- CUDA 10.0. cuDNN 7.6.4. Compute Capability 5.2.
Google Drive Download: https://drive.google.com/file/d/1N75Iu_8_zS2D4w4Dzs0xYcp9y1pvVgU3/view?usp=sharing
- Currently building same, but with SSE4.1 and SSE4.2. Will share once done ...