Can I request tensorflow-gpu without AVX but with support for Compute Capability 3.0, using CUDA 9.0 and cuDNN 7?

Open TaakoMagnusen opened this issue 7 years ago • 55 comments

The newer tensorflows are built with AVX but support Compute Capability 3.0. Tensorflow 1.5.0 does not use AVX but does not support Compute Capability 3.0 (requires 3.5).

I am in a strange place where my CPU cannot use AVX but I have a compute capability 3.0 card. If I pip install tensorflow-gpu, it crashes on import, apparently because it executes an AVX instruction during import, even though I don't need my CPU since I will be using the GPU.
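
To confirm which instruction sets a CPU is actually missing before picking a wheel, something like the following could help. It is a minimal sketch that assumes a Linux machine exposing /proc/cpuinfo; the helper names `cpu_flags` and `missing_features` are made up for illustration and are not part of TensorFlow.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, wanted=("avx", "sse4_1", "sse4_2")):
    """List the wanted feature names that are absent from the flag set."""
    return [name for name in wanted if name not in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        flags = cpu_flags(path.read_text())
        print("missing:", missing_features(flags) or "none")
    else:
        print("/proc/cpuinfo not found; this sketch is Linux-only")
```

If `avx` shows up as missing, a stock tensorflow-gpu wheel built with AVX would be expected to crash on import, as described above.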

I tried installing from source using a devel docker image but it was still compiling over 24 hours later; it didn't give me the option to choose the compute capability, so I think it was attempting them all.

TaakoMagnusen avatar Jun 23 '18 16:06 TaakoMagnusen

I'm in a similar predicament: I have a decent graphics card (NVIDIA GTX 1080) but an older Xeon without the AVX instructions.

schrepfler avatar Jul 23 '18 22:07 schrepfler

You don't need to use a docker image. You can directly follow instructions under https://www.tensorflow.org/install/install_sources on your machine. The build took a couple of hours in my case.

maxhgerlach avatar Jul 26 '18 14:07 maxhgerlach

I'm afraid that even TensorFlow 1.10 built from source with CUDA/cuDNN support still crashes (core dumped). Here's the culprit: predictions = model_reloaded.predict(xdata[8100:8101,...],batch_size=1, verbose = 0)

Xeon E5520, GTX960+4Gb, cuda 9.2/cudnn, ubuntu 16.04

jeanpat avatar Oct 31 '18 15:10 jeanpat

@evdcush: You seem to be able to build these wheels. Could you possibly take a look at this? TensorFlow 1.12.0 GPU (CUDA 9.0, cuDNN 7.1), Python 3.6.*, Ubuntu 18.04 would be really cool. No AVX, SSE4 or MKL. I'd even throw in $5, if it helps :P

I'm using an Intel Pentium G4560 CPU and GTX 650 graphics card.

codesoap avatar Nov 23 '18 17:11 codesoap

Feel free to download and test https://github.com/schrepfler/tensorflow-community-wheels/releases/tag/v1.12.0 Was built by my friend @pasikon

schrepfler avatar Nov 24 '18 12:11 schrepfler

Thanks for the heads up, but it seems the wheel doesn't support Cuda compute capability 3.0. I get the following warning: Ignoring visible gpu device (device: 0, name: GeForce GTX 650, pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 6.1.
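
The two numbers in that warning can be pulled apart to see how far off the card is. This is only a sketch that parses the exact message quoted above; the regular expressions are assumptions based on that one line, and other TensorFlow versions may word it differently. The helper name `parse_capabilities` is hypothetical.

```python
import re

LOG_LINE = (
    "Ignoring visible gpu device (device: 0, name: GeForce GTX 650, "
    "pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute "
    "capability 3.0. The minimum required Cuda capability is 6.1."
)


def _as_tuple(match):
    """Convert a 'major.minor' regex match into a comparable (major, minor) tuple."""
    return int(match.group(1)), int(match.group(2))


def parse_capabilities(message):
    """Return (device_cc, minimum_cc) parsed from the warning, or None if absent."""
    device = re.search(r"compute capability: (\d+)\.(\d+)", message)
    minimum = re.search(r"minimum required Cuda capability is (\d+)\.(\d+)", message)
    if not (device and minimum):
        return None
    return _as_tuple(device), _as_tuple(minimum)


device_cc, minimum_cc = parse_capabilities(LOG_LINE)
print("device:", device_cc, "minimum:", minimum_cc, "usable:", device_cc >= minimum_cc)
```

Comparing the tuples rather than the raw strings keeps the check correct for two-digit versions.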

codesoap avatar Nov 24 '18 19:11 codesoap

I see, not sure about that as I have a 1080, sorry.

schrepfler avatar Nov 24 '18 19:11 schrepfler

@codesoap For 1.12 the minimum compute capability is 3.5. You need to use some older TF version.

pasikon avatar Nov 25 '18 11:11 pasikon

@pasikon Do you have a source on that? Upon research I found wheels of tensorflow 1.12.0 for Windows with Cuda compute capability 3.0 support here.

I guess anyone who is affected by this issue could try to get tensorflow-gpu working with these builds, if they had a Windows machine...

codesoap avatar Nov 25 '18 13:11 codesoap

@codesoap I'm not 100% sure, but the compiler configuration defaults to CC 3.5-7 and compilation for 3.0 is failing. Have you tried TF 1.4.0? It's built without AVX, and the default tf-gpu wheel might work for your card. I mean, do you really need 1.12?

pasikon avatar Nov 26 '18 09:11 pasikon

Sorry for the delayed response. I have now tried older tensorflow releases and finally got something to work:

  • tensorflow-gpu 1.3.0
  • keras 2.0.9
  • cuda-8.0
  • cuDNN 6.0

Got a ~7x speedup over my CPU (with tensorflow 1.3.0). Quite nice. Thanks for your help, @pasikon! It would still be nice to have a more recent build, that utilizes the GPU on my machine, but the current solution is much better than using CPU only.

codesoap avatar Dec 01 '18 16:12 codesoap

Hello, my configuration is:

  • CPU: i7 920 (no AVX)
  • GPU: GTX 670 (compute capability 3.0)
  • CUDA: 8.0

Until a few days ago I used tensorflow-gpu 1.4, which worked with no issues, but now I need some features of the latest versions (namely new gradients that have been defined). I set up a build environment inside an nvidia/cuda docker container and successfully built a TensorFlow 1.12 wheel (CUDA 9.0, compute capability 3.0, no AVX), but even though I built it with 3.0 support I still get the "minimum 3.5 required" runtime error. Shouldn't it warn the user when compiling for an architecture that is not supported? It gave me no errors during configure and build. Is that something I can address in some way?

edoardogiacomello avatar Dec 03 '18 10:12 edoardogiacomello

I'm in a similar situation. Setup:

  • Intel Core i5-4670K 3.4 GHz Quad-Core
  • Gigabyte GeForce GTX 770 (compute capability 3.0)
  • Cuda 9.0.176
  • cuDNN 7.0
  • nccl 2.3.x
  • Ubuntu 18.04

Most recently, I've tried the Docker GPU build: https://www.tensorflow.org/install/source#gpu_support_2, but that doesn't compile for compute capability 3.0. I've also tried building from source according to the official instructions with:

  • bazel 0.13.x
  • tensorflow r1.8
  • Cuda 9.0.176
  • cuDNN 7.4.x
  • NCCL 2.3.x

And another couple tries with this guide: https://stackoverflow.com/questions/39023581/tensorflow-cuda-compute-capability-3-0-the-minimum-required-cuda-capability-is/50592978. Note that the versions for cuDNN and NCCL that I used were slightly more recent. Would that really make a difference?

mbaroody avatar Dec 03 '18 15:12 mbaroody

It seems like a lot of people are in the same situation as I am. Probably had some old hardware lying around and a GPU and wanted to get TensorFlow up and running. Is there any reason AVX needs to be enabled if you're doing GPU only? Does it offload some work which is needed? It would be great if the powers that be would provide a build that we can pip install, like tensorflow-gpu-no-avx. I know there are some people compiling it, and I have tried without any luck.

thoughts?

gateway avatar Jan 07 '19 21:01 gateway

Good idea

jeanpat avatar Jan 08 '19 12:01 jeanpat

It's not only a matter of AVX; I have compiled and run TF 1.12 with no AVX without any problem. The only problem here is the compute capability (CC) of the old GPUs people have; it seems that for the new TF versions the minimum CC version is 6.0. Why don't you just try an older one like 1.4, which doesn't require any AVX? (And maybe it requires a lower CC.) For those who want 1.12 with no AVX and GPU CC 6.0 or above, there is this one that I compiled and it's working: https://github.com/schrepfler/tensorflow-community-wheels/releases/tag/v1.12.0

pasikon avatar Jan 08 '19 12:01 pasikon

It's not only a matter of AVX; I have compiled and run TF 1.12 with no AVX without any problem. The only problem here is the compute capability (CC) of the old GPUs people have; it seems that for the new TF versions the minimum CC version is 6.0. Why don't you just try an older one like 1.4, which doesn't require any AVX? (And maybe it requires a lower CC.) For those who want 1.12 with no AVX and GPU CC 6.0 or above, there is this one that I compiled and it's working: https://github.com/schrepfler/tensorflow-community-wheels/releases/tag/v1.12.0

Thanks, I did take the time to look around, and eventually I just ordered a Core i5 for my motherboard (needed some extra CPU juice for deepflow2 and deepmatching, since some of the stuff I needed was CPU intensive). Cheers, and hopefully people find your link and are able to use the latest 1.12.

gateway avatar Jan 08 '19 19:01 gateway

You can specify the CUDA compute capabilities at build time. I built no-AVX TF 1.12 with CUDA compute capabilities 3.0, 3.5 and 6.1 on Linux, Mac and Windows in the past 5 days, using CUDA 10 and cuDNN 7.4.
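
For anyone wanting to try the same, here is a rough sketch of how those choices could be passed to the source build non-interactively. It assumes the TensorFlow 1.x `./configure` script reads the environment variables `TF_NEED_CUDA`, `TF_CUDA_COMPUTE_CAPABILITIES` and `CC_OPT_FLAGS`; the `-mno-avx` compiler flag is only an illustrative choice, not necessarily what was used for the builds mentioned above, and the helper names are hypothetical. The code just assembles and prints the command prefix; it does not run the build.

```python
import shlex


def configure_env(capabilities, cc_opt_flags="-mno-avx"):
    """Build the (assumed) environment variables for TensorFlow 1.x's ./configure."""
    return {
        "TF_NEED_CUDA": "1",
        # Comma-separated list of GPU architectures to compile kernels for.
        "TF_CUDA_COMPUTE_CAPABILITIES": ",".join(capabilities),
        # Compiler optimisation flags; illustrative value that disables AVX.
        "CC_OPT_FLAGS": cc_opt_flags,
    }


def as_shell_prefix(env):
    """Render the variables as a VAR=value prefix for a shell command."""
    return " ".join(f"{key}={shlex.quote(value)}" for key, value in sorted(env.items()))


env = configure_env(["3.0", "3.5", "6.1"])
print(as_shell_prefix(env), "./configure")
```

Each extra compute capability in the list adds compile time, so listing only the cards you own should shorten the build.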

samhodge avatar Feb 13 '19 10:02 samhodge

You can specify the CUDA compute capabilities at build time. I built no-AVX TF 1.12 with CUDA compute capabilities 3.0, 3.5 and 6.1 on Linux, Mac and Windows in the past 5 days, using CUDA 10 and cuDNN 7.4.

Can you provide binaries for Windows and Mac, please?

dimka11 avatar Feb 21 '19 10:02 dimka11

If I get time I can do that, where can I upload?

samhodge avatar Feb 21 '19 10:02 samhodge

If I get time I can do that, where can I upload?

GitHub

dimka11 avatar Feb 21 '19 10:02 dimka11

OK I will see what I can do in the morning it’s 8:52 pm in Adelaide. I hope to make the upload by 1pm tomorrow.

samhodge avatar Feb 21 '19 10:02 samhodge

OK I will see what I can do in the morning it’s 8:52 pm in Adelaide. I hope to make the upload by 1pm tomorrow.

thanks

dimka11 avatar Feb 21 '19 10:02 dimka11

I'm still very interested in this binary for non-AVX CPUs and compute capability 3.0 on Linux.

TaakoMagnusen avatar Feb 21 '19 14:02 TaakoMagnusen

Sorry I didn’t get time to do this today it was a very busy day. I will check the packages now

samhodge avatar Feb 22 '19 06:02 samhodge

Or choose your files https://github.com/samhodge/tensorflow-community-wheels/upload# Yowza, that’s a big file. Try again with a file smaller than 25MB.

Any ideas?

samhodge avatar Feb 27 '19 05:02 samhodge

Try via git tools or the command line instead of the web interface.

dimka11 avatar Feb 27 '19 05:02 dimka11

OK

samhodge avatar Feb 27 '19 05:02 samhodge

I am uploading to my private google drive.

samhodge avatar Feb 27 '19 05:02 samhodge

Nah forget that I am uploading to github via the CLI

samhodge avatar Feb 27 '19 06:02 samhodge

This will take some time the rate is 24Kb/s

samhodge avatar Feb 27 '19 06:02 samhodge

Sorry, this is CUDA 10, but that upgrade doesn't depend on CUDA compute capability or AVX/AVX2, so it should be good for older hardware.

samhodge avatar Feb 27 '19 06:02 samhodge

Bless you for your efforts!

TaakoMagnusen avatar Feb 27 '19 06:02 TaakoMagnusen

this exceeds GitHub's file size limit of 100.00 MB

samhodge avatar Feb 27 '19 08:02 samhodge

I will go back to the original plan of my Google Drive.

samhodge avatar Feb 27 '19 08:02 samhodge

mac is done https://drive.google.com/open?id=1GT0TUH_q96vDEU0cjC6bccqDdgkMjFdf Python 3.6 CUDA 10.0 CUDNN 7.4

samhodge avatar Feb 27 '19 08:02 samhodge

mac is done https://drive.google.com/open?id=1GT0TUH_q96vDEU0cjC6bccqDdgkMjFdf

Thank you very much

dimka11 avatar Feb 27 '19 16:02 dimka11

@samhodge so you also have the Linux version?

TaakoMagnusen avatar Feb 27 '19 16:02 TaakoMagnusen

I believe I do

samhodge avatar Feb 27 '19 19:02 samhodge

here is the Win64 version, tested on Windows 10 with CUDA 10 and CUDNN 7.4.22 https://drive.google.com/open?id=1dWlAqVqcCZmH3q3vefo8ohVhDxBTRp0H

samhodge avatar Feb 27 '19 20:02 samhodge

I need to build it out; it shouldn't be too long for the Linux version.

samhodge avatar Feb 27 '19 20:02 samhodge

I forgot about the extra layers in the Linux build, missing ASM for BoringSSL being one of them. Working on that now; not sure what the outcome will be regarding NCCL.

samhodge avatar Feb 27 '19 23:02 samhodge

Seems like I have made it work. Don't use this for anything mission critical: not having the assembly optimisations of BoringSSL seems to be a bit of a security risk, but without AVX instructions it is the only option. I hit another snag with SWIG, but I continue on.

samhodge avatar Feb 28 '19 00:02 samhodge

I have run it about five times, seems like it cannot find libcublas.so.10.0 even though it is right there in the LD_LIBRARY_PATH, I adjusted the string and will try again.
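
When a library "is right there" but still is not found, it can help to list which directories on the search path really contain the file. The following is a small sketch; `find_library` is a made-up helper, and it only inspects an LD_LIBRARY_PATH-style string, whereas the real dynamic loader also consults other locations such as its cache and embedded run paths.

```python
import os


def find_library(name, search_path):
    """Return every directory in an LD_LIBRARY_PATH-style string that contains `name`."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print(find_library("libcublas.so.10.0", path) or "not found on LD_LIBRARY_PATH")
```

An empty result with the file present on disk would suggest the variable is not set in the environment the Python process actually runs in.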

samhodge avatar Feb 28 '19 05:02 samhodge

Got it uploading now.

samhodge avatar Feb 28 '19 07:02 samhodge

https://drive.google.com/file/d/18uLlegoLagk3PDUJkXJZpfs0cBfxuBrX/view?usp=sharing Linux Python 3.6 CUDA 10.0 CUDNN 7.4.22

samhodge avatar Feb 28 '19 07:02 samhodge

You are a god! Can't wait to try this install this weekend

TaakoMagnusen avatar Feb 28 '19 14:02 TaakoMagnusen

https://drive.google.com/file/d/18uLlegoLagk3PDUJkXJZpfs0cBfxuBrX/view?usp=sharing Linux Python 3.6 CUDA 10.0 CUDNN 7.4.22

I tried to build on an i5 760; it took 5 hours and still failed. Everything works fine after using your build. Thank you very much!

lthquy avatar Mar 15 '19 11:03 lthquy

Thanks for the feedback, to be honest I don’t think NCCL is sorted in the MacOS version.

samhodge avatar Mar 15 '19 19:03 samhodge

https://drive.google.com/file/d/18uLlegoLagk3PDUJkXJZpfs0cBfxuBrX/view?usp=sharing Linux Python 3.6 CUDA 10.0 CUDNN 7.4.22

samhodge, you are my hero. It works perfectly on a Pentium g4400 that doesn't have avx.

markldn avatar Apr 01 '19 21:04 markldn

@markldn

I am only too glad it helped someone out.

samhodge avatar Apr 02 '19 00:04 samhodge

Sam Many thanks... 2013 GTX 670/4 (CC3) and an old 2008 Dell T5400 Dual Xeon Workstation E5430 with 16GB RAM 18.04 Bionic... still quite capable )) Nice work!

edaustin avatar Apr 22 '19 03:04 edaustin

Awesome news @edaustin

samhodge avatar Apr 22 '19 03:04 samhodge

Oh problem ))))

2019-04-22 13:42:12.573196: F tensorflow/core/platform/cpu_feature_guard.cc:37] The TensorFlow library was compiled to use SSE4.2 instructions, but these aren't available on your machine.

Followed by a crash!

Could you modify the source file tensorflow/core/platform/cpu_feature_guard.cc and rebuild, so that the FATAL is changed to WARNING, or better still nothing? ))))

Many thanks!

edaustin avatar Apr 22 '19 12:04 edaustin

Tensorflow-GPU 1.13.2 for Python 3.6 Linux_X86_64 (built in Ubuntu 18.04)

  • No AVX. No SSE4.1. No SSE4.2.
  • CUDA 10.0. CUDNN 7.6.4. Compute Capability 5.2.

Google Drive Download: https://drive.google.com/file/d/1N75Iu_8_zS2D4w4Dzs0xYcp9y1pvVgU3/view?usp=sharing

  • Currently building same, but with SSE4.1 and SSE4.2. Will share once done ...

Marx-Melencio avatar Mar 23 '21 15:03 Marx-Melencio