n2v
GPU training doesn't work?
Hi, thanks for your nice code.
When I train the model, it runs on the CPU rather than the GPU, which makes training quite slow.
I've installed tensorflow-gpu 1.14.0 and keras 2.2.5, and the environment works fine with other projects (they can train on the GPU). Is there any configuration we need to set explicitly to make the GPU work? Thanks!
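Before changing any configuration, it can help to confirm whether TensorFlow sees the GPU at all from inside the environment used for training. The snippet below is a minimal sketch, not part of n2v: `gpu_report` is a hypothetical helper name, and it assumes TensorFlow 1.14 or newer, where the device-listing call lives either under `tf.config` or `tf.config.experimental`.

```python
import importlib


def gpu_report():
    """Return (tensorflow_version, gpu_names); (None, []) if TF cannot be imported."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None, []
    # tf.config.list_physical_devices exists in TF >= 2.1; older releases
    # (1.14 to 2.0) expose the same call under tf.config.experimental.
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return tf.__version__, [device.name for device in lister("GPU")]


version, gpus = gpu_report()
print("TensorFlow:", version, "| GPUs visible:", gpus)
```

An empty GPU list here usually points to a CUDA/cuDNN or driver mismatch in the environment rather than to anything n2v-specific.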
name: n2vv2
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=1_gnu
- abseil-cpp=20210324.2=h9c3ff4c_0
- absl-py=0.15.0=pyhd8ed1ab_0
- aiohttp=3.7.4.post0=py39h3811e60_1
- argon2-cffi=21.1.0=py39h3811e60_2
- astunparse=1.6.3=pyhd8ed1ab_0
- async-timeout=3.0.1=py_1000
- async_generator=1.10=py_0
- attrs=21.2.0=pyhd8ed1ab_0
- backcall=0.2.0=pyh9f0ad1d_0
- backports=1.0=py_2
- backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
- bleach=4.1.0=pyhd8ed1ab_0
- blinker=1.4=py_1
- brotlipy=0.7.0=py39h3811e60_1003
- c-ares=1.18.1=h7f98852_0
- ca-certificates=2021.10.8=ha878542_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cachetools=4.2.4=pyhd8ed1ab_0
- certifi=2021.10.8=py39hf3d152e_1
- cffi=1.15.0=py39h4bc2ebd_0
- chardet=4.0.0=py39hf3d152e_2
- click=8.0.3=py39hf3d152e_1
- cryptography=35.0.0=py39h95dcef6_2
- cudatoolkit=11.3.1=ha36c431_9
- cudnn=8.2.1.32=h86fa8c9_0
- cupti=11.3.1=0
- dataclasses=0.8=pyhc8e2a94_3
- debugpy=1.5.1=py39he80948d_0
- decorator=5.1.0=pyhd8ed1ab_0
- defusedxml=0.7.1=pyhd8ed1ab_0
- entrypoints=0.3=pyhd8ed1ab_1003
- gast=0.4.0=pyh9f0ad1d_0
- giflib=5.2.1=h36c2ea0_2
- google-auth=1.35.0=pyh6c4a22f_0
- google-auth-oauthlib=0.4.6=pyhd8ed1ab_0
- google-pasta=0.2.0=pyh8c360ce_0
- grpc-cpp=1.39.1=h850795e_1
- grpcio=1.39.0=py39hff7568b_0
- h5py=3.1.0=nompi_py39h25020de_100
- hdf5=1.10.6=nompi_h6a2412b_1114
- icu=68.2=h9c3ff4c_0
- idna=2.10=pyh9f0ad1d_0
- importlib-metadata=4.8.1=py39hf3d152e_1
- importlib_resources=5.4.0=pyhd8ed1ab_0
- ipykernel=6.4.2=py39hef51801_0
- ipython=7.29.0=py39hef51801_1
- ipython_genutils=0.2.0=py_1
- jedi=0.18.0=py39hf3d152e_3
- jinja2=3.0.2=pyhd8ed1ab_0
- jpeg=9d=h36c2ea0_0
- jsonschema=4.2.1=pyhd8ed1ab_0
- jupyter_client=7.0.6=pyhd8ed1ab_0
- jupyter_core=4.9.1=py39hf3d152e_0
- jupyterlab_pygments=0.1.2=pyh9f0ad1d_0
- keras=2.6.0=pyhd8ed1ab_0
- keras-preprocessing=1.1.2=pyhd8ed1ab_0
- krb5=1.19.2=hcc1bbae_3
- ld_impl_linux-64=2.36.1=hea4e1c9_2
- libblas=3.9.0=12_linux64_openblas
- libcblas=3.9.0=12_linux64_openblas
- libcurl=7.79.1=h2574ce0_1
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=h516909a_1
- libffi=3.4.2=h9c3ff4c_4
- libgcc-ng=11.2.0=h1d223b6_11
- libgfortran-ng=11.2.0=h69a702a_11
- libgfortran5=11.2.0=h5c6108e_11
- libgomp=11.2.0=h1d223b6_11
- liblapack=3.9.0=12_linux64_openblas
- libnghttp2=1.43.0=h812cca2_1
- libopenblas=0.3.18=pthreads_h8fe5266_0
- libpng=1.6.37=h21135ba_2
- libprotobuf=3.16.0=h780b84a_0
- libsodium=1.0.18=h36c2ea0_1
- libssh2=1.10.0=ha56f1ee_2
- libstdcxx-ng=11.2.0=he4da1e4_11
- libzlib=1.2.11=h36c2ea0_1013
- markdown=3.3.4=pyhd8ed1ab_0
- markupsafe=2.0.1=py39h3811e60_1
- matplotlib-inline=0.1.3=pyhd8ed1ab_0
- mistune=0.8.4=py39h3811e60_1005
- multidict=5.2.0=py39h3811e60_1
- nbclient=0.5.4=pyhd8ed1ab_0
- nbconvert=6.2.0=py39hf3d152e_0
- nbformat=5.1.3=pyhd8ed1ab_0
- nccl=2.11.4.1=hdc17891_0
- ncurses=6.2=h58526e2_4
- nest-asyncio=1.5.1=pyhd8ed1ab_0
- notebook=6.4.5=pyha770c72_0
- numpy=1.19.5=py39hdbf815f_2
- oauthlib=3.1.1=pyhd8ed1ab_0
- openssl=1.1.1l=h7f98852_0
- opt_einsum=3.3.0=pyhd8ed1ab_1
- packaging=21.0=pyhd8ed1ab_0
- pandoc=2.16.1=h7f98852_0
- pandocfilters=1.5.0=pyhd8ed1ab_0
- parso=0.8.2=pyhd8ed1ab_0
- pexpect=4.8.0=pyh9f0ad1d_2
- pickleshare=0.7.5=py_1003
- pip=21.3.1=pyhd8ed1ab_0
- prometheus_client=0.12.0=pyhd8ed1ab_0
- prompt-toolkit=3.0.22=pyha770c72_0
- protobuf=3.16.0=py39he80948d_0
- ptyprocess=0.7.0=pyhd3deb0d_0
- pyasn1=0.4.8=py_0
- pyasn1-modules=0.2.7=py_0
- pycparser=2.21=pyhd8ed1ab_0
- pygments=2.10.0=pyhd8ed1ab_0
- pyjwt=2.3.0=pyhd8ed1ab_0
- pyopenssl=21.0.0=pyhd8ed1ab_0
- pyparsing=3.0.5=pyhd8ed1ab_0
- pyrsistent=0.18.0=py39h3811e60_0
- pysocks=1.7.1=py39hf3d152e_4
- python=3.9.7=hb7a2778_3_cpython
- python-dateutil=2.8.2=pyhd8ed1ab_0
- python-flatbuffers=1.12=pyhd8ed1ab_1
- python_abi=3.9=2_cp39
- pyu2f=0.1.5=pyhd8ed1ab_0
- pyzmq=22.3.0=py39h37b5a0c_1
- re2=2021.09.01=h9c3ff4c_0
- readline=8.1=h46c0cb4_0
- requests=2.25.1=pyhd3deb0d_0
- requests-oauthlib=1.3.0=pyh9f0ad1d_0
- rsa=4.7.2=pyh44b312d_0
- scipy=1.7.1=py39hee8e79c_0
- send2trash=1.8.0=pyhd8ed1ab_0
- setuptools=58.5.3=py39hf3d152e_0
- six=1.15.0=pyh9f0ad1d_0
- snappy=1.1.8=he1b5a44_3
- sqlite=3.36.0=h9cd32fc_2
- tensorboard=2.6.0=pyhd8ed1ab_1
- tensorboard-data-server=0.6.0=py39h95dcef6_1
- tensorboard-plugin-wit=1.8.0=pyh44b312d_0
- tensorflow=2.6.0=cuda112py39h9dc3950_2
- tensorflow-base=2.6.0=cuda112py39h0b4cdfd_2
- tensorflow-estimator=2.6.0=cuda112py39heacc632_2
- termcolor=1.1.0=py_2
- terminado=0.12.1=py39hf3d152e_1
- testpath=0.5.0=pyhd8ed1ab_0
- tk=8.6.11=h27826a3_1
- tornado=6.1=py39h3811e60_2
- traitlets=5.1.1=pyhd8ed1ab_0
- typing-extensions=3.7.4.3=0
- typing_extensions=3.7.4.3=py_0
- tzdata=2021e=he74cb21_0
- urllib3=1.26.7=pyhd8ed1ab_0
- wcwidth=0.2.5=pyh9f0ad1d_2
- webencodings=0.5.1=py_1
- werkzeug=2.0.1=pyhd8ed1ab_0
- wheel=0.37.0=pyhd8ed1ab_1
- wrapt=1.12.1=py39h3811e60_3
- xz=5.2.5=h516909a_1
- yarl=1.7.2=py39h3811e60_1
- zeromq=4.3.4=h9c3ff4c_1
- zipp=3.6.0=pyhd8ed1ab_0
- zlib=1.2.11=h36c2ea0_1013
- pip:
- csbdeep==0.6.3
- cycler==0.11.0
- imagecodecs==2021.8.26
- kiwisolver==1.3.2
- matplotlib==3.4.3
- pillow==8.4.0
- ruamel-yaml==0.17.17
- ruamel-yaml-clib==0.2.6
- tifffile==2021.11.2
- tqdm==4.62.3
prefix: /home/tokariew/.local/share/conda/envs/n2vv2
With the conda environment above, GPU training works for me on Linux with an NVIDIA GPU; hope it helps.
I installed n2v from GitHub and edited setup.py to bump the Keras version.
The most recent N2V version requires TF2. Could you try this combination:
conda create -n n2v_env python=3.7
conda activate n2v_env
conda install cudatoolkit=10.1 cudnn
pip install tensorflow==2.3
pip install n2v
pip install jupyter
I also added "X:\anaconda3\envs\n2v_env\Library\bin" to the system PATH. It works very well on Win10.
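Since the recipes in this thread pair a pip-installed TensorFlow with a conda-installed cudatoolkit, it may be worth checking which CUDA and cuDNN versions the installed TensorFlow wheel was built against. The following is a rough sketch: `cuda_build_info` is a hypothetical helper name, and it relies on `tf.sysconfig.get_build_info()`, which is only available in TensorFlow 2.3 and later.

```python
import importlib


def cuda_build_info():
    """Return the CUDA/cuDNN versions TensorFlow was built against, or {} if unknown."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return {}
    # get_build_info() was added in TF 2.3; older releases do not have it.
    get_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_info is None:
        return {}
    info = get_info()
    # Some keys may be missing on CPU-only builds, so use .get() throughout.
    return {
        "is_cuda_build": info.get("is_cuda_build"),
        "cuda_version": info.get("cuda_version"),
        "cudnn_version": info.get("cudnn_version"),
    }


print(cuda_build_info())
```

If the reported CUDA version differs from the `cudatoolkit` version installed with conda, TensorFlow may fail to load the GPU libraries and silently fall back to the CPU.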
I got a new GPU and could only use tensorflow==2.2, which was very slow, or tensorflow==1.15, which was slow:
conda create -n n2v python=3.7
conda activate n2v
conda install cudatoolkit=10.0 cudnn=7.6 tensorflow-estimator==1.15.1 keras==2.2.4 tensorflow-gpu==1.15
pip install n2v==0.2.1
Edit: I found a fast solution for CUDA 11.5 + TensorFlow 1.15:
conda create -n n2v python=3.8
conda activate n2v
pip install nvidia-pyindex
pip install nvidia-tensorflow
pip install nvidia-tensorboard
pip install n2v==0.2.1
cf. https://github.com/NVIDIA/tensorflow
Side note: this is on Ubuntu 20.04.
Edit 2: for TensorFlow 1.15, adding this to the notebook helps suppress noisy warnings and stops TensorFlow from allocating all GPU memory up front:
import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all at start-up.
conf = tf.compat.v1.ConfigProto()
conf.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=conf)
# Only log errors; hide info and warning messages.
tf.compat.v1.logging.set_verbosity('ERROR')
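For the TF2 environments mentioned elsewhere in this thread, a roughly equivalent setup can be sketched as follows. This is an illustration rather than something n2v requires: `enable_memory_growth` is a hypothetical helper name, and the calls assume a TensorFlow version that provides `tf.config.experimental.set_memory_growth`.

```python
import importlib


def enable_memory_growth():
    """Enable on-demand GPU memory allocation; return how many GPUs were configured."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return 0
    # Only log errors; hide info and warning messages.
    tf.get_logger().setLevel("ERROR")
    configured = 0
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except (RuntimeError, ValueError):
            # Memory growth has to be set before the GPU is first used.
            pass
    return configured


print("GPUs configured for memory growth:", enable_memory_growth())
```

Run this at the top of the notebook, before any model is built, since the setting cannot be changed once the GPU has been initialized.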
I am using TF2 on Win11 with Anaconda: python==3.9, tensorflow==2.7, CUDA 11.8, cuDNN 8.7. Refer to the author's README for the other environment requirements.