Error importing cudnn

Open theoeiferman opened this issue 4 years ago • 20 comments

When launching:

th src/train.lua -phase train -gpu_id 1 \
-model_dir model \
-input_feed -prealloc \
-data_base_dir data/sample/images_processed/ \
-data_path data/sample/train_filter.lst \
-val_data_path data/sample/validate_filter.lst \
-label_path data/sample/formulas.norm.lst \
-vocab_file data/sample/latex_vocab.txt \
-max_num_tokens 150 -max_image_width 500 -max_image_height 160 \
-batch_size 20 -beam_size 1

cudnn is not found. I tried "luarocks install cudnn" but it still doesn't work.

Screenshot 2020-06-25 at 17 20 28

theoeiferman avatar Jun 25 '20 15:06 theoeiferman

Hmm, can you try require 'cudnn' in a Torch prompt (started with the command "th") and see if that works?

da03 avatar Jun 25 '20 15:06 da03

Screenshot 2020-06-25 at 17 33 29

I have this as an answer. I'm not very familiar with the Lua language.

theoeiferman avatar Jun 25 '20 15:06 theoeiferman

Sorry, I meant entering "th" first,

then, in the prompt, entering require 'cudnn'.

da03 avatar Jun 25 '20 16:06 da03
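
The check suggested above can also be run without opening the interactive prompt. This is a minimal sketch, assuming the th launcher accepts -e to execute a Lua statement (as the stock Torch REPL does); check_torch_module is a hypothetical helper name. Note that the module name has to be quoted: a bare require cudnn is not valid Lua and fails even when the module is installed.

```shell
#!/bin/sh
# Report whether a LuaTorch module can be loaded, without entering the REPL.
check_torch_module() {
    module="$1"
    if ! command -v th >/dev/null 2>&1; then
        echo "th: not found on PATH (is Torch installed?)"
        return 2
    fi
    # stdin comes from /dev/null so th cannot wait for interactive input.
    if th -e "require '$module'" </dev/null >/dev/null 2>&1; then
        echo "$module: loads"
    else
        echo "$module: failed to load"
        return 1
    fi
}

# cutorch is checked first because cudnn depends on it.
for m in cutorch cudnn; do
    check_torch_module "$m" || true
done
```

If cutorch already fails here, the cudnn error is only a consequence, and the CUDA installation is the place to look.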

Thanks for the advice! But it still doesn't work. It seems that cutorch is also not installed. I tried "luarocks install cutorch" but the build failed.

Screenshot 2020-06-25 at 23 29 00

theoeiferman avatar Jun 25 '20 21:06 theoeiferman

Oh, I think it is some issue with installing cudnn. Installing Torch correctly can be hard; would you mind using Docker? Here is a Dockerfile that can be used directly: https://github.com/OpenNMT/OpenNMT/blob/master/Dockerfile

da03 avatar Jun 25 '20 23:06 da03

OK, I didn't know that Docker existed. Thanks for the tip, it looks great! I tried to build an image by copying the content of the Dockerfile you sent me, but I get "unable to prepare context: context must be a directory: /Users/teiferman27/Dockerfile". When I tried to launch "docker build https://github.com/OpenNMT/OpenNMT/blob/master/Dockerfile#L6" directly, I got this:

Screenshot 2020-06-28 at 21 27 52

I have one more question, in case I manage to build this Dockerfile: can I then use Lua with files on my computer, or only inside the "container"? Thank you already for the advice!

theoeiferman avatar Jun 28 '20 19:06 theoeiferman

For the first question, I think you need to put the Dockerfile inside a folder; running docker build . inside this folder will then generate the image.

For the second question, it allows using Lua inside the Docker container only.

da03 avatar Jun 28 '20 21:06 da03

BTW, I think you might need nvidia-docker (https://github.com/NVIDIA/nvidia-docker) to use GPUs inside the Docker container.

da03 avatar Jun 28 '20 21:06 da03

OK, I succeeded with docker build -t operating_lua . (the build ran all afternoon). I then tried to launch the image with docker run operating_lua, but the container just opens and closes on the Docker dashboard. Do you think plain docker doesn't support the container and I need to use nvidia-docker? Thanks for the response.

theoeiferman avatar Jun 29 '20 17:06 theoeiferman

I think it should be nvidia-docker run -it operating_lua /bin/bash, but it might be better to check the Docker documentation directly.

da03 avatar Jun 29 '20 17:06 da03
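
One way to read the answers above together: the Dockerfile goes into its own folder (the build context), and the container is started with -it so that a shell stays attached. A container started without an interactive command exits as soon as its default command finishes, which is likely why it appeared to open and close. The -v flag mounts a host folder into the container, so files on the computer can be used from inside it. This is a sketch, assuming the image tag operating_lua from this thread; the folder names and the run helper are hypothetical, and plain docker gives no GPU access.

```shell
#!/bin/sh
IMAGE="operating_lua"                         # tag used earlier in this thread
BUILD_DIR="${BUILD_DIR:-$HOME/torch-docker}"  # hypothetical folder holding the Dockerfile
HOST_DATA="${HOST_DATA:-$PWD/data}"           # hypothetical host folder to share
MOUNT_POINT="/workspace/data"                 # hypothetical path inside the container

# Print each command; execute it only when DRY_RUN is unset or empty.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

build_image() {
    # The last argument is the build context: a directory, not the Dockerfile itself.
    run docker build -t "$IMAGE" "$BUILD_DIR"
}

start_shell() {
    # -it keeps an interactive terminal attached; --rm removes the container on exit.
    run docker run -it --rm -v "$HOST_DATA:$MOUNT_POINT" "$IMAGE" /bin/bash
}
```

Running DRY_RUN=1 build_image only prints the command; calling build_image and then start_shell without DRY_RUN executes them.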

I wanted to install nvidia-docker, but I need to install the NVIDIA driver first, and it seems that this step requires a Linux operating system while I am on macOS. I was surprised, because one selling point of Docker was that anyone can run it on any operating system!

Then I tried docker run -dit operating_lua and I was able to open the container and look inside it:

Screenshot 2020-06-29 at 22 16 15

Moreover, when I checked whether the module 'cudnn' is in the system, I got:

Screenshot 2020-06-29 at 22 28 53

I am still confused about how to approach the big picture. Can I import files into the container? Do I really need nvidia-docker?

Thank you again for your time, @da03.

theoeiferman avatar Jun 29 '20 20:06 theoeiferman

Hmm, I suspect that your CUDA driver version might be outdated (what is the output of nvcc --version and nvidia-smi?), which would cause issues both for require 'cudnn' and for installing nvidia-docker. There are actually CUDA drivers available for Mac: https://www.nvidia.com/en-us/drivers/cuda/mac-driver-archive/. Fixing the driver version issue might solve all problems.

da03 avatar Jun 29 '20 20:06 da03
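
The two diagnostics asked for above can be gathered in one go. This is a minimal sketch, assuming only a POSIX shell; check_tool is a hypothetical helper. nvcc is part of the CUDA toolkit and nvidia-smi normally comes with the NVIDIA driver, so neither depends on nvidia-docker being installed.

```shell
#!/bin/sh
# Report where a command lives on PATH, or that it is missing.
check_tool() {
    tool="$1"
    if path=$(command -v "$tool" 2>/dev/null); then
        echo "$tool: found at $path"
    else
        echo "$tool: not found on PATH"
        return 1
    fi
}

# Only call each tool when it is actually present.
if check_tool nvcc; then nvcc --version; fi
if check_tool nvidia-smi; then nvidia-smi; fi
```

If both tools are reported as missing, the toolkit and driver are not installed (or not on PATH), which is a different problem from an outdated version.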

On the page https://github.com/NVIDIA/nvidia-docker, they talk about Linux:

Screenshot 2020-06-29 at 22 46 21

nvcc --version and nvidia-smi are unknown for now, but probably because I haven't installed nvidia-docker yet? I am going to install the CUDA drivers and then nvidia-docker.

theoeiferman avatar Jun 29 '20 20:06 theoeiferman

Oh no, so it seems nvidia-docker would not work on Mac... I have never used GPUs on Mac, but I think with a proper CUDA installation (https://docs.nvidia.com/cuda/cuda-installation-guide-mac-os-x/index.html), you should get both nvcc --version and nvidia-smi working.

da03 avatar Jun 29 '20 20:06 da03

I can't install it on my Mac because NVIDIA doesn't support macOS anymore. I also think my Mac may be too old. I checked my graphics card in the system information, as shown in https://www.quantstart.com/articles/Installing-Nvidia-CUDA-on-Mac-OSX-for-GPU-Based-Parallel-Computing/, but I don't have an NVIDIA graphics card in my computer.

I think the incompatibility is also mentioned here: https://developer.nvidia.com/nvidia-cuda-toolkit-developer-tools-mac-hosts

I am surprised by this CUDA/NVIDIA requirement to use the container, though.

theoeiferman avatar Jun 29 '20 22:06 theoeiferman

Oh, that explains why: this code (or cudnn) only supports CUDA and cannot run on systems without CUDA-capable (NVIDIA) GPUs. While this version (https://opennmt.net/OpenNMT-py/im2text.html, code can be found at https://github.com/OpenNMT/OpenNMT-py) supports CPU-only training, doing so would be extremely slow without the parallelism provided by GPUs. Another way might be to use a cloud computing service such as Amazon EC2, Google GCE, or Microsoft Azure, and rent a GPU instance.

da03 avatar Jun 29 '20 22:06 da03

I managed to get another computer, but its GPU is an AMD Radeon, so I can't use the cudnn module. I think this should be mentioned in the prerequisites, since Docker can't solve this hardware issue.

I was about to try the CPU, but I think the link you gave me (https://opennmt.net/OpenNMT-py/im2text.html) lists dependencies such as torchvision, and PyTorch is required (so a CUDA-enabled GPU is needed again, no?).

I tried to follow the steps from https://opennmt.net/OpenNMT-py/im2text.html, but the command onmt_preprocess is not found. Is there a step I have missed?

I will probably try cloud computing.

But just to be sure (correct me if I am wrong):

  • the OpenNMT-py project works with PyTorch (https://github.com/OpenNMT/OpenNMT-py#requirements)
  • this project works with Torch (https://github.com/harvardnlp/im2markup)

theoeiferman avatar Jul 01 '20 11:07 theoeiferman

Yes, you are right that OpenNMT-py uses PyTorch and this project uses LuaTorch. PyTorch does not require GPUs (you can do a CPU-only installation), but again, training might be extremely slow without GPUs.

For the onmt_preprocess missing issue, have you installed OpenNMT-py following the instructions here? https://github.com/OpenNMT/OpenNMT-py

da03 avatar Jul 03 '20 01:07 da03

I had issues with the onmt commands because I use a Python environment on Google Colab (you can activate a GPU in the settings, and it seems to be a good free solution). Installing OpenNMT-py with pip instead of cloning the project made the onmt commands work.

Is using Google Colab a good way to perform GPU computations? I am trying to train the model, but it takes a lot of time. Do you know how long it should take?

theoeiferman avatar Jul 20 '20 12:07 theoeiferman
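
The pip route described above can be sketched as follows. Two details are assumptions rather than something stated in the thread: the PyPI package name OpenNMT-py, and the "<2" version bound, added because onmt_preprocess belongs to the 1.x releases that were current at the time. ensure_onmt is a hypothetical helper that only reports what to do; it does not install anything itself.

```shell
#!/bin/sh
# Check for the onmt_preprocess entry point and print an install hint if it is missing.
ensure_onmt() {
    if command -v onmt_preprocess >/dev/null 2>&1; then
        echo "onmt_preprocess: available"
        return 0
    fi
    echo "onmt_preprocess: not found"
    # Assumed package name and version bound; adjust to the release you need.
    echo "try: pip install \"OpenNMT-py<2\""
    return 1
}

ensure_onmt || true
```

In a Colab notebook cell, shell commands such as the pip install line are prefixed with an exclamation mark.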

Yeah, I think so! The only problem is that the runtime is disconnected if it stays idle for a certain period of time, and the instance is then freed, so all progress would be lost. Therefore, you might want to connect to your Google Drive and save progress (checkpoints) there.

da03 avatar Jul 20 '20 13:07 da03
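
The checkpoint backup suggested above could look like this. It is a sketch under assumptions: Drive has already been mounted from a notebook cell (in Colab, from google.colab import drive followed by drive.mount('/content/drive')), checkpoints are the .pt files that OpenNMT-py writes, and the directory names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Copy every .pt checkpoint from a training directory to a backup directory.
backup_checkpoints() {
    src_dir="$1"
    dst_dir="$2"
    mkdir -p "$dst_dir" || return 1
    copied=0
    for f in "$src_dir"/*.pt; do
        [ -e "$f" ] || continue   # the glob matched nothing
        cp -p "$f" "$dst_dir"/ && copied=$((copied + 1))
    done
    echo "copied $copied checkpoint(s) to $dst_dir"
}

# Usage (hypothetical paths):
#   backup_checkpoints /content/models /content/drive/MyDrive/im2text-checkpoints
```

Calling it after each saved checkpoint, or from a periodic cell, keeps the copy on Drive reasonably fresh; training can then resume from the latest file after a disconnect.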