recognize
GPU is not being used
Hello @marcelklehr ,
I have enabled GPU support in the admin GUI, but when I start a manual run via occ recognize:classify I can see that a process is started and uses 100% of one CPU core.
The GPU is not being used.
I have installed all the specified Nvidia applications/libraries.
Hi!
Are there any messages in the nextcloud log?
Unfortunately not. The only "warning" I can see in my log is:
[recognize] Warning: Classifying photos of user 3A60C52D-9415-4F28-A2B7-71A8CBD7A9E3 at 2021-08-26T08:37:57+02:00
The only thing I can see in my shell is that www-data is running node-v14.17.4-linux-x64. This process cannot be stopped or killed, and even a reboot does not solve it. I need to reset the whole VM to get the process killed.
What I additionally see in the log (though it's not linked to my manual start of the classifying process) is: `[recognize] Warning: Classifier process output: 2021-08-26 07:59:08.434295: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. []
at 2021-08-26T07:59:08+02:00`
and:
[index] Error: Call to a member function getOwner() on null
GET /index.php/apps/recognize/admin/countMissed from 192.168.10.2 by 3A60C52D-9415-4F28-A2B7-71A8CBD7A9E3 at 2021-08-26T08:19:11+02:00
But I am not sure whether this is linked to this issue or not.
Okay, so during the night the new version became available for download, and I installed it this morning. I am now on Nextcloud 22.1.1 and Recognize 1.6.3.
When manually starting the process I get the following error message:
Classifying photos of user ED17CAA4-EC2F-4457-95AB-A5980927C9C8
Failed to classify images
Classifier process error
My log says:
[recognize] Warning: Classifier process output: 2021-08-27 06:42:20.937775: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Error: Cannot find module '@tensorflow/tfjs-node-gpu'
Require stack:
- /var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js
- /var/www/cloud/apps/recognize/src/classifier_imagenet.js
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:889:15)
    at Function.Module._load (internal/modules/cjs/loader.js:745:27)
    at Module.require (internal/modules/cjs/loader.js:961:19)
    at require (internal/modules/cjs/helpers.js:92:18)
    at Object.<anonymous> (/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js:11:9)
    at Module._compile (internal/modules/cjs/loader.js:1072:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1101:10)
    at Module.load (internal/modules/cjs/loader.js:937:32)
    at Function.Module._load (internal/modules/cjs/loader.js:778:12)
    at Module.require (internal/modules/cjs/loader.js:961:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js',
    '/var/www/cloud/apps/recognize/src/classifier_imagenet.js'
  ]
}
Trying js-only mode
internal/modules/cjs/loader.js:892
  throw err;
  ^
Error: Cannot find module '@tensorflow/tfjs-backend-wasm'
Require stack:
- /var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js
- /var/www/cloud/apps/recognize/src/classifier_imagenet.js
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:889:15)
    at Function.Module._load (internal/modules/cjs/loader.js:745:27)
    at Module.require (internal/modules/cjs/loader.js:961:19)
    at require (internal/modules/cjs/helpers.js:92:18)
    at Object.<anonymous> (/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js:19:3)
    at Module._compile (internal/modules/cjs/loader.js:1072:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1101:10)
    at Module.load (internal/modules/cjs/loader.js:937:32)
    at Function.Module._load (internal/modules/cjs/loader.js:778:12)
    at Module.require (internal/modules/cjs/loader.js:961:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js',
    '/var/www/cloud/apps/recognize/src/classifier_imagenet.js'
  ]
}
at 2021-08-27T06:42:20+02:00
So it looks like '@tensorflow/tfjs-node-gpu' and '@tensorflow/tfjs-backend-wasm' are not included in the NC app.
I've had to disable GPU for now, because the bundle would exceed the bundle size limit :/
The limitation from the Nextcloud appstore?
Yeah
Okay, would it be possible for you to create a "GitHub-only" version of it (e.g. xxx-RC1) so I can download and test it?
I'll definitely try to make something available. Currently, my problem is that I have to develop that blindly, as I don't have a GPU machine available.
If you want, you can package it for me and I will act as your alpha/beta tester.
I'm testing it with my NVIDIA GeForce GTX 1660 Super (CUDA is supported even though I couldn't find it on the list).
First I have to set up another instance. I'm using an older version where it is still integrated.
lol nextcloud apps is down :(
Now I can wait even longer
GPU support has to wait until other issues are sorted out, sorry.
Okay so for the moment I can remove all necessary Nvidia libraries (except driver)?
For the moment no NVIDIA drivers and libraries are needed, but they won't hurt either, so it's up to you.
It's just a bit complex to install different CUDA libraries/versions, which is why I am asking :-) At the moment I am sticking with CUDA 11.2, as you mentioned it for a previous version.
@marcelklehr just in case: Windows now supports NVIDIA GPUs within WSL, which I am using. So if you have any tests I could run, just let me know.
@derritter88, did you get it working?
I run NC in Docker and have been able to give containers access to the GPU, e.g. with this TensorFlow example:
docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
The NVIDIA samples give similar results:
# docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Maxwell" with compute capability 5.0
> Compute 5.0 CUDA device: [NVIDIA GeForce GTX 960M]
5120 bodies, total time for 10 iterations: 6.155 ms
= 42.591 billion interactions per second
= 851.816 single-precision GFLOP/s at 20 flops per interaction
I have a big archive of photos to process, and running it all on the CPU is not practical.
Thanks for any hints on how to get it working. I'm not shy about customizing the NC container or whatever else is needed.
Hello @bugsyb ,
Thanks for sharing this with me/us. It might be useful information for some people, but unfortunately I do not run Nextcloud as a Docker container. I "just" have a regular dedicated Nextcloud VM. I also had around 100k photos/images to classify, but my CPU handled that over the last couple of weeks.
I had some discussions with @marcelklehr about it, and the major problem would be finding an AI library like TensorFlow that can handle both Nvidia and AMD GPUs.
Hi @derritter88 ,
Thanks for the swift response.
I took a quick look at what gets installed as part of Recognize, and it looks like tensorflow-webgl is included.
There is also a flag in the code which suggests it should be possible even today:
process.env.RECOGNIZE_GPU
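To illustrate what such a flag could do, here is a rough sketch of an environment-controlled backend choice with a fallback. Only the variable name RECOGNIZE_GPU comes from the code mentioned above. The accepted values ('true' or '1'), the helper names `wantsGpu` and `loadTensorflow`, and the fallback order are assumptions and may not match what Recognize actually does.

```javascript
// Hypothetical sketch, not Recognize's real logic.
// Treats RECOGNIZE_GPU=true or RECOGNIZE_GPU=1 as "use the GPU build" (an assumption).
function wantsGpu(env = process.env) {
  const raw = String(env.RECOGNIZE_GPU || '').trim().toLowerCase();
  return raw === 'true' || raw === '1';
}

// Tries the preferred packages in order and returns the first one that loads.
// `load` defaults to require; it is a parameter so the logic can be exercised
// without TensorFlow installed.
function loadTensorflow(env = process.env, load = require) {
  const order = wantsGpu(env)
    ? ['@tensorflow/tfjs-node-gpu', '@tensorflow/tfjs-node']
    : ['@tensorflow/tfjs-node'];
  for (const name of order) {
    try {
      return { name, tf: load(name) };
    } catch (err) {
      if (err.code !== 'MODULE_NOT_FOUND') throw err;
    }
  }
  return { name: null, tf: null }; // nothing could be loaded
}

module.exports = { wantsGpu, loadTensorflow };
```

With a build that does not ship '@tensorflow/tfjs-node-gpu', a flag like this would have no effect beyond falling through to the CPU package, which is consistent with the MODULE_NOT_FOUND error earlier in the thread.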
I was hoping that, given your earlier engagement, you'd know how to get Recognize to use the GPU.
I also have a large number of photos to process and hoped I could leverage a GPU that is otherwise wasted.
I run most apps as containers these days, for simplicity, dependency handling, and ease of portability between systems. Happy to share knowledge on the side if you're interested.
Regarding Nvidia and AMD GPUs: TensorFlow can run on both, natively as well as in a container, as demonstrated for Nvidia.
Here are some explanations covering AMD:
https://community.amd.com/t5/hsa/tensorflow-with-amd-gpu/td-p/199925
https://medium.com/analytics-vidhya/install-tensorflow-2-for-amd-gpus-87e8d7aeb812
https://www.amd.com/en/technologies/infinity-hub/tensorflow
https://tealfeed.com/install-tensorflow-gpu-amd-gpus-vbs7s
There is also another implementation, DirectML, though from what I read online it targets Windows and WSL, so it might not be usable on standard Linux (I'm not sure about the latter, though).
If we could get started with Nvidia, which is more popular among people who would use it on Linux (not so much for gaming ;) ), it would be great, especially as TensorFlow is already available.
I can't help much with AMD as I don't have one.
To be honest, I gave up on this topic and passed my GPU to a Plex VM for video transcoding, but maybe @marcelklehr could improve the general logic of Recognize?
I have an AMD gpu in a laptop that I use for nextcloud
AMD GPUs probably won't work anyways.