
Not sure if insightface is using CPU or GPU?

jiapei100 opened this issue 2 years ago • 13 comments

Hi, all:

I can run my test.py (using insightface) for real-time face swapping. However:

  • the swapping on the camera stream is extremely slow; it cannot even keep up with my mouth opening and closing
  • I ONLY installed onnxruntime-gpu
  • but when I run my code, I get Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}

as follows:

➜  python test.py 
~/.local/lib/python3.10/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
~/.local/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
~/.local/lib/python3.10/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
~/.local/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ~/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ~/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ~/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ~/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ~/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
inswapper-shape: [1, 3, 128, 128]
~/.local/lib/python3.10/site-packages/insightface/utils/transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
[ WARN:[email protected]] global cap_gstreamer.cpp:1728 open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
QFactoryLoader::QFactoryLoader() checking directory path "/usr/lib/x86_64-linux-gnu/qt6/plugins/platforms" ...
QFactoryLoader::QFactoryLoader() looking at "/usr/lib/x86_64-linux-gnu/qt6/plugins/platforms/libqeglfs.so"
Found metadata in lib /usr/lib/x86_64-linux-gnu/qt6/plugins/platforms/libqeglfs.so, metadata=
{
    "IID": "org.qt-project.Qt.QPA.QPlatformIntegrationFactoryInterface.5.3",
    "MetaData": {
        "Keys": [
            "eglfs"
        ]
    },
    "archreq": 0,
    "className": "QEglFSIntegrationPlugin",
    "debug": false,
    "version": 393728
}
......

I think the Applied providers should be TensorrtExecutionProvider or CUDAExecutionProvider, since I have so many providers available:

['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'MIGraphXExecutionProvider', 'ROCMExecutionProvider', 'OpenVINOExecutionProvider', 'DnnlExecutionProvider', 'TvmExecutionProvider', 'VitisAIExecutionProvider', 'QNNExecutionProvider', 'NnapiExecutionProvider', 'JsExecutionProvider', 'CoreMLExecutionProvider', 'ArmNNExecutionProvider', 'ACLExecutionProvider', 'DmlExecutionProvider', 'RknpuExecutionProvider', 'XnnpackExecutionProvider', 'CANNExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']

Why is the actual provider I obtained CPUExecutionProvider? Is this what makes my test.py so slow?

Cheers

jiapei100 avatar Aug 02 '23 04:08 jiapei100

did u specify providers ?

phineas-pta avatar Aug 02 '23 08:08 phineas-pta

@phineas-pta Nope... How? I mean: how, from within insightface?

jiapei100 avatar Aug 02 '23 15:08 jiapei100

import insightface

app = insightface.app.FaceAnalysis(name='buffalo_l', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

phineas-pta avatar Aug 02 '23 15:08 phineas-pta
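For context on why a requested CUDAExecutionProvider can still end up as CPU: ONNX Runtime keeps only the requested providers whose native libraries actually load, and always appends CPUExecutionProvider as a fallback, so a broken CUDA/cuDNN install silently degrades to CPU. The sketch below is a toy model of that resolution logic (the function name and logic are illustrative, not onnxruntime's real code):

```python
# Toy sketch (NOT onnxruntime's real implementation) of how the
# "Applied providers" list comes about: requested providers survive only
# if their runtime libraries load, and CPUExecutionProvider is always
# appended as the fallback of last resort.

def resolve_providers(requested, loadable):
    """Keep requested providers whose libraries load; always end with CPU."""
    applied = [p for p in requested if p in loadable]
    if "CPUExecutionProvider" not in applied:
        applied.append("CPUExecutionProvider")
    return applied

# CUDA libraries present -> GPU provider is applied first
print(resolve_providers(["CUDAExecutionProvider"],
                        {"CUDAExecutionProvider", "CPUExecutionProvider"}))

# CUDA libraries fail to load (wrong CUDA/cuDNN version) -> silent CPU fallback
print(resolve_providers(["CUDAExecutionProvider"],
                        {"CPUExecutionProvider"}))
```

The practical check is therefore the "Applied providers" line in the log: if CUDAExecutionProvider is missing there, the CUDA libraries did not load, regardless of what you passed in.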

@phineas-pta

Thank you ...

  1. But how can I test whether this particular provider is actually applied? After adding providers=['CUDAExecutionProvider'], I still get the following:
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/lvision/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/lvision/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/lvision/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/lvision/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/lvision/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}

How can I tell whether CPUExecutionProvider or CUDAExecutionProvider is actually being used?

  2. BTW, the speed of my program is still NOT fast enough; it still cannot keep up with my mouth even when I speak slowly. For now:
  • frame size: 640x480
  • the following time differences are in seconds:
...
Time difference is: 0.752118
Time difference is: 0.833245
Time difference is: 0.748658
Time difference is: 0.6013
Time difference is: 0.720522
Time difference is: 0.658189
Time difference is: 0.608403
Time difference is: 0.763706
Time difference is: 0.792851
Time difference is: 0.84137
...

jiapei100 avatar Aug 02 '23 15:08 jiapei100
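For reference, the per-frame times quoted above translate to well under 2 FPS; a tiny helper (ours, purely for illustration) makes that concrete:

```python
# Convert the per-frame times quoted above into an average frame rate.
frame_times = [0.752118, 0.833245, 0.748658, 0.6013, 0.720522,
               0.658189, 0.608403, 0.763706, 0.792851, 0.84137]

def mean_fps(times):
    """Average frames per second implied by a list of per-frame seconds."""
    return len(times) / sum(times)

print(f"{mean_fps(frame_times):.2f} FPS")  # -> 1.37 FPS
```

At roughly 1.4 FPS, lip movement cannot look continuous (that needs on the order of 15+ FPS), so even with a working CUDA provider a smaller det_size or frame skipping may be needed.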

well it means u didn't set up the env correctly: you need CUDA 11.8 (not 12 yet), remove any onnxruntime package and reinstall onnxruntime-gpu, and use a recent NVIDIA GPU

phineas-pta avatar Aug 02 '23 15:08 phineas-pta

@phineas-pta

hmmm... That's probably why. I ONLY installed CUDA 12.2 ...

jiapei100 avatar Aug 02 '23 15:08 jiapei100

the official onnxruntime-gpu wheels don't support CUDA 12 yet; if you want it you have to build from source

phineas-pta avatar Aug 02 '23 16:08 phineas-pta

onnxruntime-gpu now supports CUDA 12; just uninstall and reinstall the new one. You must install with --extra-index-url for it to work with CUDA 12:

pip uninstall onnxruntime-gpu onnxruntime
pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/

TheBlackHacker avatar Feb 06 '24 11:02 TheBlackHacker
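A note on why the uninstall step above removes both packages: the CPU wheel (onnxruntime) and the GPU wheel (onnxruntime-gpu) both install the same onnxruntime module, so having both in one environment can shadow the GPU build. A small illustrative check (the helper name is ours, not part of any library):

```python
# Illustrative check: detect whether both onnxruntime wheels are installed.
# The CPU and GPU wheels ship the same `onnxruntime` package directory, so
# a mixed install is a common cause of the CPU-only fallback.
from importlib.metadata import distributions

def conflicting_ort_wheels(installed_names):
    """Return the known onnxruntime distributions present in `installed_names`."""
    known = {"onnxruntime", "onnxruntime-gpu"}
    return sorted(known & set(installed_names))

installed = {d.metadata["Name"] for d in distributions() if d.metadata["Name"]}
print(conflicting_ort_wheels(installed))
```

If this prints both names, uninstall both and reinstall only onnxruntime-gpu.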

I can't run the model on GPU. When I run the code below, I get GPU as output:

import onnxruntime as ort
print(ort.get_device())

but when I run the code below, only the CPU is used.

Can you help me with that?

model = insightface.app.FaceAnalysis(providers=['CUDAExecutionProvider'])
model.prepare(ctx_id=1)  # ctx_id >= 0 selects a GPU; use ctx_id=-1 for CPU

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}

@phineas-pta

arad2022 avatar Feb 20 '24 06:02 arad2022

install cuda & cudnn properly

phineas-pta avatar Feb 20 '24 09:02 phineas-pta

@phineas-pta


My question is the same as yours. My current environment is cudnn=8.2.4, cuda=11.4, onnxruntime-gpu=1.12.


yyh59098 avatar Mar 17 '24 09:03 yyh59098