
Inference speed on CPU is very slow compared to GPU inference

Open vishal-nayak1 opened this issue 2 years ago • 2 comments

@gwkrsrch I have tried to run the inference script on CPU; the CPU inference time is very high compared to the GPU inference time. Can you fix this issue?

vishal-nayak1 avatar Oct 04 '22 12:10 vishal-nayak1

Is there any solution to this? @gwkrsrch, thank you

trikiamine23 avatar Oct 10 '22 09:10 trikiamine23

Hi guys, I've been playing around with the library. Looks quite interesting, great work @gwkrsrch. Just some notes here on the time taken. Please note that I used a Dell Inspiron 15 with 8 GB RAM, running the latest Linux Mint (21.0). Take the results with a pinch of salt, as my machine has the most basic configuration and is not GPU enabled.

  1. Without GPU and CUDA installed, it took approx. 4 hours to complete.
  2. With CUDA installed, the same tests completed within 1 hour.

So I wanted to ask: what is the minimum required in terms of GPU and RAM on my laptop to be able to train with a decent number of images (500 - 1000)? Thanks.
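For anyone comparing timings like the ones above, it is worth first confirming that PyTorch actually sees the GPU; otherwise everything silently runs on the CPU even after installing CUDA. A quick check (standard PyTorch calls, nothing Donut-specific):

```python
import torch

# CUDA version PyTorch was built against (None for CPU-only builds)
print(torch.version.cuda)

# True only if a usable GPU and driver are present
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # Name of the first visible GPU, e.g. "NVIDIA GeForce RTX 3060"
    print(torch.cuda.get_device_name(0))
```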

ghost avatar Oct 16 '22 11:10 ghost

Hi @dneemuth @trikiamine23 @vishal-nayak1 ,

I recently updated some lines to make CPU inference faster. It seems torch.bfloat16-related lines were the source of the issue. Please use the latest version. Hope this update helps :)
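For anyone reading along, a minimal sketch of the idea (illustrative only, not the actual Donut diff): pick the dtype per device, since bfloat16 kernels are slow or poorly optimized on many CPUs, while float32 is fast there.

```python
import torch

# Illustrative sketch, not the actual Donut patch: choose the dtype
# based on the device. bfloat16 is fine on recent GPUs but can be
# very slow on CPUs, where float32 is the safe default.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

# In practice you would do something like
#   model.to(device=device, dtype=dtype)
# before running inference. Here a tensor stands in for the model:
x = torch.randn(4, 4, device=device, dtype=dtype)
print(x.dtype)
```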

Feel free to reopen this or open another issue if you have anything new for sharing/debugging.

gwkrsrch avatar Nov 16 '22 13:11 gwkrsrch

@NielsRogge Can you please port these changes to the Transformers library as well?

Thanks

vishal-nayak1 avatar Nov 21 '22 09:11 vishal-nayak1

Hi @gwkrsrch,

Could you clarify which changes are necessary for fast CPU inference? Then I'll update the 🤗 model as well.

NielsRogge avatar Nov 21 '22 09:11 NielsRogge

Hi @gwkrsrch thank you very much for the changes.

I have noticed that things did not change on my end. Do you have any reference timings on CPU (before/after the change)?

For me: 300 dpi, 2 fields to extract, 16 CPUs = 10 seconds.

trikiamine23 avatar Dec 08 '22 10:12 trikiamine23

I have the same issue. An image of size 1658x2343 takes around 40 seconds to classify. I'm running on 8 CPUs...

HolzmanoLagrene avatar Dec 09 '22 21:12 HolzmanoLagrene