
Anyone know how to set per_process_gpu_memory_fraction ?

Open austinksmith opened this issue 1 year ago • 1 comment

I want to set the following configuration option for TensorFlow. I forked this repo and can see that GPU options are being set in transcribe.py, which is fine. However, how do I compile from source after modifying these settings? I want to run 2 processes concurrently; I have enough VRAM if I cap each one at 8 GB.



# Assume that you have 12 GB of GPU memory and want to allocate ~4 GB:
import tensorflow as tf  # TF1-style API; on TF2, use tf.compat.v1

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
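For the two-process setup I describe above, the fraction is just the per-process cap divided by the card's total VRAM. A minimal sketch, assuming a 16 GB card (the card size and constant names here are hypothetical) and the same TF1-style API as the snippet above:

import tensorflow as tf  # TF1-style API, as in the snippet above

# Hypothetical numbers: a 16 GB card, two processes capped at 8 GB each.
TOTAL_VRAM_GB = 16
PER_PROCESS_GB = 8

fraction = PER_PROCESS_GB / TOTAL_VRAM_GB  # 0.5 for these numbers
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

Each process started with this config should then claim at most half of the device's memory, leaving room for the second one.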


I found this Discourse thread, which gives the code I need: https://discourse.mozilla.org/t/how-to-restrict-transcribe-py-from-consuming-whole-gpu-memory/75555/5

But how do I compile from source once I change the code?

austinksmith · May 16 '24 18:05

I figured this out by following a combination of the old docs for version 0.7.4 and the Read the Docs version. I had to modify TensorFlow directly; modifying DeepSpeech to accept this flag didn't actually work. I now have a compiled version that limits GPU utilization to 25% of available VRAM. I tested with a GTX 1050 Ti (4 GB) restricted to only 1 GB, and it was actually fairly quick. Faster than using 4 GB? I think it's because when I compiled natively, the build used some AVX-512 CPU extensions to speed things up.
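At the TensorFlow level, that 25% cap corresponds to a session config like the sketch below (TF1-style API via tf.compat.v1; shown for illustration only, since as noted above the limit had to be baked in by modifying TensorFlow itself rather than set from the Python side):

import tensorflow.compat.v1 as tfv1

# Cap this process at 25% of the device's VRAM. allow_growth is an
# optional extra that stops TensorFlow from reserving even that slice
# up front (my addition; not mentioned in the thread).
gpu_options = tfv1.GPUOptions(
    per_process_gpu_memory_fraction=0.25,
    allow_growth=True,
)
session = tfv1.Session(config=tfv1.ConfigProto(gpu_options=gpu_options))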

If anyone wants a copy of the modified DeepSpeech binary, let me know and I'll link it.

austinksmith · May 18 '24 22:05