Joseph Liba
Because of the Python GIL, the preprocessing doesn't make efficient use of all the CPU cores. By spawning the CPU tasks in their own process, you can get requests that happen...
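A minimal sketch of the idea in that comment, assuming the preprocessing is a pure CPU-bound function: it is handed to a pool of worker processes so the work is not serialized behind one interpreter's GIL. The names `preprocess` and `preprocess_batch` and the whitespace tokenization are hypothetical stand-ins, not the project's actual code.

```python
from concurrent.futures import ProcessPoolExecutor


def preprocess(text: str) -> list[str]:
    """Hypothetical CPU-bound step: normalize and whitespace-tokenize."""
    return text.strip().lower().split()


def preprocess_batch(texts: list[str], workers: int = 2) -> list[list[str]]:
    # Each worker is a separate process with its own interpreter and GIL,
    # so CPU-bound calls can run in parallel across cores.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(preprocess, texts))


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by re-importing
    # this module (for example, the "spawn" start method).
    print(preprocess_batch(["Hello World", "  Second Request  "]))
```

Threads would not help here because they share one GIL. Processes pay for pickling the inputs and outputs, so this is only worthwhile when the preprocessing cost clearly exceeds that overhead.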
I'm trying my AMD EPYC 7302 with NVIDIA A4000 and A5000 GPUs. One thing I notice is that with a single request, allowing both GPUs in the device list and...
## 🚀 Feature Request
Provide context to the input, but do not actually include the context in the translation. Useful for real-time translation applications. Also continue translations from context translation....