Lap-Hang Ho
NB: word-level timestamps were added to openai/whisper last week. Tried it out; it seems to work. https://github.com/openai/whisper/commit/500d0fe9668fae5fe2af2b6a3c4950f8a29aa145
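For reference, a minimal sketch of how the new word-level timestamps can be used, assuming the `word_timestamps=True` flag on `transcribe` and the per-word `words` entries under each segment. The transcription call itself is commented out (it downloads a model); the sample `result` dict below is made-up illustrative data, not real output.

```python
# Hedged sketch: word-level timestamps in openai/whisper.
# The real call would be:
#   import whisper
#   model = whisper.load_model("base")
#   result = model.transcribe("audio.mp3", word_timestamps=True)

# Illustrative result shape (values are fabricated for demonstration):
result = {
    "segments": [
        {
            "text": " Hello world",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.42, "probability": 0.98},
                {"word": " world", "start": 0.42, "end": 0.81, "probability": 0.97},
            ],
        }
    ]
}

# Flatten per-word timings across all segments.
words = [
    (w["word"].strip(), w["start"], w["end"])
    for seg in result["segments"]
    for w in seg["words"]
]
print(words)  # [('Hello', 0.0, 0.42), ('world', 0.42, 0.81)]
```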
Thanks for the response, I'll keep an eye on the big-refactor getting merged into main!
Hi @danihinjos, no, I didn't end up implementing this unfortunately. I did try the real toxicity task, but found the Perspective API too slow. As an aside, I did gain...
Had the same issue for real-time endpoints; passing TS_MAX_RESPONSE_SIZE as an environment variable solved this as well.
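A minimal sketch of what that looks like, assuming the env var is passed through the model's `env` dict at deploy time. The 100 MB value, the S3 path, and the other constructor arguments are placeholders, not from the original thread; the deploy call is commented out since it needs AWS credentials.

```python
# Hedged sketch: raising TorchServe's max response size (in bytes) for a
# SageMaker real-time endpoint. 100 MB here is an assumed example value.
env = {"TS_MAX_RESPONSE_SIZE": str(100 * 1024 * 1024)}

# from sagemaker.pytorch import PyTorchModel
# model = PyTorchModel(
#     model_data="s3://my-bucket/model.tar.gz",  # hypothetical path
#     role=role,
#     entry_point="inference.py",
#     env=env,  # TorchServe picks up TS_MAX_RESPONSE_SIZE from the container env
# )
# predictor = model.deploy(initial_instance_count=1, instance_type="ml.g4dn.xlarge")

print(env["TS_MAX_RESPONSE_SIZE"])  # "104857600"
```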
@levatas-weston yes to the PyTorchModel, see the base class docs below. https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.model.Model
I'm getting this issue as well (trying QLoRA with ZeRO-3 and 4 GPUs, same error message). @Di-Zayn, were you able to solve it?
I was keen on sharding the model across GPUs to support larger models. As an aside, the latest FSDP and QLoRA examples are working...