Michael Voznesensky
I'm a noob here, so please disregard readily and with prejudice if what I write does not jibe with the spirit of the project. It feels like localtime_thread_safe is a...
> This sounds great! I was also wondering how fast it'd be if [Triton's flash attention](https://github.com/openai/triton/blob/master/python/tutorials/06-fused-attention.py) was integrated, but unfortunately it's A100 only.
>
> Implementation-wise, I think we could...
> do you have more benchmarks? for example, on cpu.

No, I am sorry, I do not. I plan on working with @jongwook to benchmark this properly :)
This is the RFC. The implementation PR will be here: https://github.com/openai/whisper/pull/115
> @voznesenskym I am curious why you guys started with "torchdynamo" instead of more widely-adopted "torchscript". We are in the process of making this torch.jit compatible, so I was wondering...
> @voznesenskym I am trying to benchmark your approach with torchdynamo but got some module errors. Do you know which versions of torchinductor, torchdynamo and triton are used to make your...
> Can someone provide any info on how to actually use this change?
>
> My guess is that you have to apply TorchDynamo to Whisper somehow, but not sure...
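For the usage question above, a minimal sketch of applying TorchDynamo to a module may help. This assumes a PyTorch 2.x install, where TorchDynamo is exposed through `torch.compile`; the `ToyEncoder` below is a hypothetical stand-in for Whisper's real encoder, not the actual model, and the `backend="eager"` choice is just to keep the sketch runnable without a compiler toolchain:

```python
import torch

# Hypothetical stand-in for Whisper's encoder (not the real model).
class ToyEncoder(torch.nn.Module):
    def forward(self, x):
        # A small attention-like computation so Dynamo has a graph to capture.
        return torch.softmax(x @ x.transpose(-1, -2), dim=-1) @ x

model = ToyEncoder()
# torch.compile is the PyTorch 2.x entry point to TorchDynamo.
# backend="eager" captures the graph but runs it eagerly; drop the argument
# (or use backend="inductor") to get actual TorchInductor compilation.
compiled = torch.compile(model, backend="eager")

x = torch.randn(2, 16, 8)
out_eager = model(x)
out_compiled = compiled(x)
# The compiled module should produce the same result as eager mode.
print(torch.allclose(out_eager, out_compiled, atol=1e-6))
```

In real use one would wrap `whisper`'s model (or its encoder/decoder submodules) the same way rather than a toy module.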
@pytorchbot merge
> @voznesenskym is there a test forthcoming or should we land this?

I was torn on it - I had a test but it was ugly and felt like it...
@pytorchbot rebase