Andy Ehrenberg comments

Results: 12 comments of Andy Ehrenberg

Flax Whisper uses a lot of GPU memory

Some of the extra GPU memory can probably be attributed to how the Flax generation code implements the KV cache. Check what happens when you set `max_new_tokens` to be...
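A rough way to see why the generation length could matter: if the key/value cache is pre-allocated for the full generation length (an assumption about the implementation, which is common for static-shape JAX code), its size grows linearly with that length. The helper `kv_cache_bytes` and the decoder dimensions below are hypothetical, chosen only for illustration, and the estimate covers self-attention keys and values only.

```python
def kv_cache_bytes(batch_size, num_layers, num_heads, head_dim,
                   max_length, bytes_per_value=4):
    """Estimate bytes for a self-attention KV cache pre-allocated to max_length.

    Assumes one key tensor and one value tensor per layer, each of shape
    (batch_size, max_length, num_heads, head_dim). This is a back-of-the-envelope
    sketch, not a measurement of any particular library.
    """
    per_layer = 2 * batch_size * max_length * num_heads * head_dim * bytes_per_value
    return num_layers * per_layer


# Hypothetical decoder dimensions, for illustration only.
config = dict(batch_size=1, num_layers=32, num_heads=20, head_dim=64)

for max_length in (32, 128, 448):
    mib = kv_cache_bytes(max_length=max_length, **config) / 2**20
    print(f"max_length={max_length:4d}: {mib:.1f} MiB")
```

Comparing the measured GPU memory at a few generation lengths against an estimate like this should show how much of the usage the cache can plausibly explain.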

Flax Whisper uses a lot of GPU memory

Also, it doesn't make sense to run the Flax code within a `torch.no_grad()` context. JAX does not use PyTorch's autograd, so the context has no effect on it.
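To illustrate the point, here is a minimal JAX sketch (the function `f` is a made-up example, not part of the Whisper code): a plain call is just a forward pass, and gradients are only computed when explicitly requested through `jax.grad`, so there is nothing for a no-grad context to switch off.

```python
import jax
import jax.numpy as jnp


def f(x):
    # Toy function standing in for a model forward pass.
    return jnp.sum(x ** 2)


x = jnp.arange(3.0)

# A plain call computes only the forward result.
y = f(x)

# Gradients exist only when asked for via a function transformation.
g = jax.grad(f)(x)
```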