Per E Kummervold
Is it possible to train ALBERT from scratch in another language using a TPU v3 (128 GB)? Could you give an estimated training time? Days, weeks, months? What is a reasonable...
I am considering training ALBERT from scratch in another language on a single TPU v3 with 128 GB. I have a corpus of around 2B words. Would this be a sufficient corpus...
Great library. Is it also possible to use the library for dividing words into syllables?
Any chance you can add TPU support in the Colab? I think this is supported more or less out of the box now in the newest PyTorch Lightning versions.
I read your paper with great interest. You seem to have a lot of novel ideas about how to improve the pretraining. Some of the scores are really impressive. I...
Trying this on a TPU. While the transcription from YouTube seemed to be working fine when I first started the Gradio App, this seems to have stopped working. Anyone else...
The performance on long audio files is just fantastic. However, I want to process a large number of files
### Description Encountered this on Apple M2 Pro after following instructions from https://developer.apple.com/metal/jax/, and then trying to get https://github.com/sanchit-gandhi/whisper-jax to run. Steps for reproducing: * Compile and install metal jax...
When using SimpleT5, which optimizer is used: Adafactor or AdamW? Can this be changed?