Jules Gagnon-Marchand
Added a test to make sure that there is still text to work on; otherwise it would crash.
Passes all tests. Pull request at https://github.com/ShailChoksi/text2digits/pull/48
`bad_words_ids` accepts n-grams; you could have just tokenized your rejected word list.
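A minimal sketch of what that would look like. This deliberately avoids a real tokenizer: `toy_tokenize` is a hypothetical stand-in (a real version would call an HF tokenizer with `add_special_tokens=False`), but it shows the shape `bad_words_ids` expects, where multi-token phrases become n-grams:

```python
# Hypothetical stand-in for a real tokenizer: maps each word to a stable
# integer id, so multi-word phrases yield multi-token n-grams.
VOCAB = {}

def toy_tokenize(phrase):
    return [VOCAB.setdefault(word, len(VOCAB)) for word in phrase.split()]

rejected = ["bad phrase", "forbidden"]
# A list of token-id lists: single words give length-1 lists,
# multi-word phrases give longer ones (the n-gram case).
bad_words_ids = [toy_tokenize(phrase) for phrase in rejected]
# → [[0, 1], [2]]
```

With a real HF tokenizer the list comprehension stays the same; only the tokenize call changes.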
Did you look into using `transformers.Constraint`?
Yes, people usually seem to just use different heads.
I'm trying to get `google/flan-t5-xxl` to run on a single A100 80GB GPU for the seq2seq policy. Is there already a way to set the precision to bfloat16? (I don't see...
Enabling offloading of a model from GPU memory to CPU memory when it's not in use would likely help too.
@gabrielhuang have you started doing work like this? (I'm also at Mila)
This is my current approach, indeed: just allowing the user to pass kwargs for `from_pretrained` and `Linear`. Passing `torch_dtype` to `from_pretrained` and `dtype` to `Linear` works. I suppose adding amp...
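The kwargs-forwarding idea can be sketched like this. Everything here is hypothetical scaffolding, not an existing API: `loader` and `head_cls` stand in for something like `from_pretrained` and `torch.nn.Linear`, and the caller supplies the dtype kwargs:

```python
# Sketch: forward user-supplied kwargs to the model loader and the head
# constructor, so precision (e.g. bfloat16) is the caller's choice.
def build_policy(loader, head_cls, loader_kwargs=None, head_kwargs=None):
    # e.g. loader_kwargs={"torch_dtype": torch.bfloat16},
    #      head_kwargs={"dtype": torch.bfloat16}
    model = loader(**(loader_kwargs or {}))
    head = head_cls(**(head_kwargs or {}))
    return model, head
```

The design choice is just that the wrapper stays dtype-agnostic: it never hardcodes a precision, so the same code path covers fp32, fp16, and bf16.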
Looks like Stable Baselines3 doesn't support bfloat16, because of all the `a_tensor_name.cpu().numpy()` calls. Indeed, doing that with a `bfloat16` tensor raises an exception, because torch tries to build...