Jules Gagnon-Marchand

Results: 42 comments of Jules Gagnon-Marchand

Added a test to make sure that there is still text to work on; otherwise it would crash.
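Roughly the shape of the fix, with `convert_segment` as a hypothetical stand-in for the real conversion routine:

```python
# Minimal sketch of the guard being tested: bail out early when no text
# remains, instead of letting the downstream parsing crash.
def convert_segment(text: str) -> str:
    if not text.strip():   # guard: nothing left to work on
        return text        # round-trip the input unchanged rather than crash
    # ... the real digit-conversion logic would go here ...
    return text


def test_empty_input_does_not_crash():
    # empty or whitespace-only input must come back safely
    assert convert_segment("") == ""
    assert convert_segment("   ") == "   "


test_empty_input_does_not_crash()
```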

Passes all tests. Pull request at https://github.com/ShailChoksi/text2digits/pull/48

`bad_words_ids` accepts n-grams; you could've just tokenized your rejected word list
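A sketch of what I mean, assuming a GPT-2 tokenizer (the relevant `generate()` argument is `bad_words_ids`):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Tokenize the rejected phrases directly; multi-token entries become the
# n-grams that `bad_words_ids` bans during generation.
rejected = ["bad phrase", "forbidden"]
bad_words_ids = [
    tokenizer(phrase, add_special_tokens=False).input_ids
    for phrase in rejected
]

# Then pass the list straight to generation, e.g.:
# model.generate(**inputs, bad_words_ids=bad_words_ids, max_new_tokens=20)
```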

did you look into using `transformers.Constraint`?

Yes, people seem to usually just use different heads.

I'm trying to get `google/flan-t5-xxl` to run on a single A100 80 GB GPU, for a seq2seq policy. Is there already a way to set the precision to bfloat16? (I don't see...
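For reference, a minimal sketch of what I'm after, shown with the small checkpoint so it's cheap to run; the same `torch_dtype` argument should apply to the xxl checkpoint, which then fits in 80 GB:

```python
import torch
from transformers import AutoModelForSeq2SeqLM

# Load the weights directly in bfloat16; for "google/flan-t5-xxl" the same
# call (optionally with device_map="auto") keeps the ~11B parameters small
# enough for a single A100 80 GB.
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-small",
    torch_dtype=torch.bfloat16,
)
```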

Enabling offloading of a model from GPU memory to CPU memory when it isn't in use would likely be helpful too.
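A manual sketch of that pattern in plain torch (accelerate provides this properly; `on_device` is a hypothetical helper just to show the idea):

```python
import contextlib
import torch

@contextlib.contextmanager
def on_device(module: torch.nn.Module, execution_device, rest_device="cpu"):
    """Keep `module` on `rest_device` and move it to `execution_device`
    only for the duration of the with-block (a manual offload sketch)."""
    module.to(execution_device)
    try:
        yield module
    finally:
        module.to(rest_device)

layer = torch.nn.Linear(4, 4)  # lives on CPU while idle
device = "cuda:0" if torch.cuda.is_available() else "cpu"
with on_device(layer, device) as l:
    out = l(torch.randn(2, 4, device=device))
# after the block, the layer's weights are back on CPU
```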

@gabrielhuang have you started doing work like this? (I'm also at Mila)

This is indeed my current approach: just allowing the user to pass kwargs for `from_pretrained` and `Linear`. Passing `torch_dtype` to `from_pretrained` and `dtype` to `Linear` works. I suppose adding amp...
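A minimal sketch of the pass-through on the `Linear` side, with `build_head` as a hypothetical wrapper:

```python
import torch
from torch.nn import Linear

def build_head(in_features: int, out_features: int, **linear_kwargs):
    # User-supplied kwargs go straight to the constructor, so precision is
    # controlled in one place, e.g. linear_kwargs = {"dtype": torch.bfloat16};
    # the same idea applies to forwarding kwargs into from_pretrained.
    return Linear(in_features, out_features, **linear_kwargs)

head = build_head(8, 2, dtype=torch.bfloat16)
```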

Looks like Stable Baselines 3 doesn't support bfloat16, because of all the `a_tensor_name.cpu().numpy()` calls. Indeed, doing that with a `bfloat16` tensor leads to an exception, because torch tries to build...
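A minimal repro of that exception, and the usual workaround of upcasting before crossing into NumPy:

```python
import torch

t = torch.ones(3, dtype=torch.bfloat16)

# NumPy has no bfloat16 dtype, so torch cannot build the array and raises
# a TypeError here.
try:
    t.cpu().numpy()
    bf16_numpy_ok = True
except TypeError:
    bf16_numpy_ok = False

# Workaround: upcast to float32 first, then convert.
arr = t.cpu().float().numpy()
```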