Matt
:relieved: 
I'm working on that and am _highly confident¹_ that a solution can be found. ¹ Not at all confident
Quick update - `tokenizer.convert_tokens_to_string()` is actually a general method that does what we want, but it has the problem that it strips prefix spaces in tokenizer classes where prepending prefix...
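To make the prefix-space problem concrete, here is a minimal toy sketch. It does not call the real `tokenizer.convert_tokens_to_string()`; the function `toy_convert_tokens_to_string`, its `strip_prefix_space` flag, and the `▁` space marker are assumptions chosen only to mimic the behaviour described above.

```python
# Toy stand-in for a SentencePiece-style tokenizer's token-to-string step.
# Everything here is hypothetical and for illustration only.
SPACE_MARKER = "\u2581"  # "▁", commonly used to mark a leading space


def toy_convert_tokens_to_string(tokens, strip_prefix_space=True):
    """Join tokens into text, optionally dropping one leading space."""
    text = "".join(tokens).replace(SPACE_MARKER, " ")
    if strip_prefix_space and text.startswith(" "):
        # Assumed behaviour: undo the space that was prepended when encoding.
        text = text[1:]
    return text


# For a whole sentence, stripping the prefix space is what we want.
full = toy_convert_tokens_to_string([SPACE_MARKER + "hello", SPACE_MARKER + "world"])

# For a single token, it hides the fact that the token starts with a space.
single = toy_convert_tokens_to_string([SPACE_MARKER + "world"])
single_kept = toy_convert_tokens_to_string(
    [SPACE_MARKER + "world"], strip_prefix_space=False
)

print(repr(full), repr(single), repr(single_kept))
```

In this sketch the single-token result no longer shows whether the token began with a space, which is the information lost when individual tokens are converted one at a time.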
@zucchini-nlp The condition now returns a per-sample vector correctly. Can I be lazy and ask you to add the test for it that @amyeroberts was requesting in #29116 here? If...
@zucchini-nlp you can just add the test to this PR's branch instead!
cc @amyeroberts @gante this PR now tests per-row stopping conditions from #29116, thanks to @zucchini-nlp. Tests are passing, so the feature looks good! I ran the slow tests locally as...
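As a rough illustration of what a per-row stopping condition means, here is a small pure-Python sketch. It is not the implementation in this PR or in #29116; the function name `per_row_stop` and the choice to check decoded strings with `endswith` are assumptions made for the example.

```python
from typing import List, Sequence


def per_row_stop(decoded_rows: Sequence[str], stop_strings: Sequence[str]) -> List[bool]:
    """Return one flag per batch row: True if that row ends with any stop string."""
    return [
        any(row.endswith(stop) for stop in stop_strings)
        for row in decoded_rows
    ]


# Hypothetical batch of partially generated texts.
batch = ["Hello world", "Hello there", "please stop"]
flags = per_row_stop(batch, stop_strings=["world", "stop"])
print(flags)
```

The point of the sketch is that the result has one entry per sample rather than a single flag for the whole batch, so rows that have already hit a stop string can finish while the others keep generating.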
cc @amyeroberts this should be ready for re-review! A quick summary of the changes: - The old method of hardcoding the removal of prefixes like `##` is gone - it...
No, thank you for all the patience fixing my horrifically verbose docstrings and incomprehensible tests, lol
It's on the to-do list, but I'm afraid there are competing priorities at the moment!
Hi @sanjeevk-os, I took a look at the ESM code - it looks like some of the support for gradient checkpointing is already there, in which case you...