Preetam Chhimpa

9 comments by Preetam Chhimpa

Quick update:
- `test_training_overfit` for BLT now passes locally with the overridden thresholds in `BltModelTest` (loss ~95% reduction, grad norm ~81% reduction), and generation overfits the fixed pattern.
- CI...
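For readers unfamiliar with how a "percent reduction" threshold check works, here is a minimal sketch. It is not the actual `test_training_overfit` implementation: the function names, the threshold defaults, and the sample numbers are all hypothetical and chosen only to reproduce reductions of roughly 95% and 81%.

```python
def relative_reduction(initial: float, final: float) -> float:
    """Fraction by which a metric dropped, e.g. 0.95 for a 95% reduction."""
    if initial <= 0:
        raise ValueError("initial value must be positive")
    return (initial - final) / initial


def overfit_check(
    initial_loss: float,
    final_loss: float,
    initial_grad_norm: float,
    final_grad_norm: float,
    loss_threshold: float = 0.90,       # hypothetical threshold, not from the PR
    grad_norm_threshold: float = 0.80,  # hypothetical threshold, not from the PR
) -> bool:
    """Return True when both metrics dropped by at least their thresholds."""
    loss_ok = relative_reduction(initial_loss, final_loss) >= loss_threshold
    grad_ok = relative_reduction(initial_grad_norm, final_grad_norm) >= grad_norm_threshold
    return loss_ok and grad_ok


# Illustrative numbers only: loss 4.0 -> 0.2, grad norm 1.0 -> 0.19.
passed = overfit_check(4.0, 0.2, 1.0, 0.19)
```

Overriding the thresholds per model, as described for `BltModelTest`, would amount to passing different values for the two threshold arguments in a sketch like this.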

> I found it weird that the generation is not working with use_cache=True. I think it is worth investigating why (cc: @itazap if you have time to guide @preetam1407 )...

> checking monday, it is weird to me that `make fixup` doesn't work as expected. You shouldn't have to add those placeholders to begin with

@ArthurZucker This is fixed now....

@3outeille, will be waiting for your review! I think we have resolved all the issues mentioned last week.

> alright, just last issue to address and it will be good to merge. Good job overall! 🚀

Hey @3outeille, could you please point me to the last remaining...

@3outeille, updated `tests/test_training_mixin.py` to set `use_cache=True` only for `model_type == "recurrent_gemma"`.
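A rough sketch of what such a condition could look like, for illustration only. The helper name `forward_kwargs_for` is hypothetical and is not taken from `tests/test_training_mixin.py`; the comment also does not say what other model types receive, so this sketch assumes they get `use_cache=False`.

```python
from types import SimpleNamespace


def forward_kwargs_for(config) -> dict:
    """Hypothetical helper: enable the cache only for recurrent_gemma.

    Assumption: every other model type gets use_cache=False.
    """
    use_cache = getattr(config, "model_type", None) == "recurrent_gemma"
    return {"use_cache": use_cache}


# Stand-in config objects; a real test would use the model's own config.
recurrent_kwargs = forward_kwargs_for(SimpleNamespace(model_type="recurrent_gemma"))
other_kwargs = forward_kwargs_for(SimpleNamespace(model_type="blt"))
```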

A few CI checks are still failing. The `CI tests_tokenization` failures look infra-related, similar to some earlier CI failures in this PR. I ran the failing tests locally, and they...

@3outeille, all requested changes are done. Whenever you get time, I’d appreciate you taking a look to merge it. Thanks a lot!