Cyril Vallez

63 comments by Cyril Vallez

Hi @ArthurZucker, thanks for the feedback! Unfortunately, I don't think we can handle all cases by keeping the same data structure. For example, if someone passes the `past_key_values` to `generate()`...

Hi @ArthurZucker, I am done with the work. Could you please review it? At this point, it should be 100% backward compatible. The only change is that now `past_key_values` will...

Hi @ArthurZucker, I was investigating why we observe those "minimal improvements" even with very large input sizes and just 2 new tokens. I found out that the reason was that...

> @Cyrilvallez kudos for this really high quality contribution 🤗 🚀 Your in-depth explanations are very useful, and all of it makes sense. I think at some point we did...

@ArthurZucker I agree that a new class would be clearer and more maintainable. Not sure what you meant by "that would be used upon activation" however? I would use the...

Ok, then by default we will still benefit from removing the leak of the logits, which is already a big gain. I will make the necessary changes next Monday 👌🏻

@ArthurZucker @gante I realized yesterday that what actually creates the copies is not the current `DynamicCache` itself, but the back-and-forth `from_legacy_cache` and `to_legacy_cache` calls (that create tuples that...

@ArthurZucker @gante The work is ready for final review! As mentioned previously, `EfficientDynamicCache` was not needed in the end. This makes the changes more natural based on the current state...

@ArthurZucker @gante I applied all changes following your comments! Repo consistency and code quality errors do not come from files I modified (I think code quality errors come from ruff...