Poedator

Results 12 comments of Poedator

It looks like this issue was fixed [by this commit](https://github.com/slundberg/shap/commit/8065cf7f9239574eed1c7d0ab3b40d538d16608d) on Mar 22, 2022. It is not yet part of a release (as of 0.40.0). If necessary, just repeat this...

Hello, @caleb-artifact, and thank you for your interest in SpQR quantization! Most likely you encountered an excessive memory usage error that has since been fixed. I just re-tested it today. With PR...

Hello @ccccj, if you are focused on the best performance in some specific domain (presumably this is the reason for having your own dataset), then you may get...

As a solution, I added additional `expected_shapes` to `_ignore_causal_mask_sdpa()` and improved the StaticCache detection code. Note: it is inconvenient to have StaticCache as layer.self_attn objects and other Caches as model-level...

All CI tests are green, and the SLOW tests passed on my side yesterday.

I noticed that Mistral model support for 4D masks stayed broken after these fixes, so I added similar lines to `src/transformers/modeling_attn_mask_utils.py::_prepare_4d_causal_attention_mask_for_sdpa()`.

I added `Mask4DTestHard` tests (without the static cache part) to `tests/models/mistral/test_modeling_mistral.py` to ensure that 4D masks keep working in the models that use `_prepare_4d_causal_attention_mask_for_sdpa()`. These new tests would fail without...

> Let's remove unrelated changes!

Sorry, but without these changes, the fixes and tests will not work. I looked for related PRs; all I found was #30476 but it is...

I tried to follow Arthur's advice to streamline the path for the 4D masks, and it seems to work. The relevant tests pass. @ArthurZucker @gante, please review.

I combined the two tests from common, which were very similar, and added a tolerance; Mixtral now passes. @ArthurZucker, @gante, please see if it is good to merge...