Olivia Lee
If I understand correctly, is there a way to define the behavior of a certain op during shape inference?
Thanks for the response. I think every op that may take a shape as input, where the output shape depends on that shape input, needs such consideration, and dynamic...
A shape tracer may be needed where both the dynamic symbolic shape and the value carried by a shape tensor are propagated; the dynamic symbolic shape is propagated and transformed in the exact...
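The tracer idea above could be sketched roughly as follows. Everything here is hypothetical, assumed names (`TensorInfo`, `trace_shape_op`, `trace_reshape`), not any real framework's API; the point is only that a tensor carries both its own symbolic shape and, when it is a shape tensor, the shape value it holds:

```python
# Hypothetical sketch of a shape tracer; class and function names are
# illustrative, not from any real shape-inference framework.
from dataclasses import dataclass
from typing import List, Optional, Union

Dim = Union[int, str]  # a str stands for a symbolic dimension, e.g. "batch"

@dataclass
class TensorInfo:
    shape: List[Dim]                         # symbolic shape of the tensor itself
    shape_value: Optional[List[Dim]] = None  # if this tensor holds a shape, its value

def trace_shape_op(x: TensorInfo) -> TensorInfo:
    # A Shape-like op: the output is a 1-D tensor whose *value* is x's shape,
    # so we propagate that value alongside the (static) 1-D shape.
    return TensorInfo(shape=[len(x.shape)], shape_value=list(x.shape))

def trace_reshape(data: TensorInfo, new_shape: TensorInfo) -> TensorInfo:
    # A Reshape-like op consumes the propagated shape *value*, not merely
    # the shape of the shape tensor.
    assert new_shape.shape_value is not None, "need a propagated shape value"
    return TensorInfo(shape=list(new_shape.shape_value))

x = TensorInfo(shape=["batch", 8, 16])
s = trace_shape_op(x)       # s.shape_value == ["batch", 8, 16]
y = trace_reshape(x, s)
print(y.shape)              # ['batch', 8, 16]
```

With both channels propagated, the symbolic dim `"batch"` survives through the Shape → Reshape chain instead of collapsing to an unknown dimension.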
> BTW mistral will need a `SlidingWindowCache` based on the implementation of `RecurrentGemma`! From my understanding, SlidingWindowCache is for memory efficiency; actually I wonder how SlidingWindowCache would address the issue...
Hi Arthur, I have added support for SlidingWindowCache; please take a look at its implementation and also the `_update_causal_mask` implementation. I have added my thoughts as comments.
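For readers unfamiliar with the idea, here is a minimal sketch of what a sliding-window KV cache does; this is an illustrative toy (the class name and methods are invented here), not the actual transformers implementation. A fixed-size ring buffer keeps only the last `window` entries, which bounds memory and keeps shapes static, which is what a compiled static-cache path needs:

```python
# Toy sliding-window cache: a fixed-size ring buffer holding the most
# recent `window` entries. Names are illustrative, not the real API.
class SlidingWindowCacheSketch:
    def __init__(self, window: int):
        self.window = window
        self.buf = [None] * window  # fixed size -> static shape for compilation
        self.seen = 0               # total entries observed so far

    def update(self, key):
        # Write the new entry over the oldest slot.
        self.buf[self.seen % self.window] = key
        self.seen += 1
        if self.seen <= self.window:
            return self.buf[:self.seen]
        # Buffer full: the oldest entry sits at index seen % window,
        # so rotate the buffer to return entries oldest-to-newest.
        start = self.seen % self.window
        return self.buf[start:] + self.buf[:start]

cache = SlidingWindowCacheSketch(window=3)
for k in range(5):
    visible = cache.update(k)
print(visible)  # [2, 3, 4] -- only the last `window` entries remain
```

The real implementation operates on key/value tensors per layer and interacts with the causal mask, but the ring-buffer indexing is the core trick.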
> Good work > > * for all the `Copied from` comments that were removed, we need to use one of the models as the new base (Mixtral, for example) >...
> @ArthurZucker Don't forget our `run-slow` feature 🙏 > > @zhenglongjiepheonix Could you push an empty commit with message `[run-slow] mistral`? Thank you 🤗 The slow CI is failing on...
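The empty commit requested above would look like this (branch state and remote are assumed to be the contributor's PR branch):

```shell
# Trigger the slow CI for mistral via an empty commit with the magic message
git commit --allow-empty -m "[run-slow] mistral"
git push
```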
Currently there are some issues related to the Mistral tests. Since my dev environment is based on an A100, I ran these tests on a Colab T4 using the current main branch, @ArthurZucker @ydshieh...
> If except the 4 mentioned failing tests, all other are passing with this PR + `test_compile_static_cache` is passing on a A10 with torch 2.3, it's OK from my side...
> > > If except the 4 mentioned failing tests, all other are passing with this PR + `test_compile_static_cache` is passing on a A10 with torch 2.3, it's OK from...