Olivia Lee
This is the training loss of the last step.
I think some ops should propagate their result value, rather than just the shape of their result, so that subsequent ops can work correctly during shape inference; for example, consider the...
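For illustration, here is a minimal, hypothetical sketch (not this PR's code) of why value propagation matters: if a `Shape` op only reports the shape of its output, a downstream `Reshape` that consumes that output cannot be inferred. `Node`, `infer_shapes`, and the op names are all illustrative.

```python
from collections import namedtuple

# Hypothetical IR node: a name, an op kind, and argument names.
Node = namedtuple("Node", ["name", "op", "args"])

def infer_shapes(graph, input_shape):
    shapes = {"input": input_shape}   # inferred shape per node
    values = {}                       # concrete value per node, when known
    for node in graph:
        if node.op == "Shape":
            src = shapes[node.args[0]]
            shapes[node.name] = (len(src),)  # the output is a 1-D int tensor
            values[node.name] = src          # propagate the value, not just its shape
        elif node.op == "Reshape":
            target = values.get(node.args[1])
            if target is None:
                raise ValueError(f"cannot infer Reshape: value of {node.args[1]} unknown")
            shapes[node.name] = target
    return shapes

# Shape feeds Reshape: inference succeeds only because Shape's *value* was kept.
graph = [Node("s", "Shape", ["input"]), Node("r", "Reshape", ["input", "s"])]
print(infer_shapes(graph, (2, 3)))  # {'input': (2, 3), 's': (2,), 'r': (2, 3)}
```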
As the title suggests, this PR attempts to add torch.compile support for Mistral. It is not ready to merge; it tries to replicate what has been done for Llama...
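As a hedged usage sketch of what such support enables (the model id is illustrative, and the static-cache setting follows the pattern used for Llama in transformers, which may differ from this PR's final API):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# A static KV cache keeps tensor shapes fixed so the compiled graph can be reused.
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tok("Hello", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=8)
print(tok.decode(out[0]))
```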
Hey, I am using model.save as you mentioned so that I could get a .pb file, but it turns out that I only get a file without any suffix, which is...
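For context, a minimal sketch of the likely situation, assuming TF 2.x with legacy Keras: `model.save` with a bare path writes the SavedModel format, which is a directory whose graph is stored in a `saved_model.pb` file inside it, rather than a single `.pb` file.

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.build(input_shape=(None, 4))

# A bare path produces a SavedModel *directory*; the protobuf graph is the
# saved_model.pb file inside it, not a standalone .pb at the given path.
model.save("exported_model")
# exported_model/
#   saved_model.pb
#   variables/...
```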
This PR is a work in progress; it tries to add torch.compile support for Mixtral. It currently also contains changes from #30642 because there is some common ground shared...
This PR fixes a scenario where we want to use dynamo tracing in training mode: the current attention-mask-ignore logic creates a problem, because the data-dependent branch condition `torch.all(attn_mask==1)` will...
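A minimal sketch of the problem pattern and one trace-friendly rewrite (illustrative, not the exact PR code):

```python
import torch

# Problematic pattern: the Python `if` needs a concrete bool, so dynamo must
# evaluate a tensor value at trace time (a graph break, or an error under fullgraph=True).
def attention(x, attn_mask):
    if torch.all(attn_mask == 1):   # data-dependent branch
        attn_mask = None            # mask-ignore fast path
    return x if attn_mask is None else x.masked_fill(attn_mask == 0, 0.0)

# One trace-friendly rewrite: apply the mask unconditionally, so the traced
# graph contains no branch that depends on tensor *values*.
def attention_traceable(x, attn_mask):
    return x.masked_fill(attn_mask == 0, 0.0)

compiled = torch.compile(attention_traceable, fullgraph=True)
x = torch.randn(2, 4)
mask = torch.ones(2, 4, dtype=torch.long)
print(compiled(x, mask))
```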
The parameter cache instance is needed to handle recompilation, where we need to make sure the parameters created in the first run are reused; currently the use case does...
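A minimal, hypothetical sketch of such a cache (the names are illustrative, not this codebase's API): parameters are looked up by key instead of being re-created, so a recompilation keeps training the tensors from the first run.

```python
import torch

class ParamCache:
    def __init__(self):
        self._params = {}

    def get_or_create(self, key, factory):
        # Reuse the parameter created on the first compilation pass, if any.
        if key not in self._params:
            self._params[key] = factory()
        return self._params[key]

cache = ParamCache()
w1 = cache.get_or_create("layer0.weight", lambda: torch.nn.Parameter(torch.randn(4, 4)))
w2 = cache.get_or_create("layer0.weight", lambda: torch.nn.Parameter(torch.randn(4, 4)))
assert w1 is w2  # the second "compilation" reuses the first run's parameter
```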
As per the title, this PR tries a more general approach rather than relying purely on human heuristics; basically, it uses the following steps to search for a possible parallelization strategy for...
# What does this PR do?
- [x] add backend abstraction
- [x] refactor the original pipeline flow to accommodate the potential needs of different backends
- [x] modify API so...