Sparsh Tewatia issues

Results: 3 issues of Sparsh Tewatia

[New Model]: Phi3ForCausalLM

### The model to consider. https://huggingface.co/microsoft/Phi-3-medium-128k-instruct I was trying to run the exl2 quants for these models, but I am getting an error at the rotary embedding. These models use two rope scaling...

Flash attention soft capping support

In the JAX experimental Pallas kernels for TPU, there is support for attention-logit soft-capping in paged attention but not in flash attention. If support can be added for pallas...

enhancement
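
For readers unfamiliar with the term, logit soft-capping is commonly defined as `cap * tanh(logits / cap)`, applied to the attention scores before the softmax. The sketch below is a plain, unfused JAX reference for single-head attention. It is not the Pallas kernel the issue asks for, and the function names (`softcap`, `attention_with_softcap`) and the default `cap=50.0` are illustrative assumptions, not part of any JAX API.

```python
import jax
import jax.numpy as jnp


def softcap(logits, cap):
    """Smoothly bound logits to the range [-cap, cap] via cap * tanh(logits / cap)."""
    return cap * jnp.tanh(logits / cap)


def attention_with_softcap(q, k, v, cap=50.0):
    """Unfused single-head attention with soft-capped logits.

    q: [q_len, head_dim]; k, v: [kv_len, head_dim].
    The cap value is a hypothetical default chosen for illustration.
    """
    scale = q.shape[-1] ** -0.5
    logits = (q @ k.T) * scale                  # [q_len, kv_len]
    logits = softcap(logits, cap)               # cap is applied before the softmax
    weights = jax.nn.softmax(logits, axis=-1)   # normalize over the key axis
    return weights @ v                          # [q_len, head_dim]
```

A fused flash-attention kernel would need to apply the same transform to each block of scores before its running softmax update, which is why the feature requires kernel-level support rather than a wrapper.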

How to do sequence classification training?

**Describe the bug** I want to train a reward model using EasyDeL with sequence classification. The classifier has been implemented in the Flax sequence classifier classes for each model, but...