Erfan Zare Chavoshi

Results 26 comments of Erfan Zare Chavoshi

I have implemented a version of that, but I haven't checked it yet. I used the same architecture as EasyLM in some parts: https://github.com/erfanzar/EasyDeL/blob/main/EasyDel/modules/llama/modelling_llama_flax.py

Is it available now? I mean, there's still no way to leave a team. I think it's kind of funny that such a basic feature is not supported yet.

Orbax does not support streaming loads or sharding data or arrays across devices with pjit, so I think the current checkpointing method that is being used right now is a...

Is it possible to share the weights and state with me, so I can debug that and fix the issue? Anyway, that's the first time I've seen an issue like that i...

This issue might be fixed due to recent changes and bug fixes in fjformer over the past few days.

Hi, and thanks for using EasyDeL. Actually, I'm creating that and mostly focusing on CPU and GPU, so I forgot to test it on TPUs ... I'll fix that soon,...

I have fixed the issue related to shmap and xmap ..., but some custom kernels are still not supported or compute incorrect results on TPUv3, and Pallas flash attention can...

Use 1,-1,1,1, which is the best sharding case, or write custom sharding methods and use FSDP on every layer; that's easier.
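
For anyone unsure what 1,-1,1,1 means: a rough sketch of how a -1 entry in the sharding dims could be resolved, the same way a -1 works in a reshape. The helper `resolve_axis_dims` and the axis names `("dp", "fsdp", "tp", "sp")` are assumptions for illustration, not taken from the comment or confirmed as the library's API.

```python
from math import prod


def resolve_axis_dims(dims, n_devices):
    """Replace a single -1 in `dims` with the factor that uses all devices.

    Hypothetical helper for illustration only.
    """
    if sum(1 for d in dims if d == -1) > 1:
        raise ValueError("at most one axis may be -1")
    # Product of the explicitly given axis sizes.
    known = prod(d for d in dims if d != -1)
    if known <= 0 or n_devices % known != 0:
        raise ValueError(f"{n_devices} devices cannot be split as {dims}")
    if -1 not in dims:
        if known != n_devices:
            raise ValueError(f"{dims} does not use exactly {n_devices} devices")
        return tuple(dims)
    # The -1 axis absorbs every device not claimed by the other axes.
    return tuple(n_devices // known if d == -1 else d for d in dims)


# Assumed axis order: data parallel, FSDP, tensor parallel, sequence parallel.
AXIS_NAMES = ("dp", "fsdp", "tp", "sp")

shape = resolve_axis_dims((1, -1, 1, 1), n_devices=8)
print(dict(zip(AXIS_NAMES, shape)))  # {'dp': 1, 'fsdp': 8, 'tp': 1, 'sp': 1}
```

Under these assumptions, 1,-1,1,1 puts every available device on the second axis, so on 8 devices the parameters would be sharded 8 ways along that axis and not split along the others.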

Hi @davisyoshida, I hope you're doing well! I wanted to reach out regarding **qax** usage in **fjformer** for quantization workflows within the EasyDel project. After the 0.4.35 release, I noticed...