Libin Tang

Results 4 issues of Libin Tang

# What does this PR do? Mixtral-8x22b model loading needs to happen on the meta device due to host memory limitations. Fixes # (issue) ## Before submitting - [ ] This PR fixes...

run-test
synapse 1.16_dependency

# What does this PR do? Initial enablement of FP8 training with the Intel Gaudi Transformer Engine (ported from OHF #91). Only the linear layer is replaced with FP8. Fixes # (issue)...

# What does this PR do? 1. Add use_flash_attention, flash_attention_recompute, flash_attention_causal_mask 2. Add a mark step per decoder 3. Add FusedSDPA FP8 Fixes # (issue) ## Before submitting - [ ]...

# What does this PR do? Initial enablement for Mamba with static shapes. Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the...