Jimin Ha
> @jiminha Can you rebase on main and fix the merge conflicts?

Done. Please review.
> Is this compatible only with 1.16? When running
>
> ```
> QUANT_CONFIG=./quantization_config/maxabs_quant.json python run_generation.py --model_name_or_path mistralai/Mistral-7B-Instruct-v0.2 --attn_softmax_bf16 --use_hpu_graphs --trim_logits --use_kv_cache --reuse_cache --bf16 --batch_size 4 --max_new_tokens 512 --max_input_tokens 32000...
> ```
@regisss I forgot to remove some outdated instructions. I removed only the FusedSDPA option from my code and kept the flash_attention option (which shows the best performance). Try this command: python run_generation.py...
> LGTM!
>
> The generations with causal mask look a bit off to me at the beginning (right after the input):
>
> ```
> input 1: ('DeepSpeed is...
> ```
@cfgfung Your result looks good. Could you also port the relevant test cases for your model and make sure they pass? Port them from https://github.com/huggingface/transformers/blob/main/tests/models/clipseg to https://github.com/huggingface/optimum-habana/tree/main/tests/transformers/tests/models
> @cfgfung Your result looks good. Could you also port the relevant test cases for your model and make sure they pass? Port them from https://github.com/huggingface/transformers/blob/main/tests/models/clipseg to https://github.com/huggingface/optimum-habana/tree/main/tests/transformers/tests/models

Actually, above model test cases...
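If the porting route does end up being needed, here is a rough sketch of the mechanical steps, assuming local checkouts of both repos; the copied files usually also need Gaudi-specific adjustments (imports, device handling), so treat this as a starting point rather than a complete recipe:

```bash
# Copy the upstream CLIPSeg test folder into the optimum-habana test tree
# (paths are the ones from the comment above; adjust to your local checkouts):
cp -r transformers/tests/models/clipseg \
      optimum-habana/tests/transformers/tests/models/clipseg

# Run the ported tests on a Gaudi machine to confirm they pass:
cd optimum-habana
python -m pytest tests/transformers/tests/models/clipseg -v
```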
@cfgfung all looks good. Could you please make sure you run `make style` and upload the fix? I still see many style errors.
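For reference, a minimal sketch of the usual workflow, run from the root of your optimum-habana checkout (the exact style dependencies are defined in the repo's Makefile/setup, so check there if the command complains about missing tools):

```bash
# Auto-format the code base, then commit and push the resulting changes:
make style
git commit -am "Apply make style fixes"
git push
```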
Could you provide performance/accuracy data comparing GPU vs HPU for both cases that you added? Also, could you add test cases in tests/?
@sywangyi for zero_shot_eval, what are the criteria for passing?
@cfgfung Your code looks good. Can you just rename your test file to something like test_image_segmentation.py? Also, please make sure your test runs in CI: https://github.com/huggingface/optimum-habana/blob/8786b7592c58d394f9460710415de3c08775b1b6/Makefile#L45
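As a quick local sanity check before relying on CI (the path under tests/ is an assumption based on where the other example tests live; the linked Makefile line shows the exact target CI invokes):

```bash
# Run the renamed test file directly to confirm it is discovered and passes:
python -m pytest tests/test_image_segmentation.py -v
```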