Blake
It fixed the issue for you, but what does it do?
I had originally tried to print out a message but for whatever reason, it was not appearing. I may try to add this in the future. But otherwise, I would...
I have this same issue. I can do Lora/Dora, DDP Lora/Dora, QLora/QDora, DDP QLora/QDora, FSDP Lora/Dora, and FSDP QLora but FSDP QDora does not seem to be working.
This fixed the issue I was having, but when using DORA/QDora with FSDP it errors out: [rank0]: Traceback (most recent call last): [rank0]: File "trl_finetune.py", line 401, in [rank0]: trainer.train(resume_from_checkpoint=args.resume_from_checkpoint)...
Smooth quant seems broken as well.
I am using v0.7.1, the latest tag.
I am using this software as well as [tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend). I forget which project was having issues, but I was unable to build the docker image then. I will try again...
With DeepSpeed 0.8.2 JIT I get a new error: Setting pad_token_id to eos_token_id:50256 for open-end generation. !!!! kernel execution error. (m: 16384, n: 4, k: 4096, error: 13) !!!! kernel...
If I run the code with something like this, it seems to work:
```python
gpt_model.model = deepspeed.init_inference(gpt_model.model, mp_size=world_size, dtype=dtype, max_tokens=args.max_tokens)
```
Removing `replace_with_kernel_inject=True` seems to fix the issues...
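To make it easier to compare the two configurations, here is a minimal sketch that builds the keyword arguments for `deepspeed.init_inference` with kernel injection off by default. The helper name `build_init_inference_kwargs` and its `kernel_inject` flag are hypothetical and only for illustration; the argument names themselves are the ones used in the call above.

```python
def build_init_inference_kwargs(world_size, dtype, max_tokens, kernel_inject=False):
    """Collect keyword arguments for deepspeed.init_inference.

    Hypothetical helper: kernel injection is left off unless requested,
    mirroring the workaround described above.
    """
    kwargs = {
        "mp_size": world_size,     # model-parallel degree
        "dtype": dtype,            # passed through unchanged, e.g. torch.float16
        "max_tokens": max_tokens,  # token budget for the inference workspace
    }
    if kernel_inject:
        # Only add the flag when explicitly asked for.
        kwargs["replace_with_kernel_inject"] = True
    return kwargs


# Intended use (needs deepspeed and a loaded model, so it is left commented out):
# import deepspeed
# kwargs = build_init_inference_kwargs(world_size, dtype, args.max_tokens)
# gpt_model.model = deepspeed.init_inference(gpt_model.model, **kwargs)
```

Flipping `kernel_inject=True` reproduces the original call, so the same script can be used to check whether the kernel error is tied to injection.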
Just made an issue for this at #2955. I am pretty sure that bfloat16 is not currently supported. Float32, float16 and int8 are supported (though I have had issues with...