Stas Bekman
@arthur-morgan-712, you have a problem with your CUDA environment:

```
/root/anaconda3/envs/bitten/lib/python3.8/site-packages/deepspeed/ops/csrc/includes/gemm_test.h:6:10: fatal error: cuda_profiler_api.h: No such file or directory
 #include
```

Properly install the CUDA environment, including all dev header...
Logan, did Deepspeed add a feature to allow users to configure a list of param names not to be sharded? If I'm not mistaken, based on Tim's OP, this is...
a gentle ping here, as the m4 group needs to have all official opt models to support fast tokenizers. Thank you, @ArthurZucker!
we can also re-do the tiny tokenizers if they don't conform with the needs of the CI.
Thank you very much for taking care of this, Arthur!
The fix offered by @mrelg is still needed to make this project work. Here is the patch version of the same as https://github.com/vaibhavk97/GoBooDo/issues/60#issuecomment-918563558

```diff
diff --git a/GoBooDo.py b/GoBooDo.py
index 0971a7d..2dab8f3...
```
ok, the Deepspeed CI is running pt-1.8 - how do we solve that then? I have passed this change to the Deepspeed team; let's see what they say. Edit: they...
Just read the LoRA paper, and your implementation combined with weight quantization is very neat, @deniskamazur. Thank you! A few comments: 1. If and when this is integrated into transformers...
> If this is not possible i hope it will be easy to detect that a model is the 8-bit variant, so we can avoid executing half() on the model....
> * I'd like to discuss if we actually need LoRa adapters in the possible implementation. As I see it, they are not necessarily a part of the 8bit model....
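For readers following this thread, the adapter math being debated is small enough to sketch. This is a hypothetical toy version of the LoRA update (plain NumPy, no relation to the actual bitsandbytes/transformers code): the frozen weight `W` stays untouched, and a trainable low-rank product `B @ A`, scaled by `alpha / r`, is added on top.

```python
import numpy as np

def lora_linear(x, W, A, B, alpha=1.0):
    """Toy LoRA forward pass: frozen weight W (which could be stored
    8-bit and dequantized on the fly) plus a low-rank update B @ A
    scaled by alpha / r."""
    r = A.shape[0]  # adapter rank
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 4, 2
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-init
x = rng.standard_normal((3, d_in))

# With B zero-initialized the adapter is a no-op, so the model starts
# out identical to the plain (possibly quantized) linear layer.
assert np.allclose(lora_linear(x, W, A, B), x @ W.T)
```

The zero-init of `B` is why the adapters can be treated as optional: with them absent (or zeroed), the 8-bit model behaves exactly like the base model, which supports the point that they need not be part of the 8-bit model itself.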