Logan Adams comments

Results: 294 comments of Logan Adams

[BUG]"deepspeed: command not found". Use a replicated Python environment

Hi @H9990HH969 - Your best bet in this case is probably to build the .whl from source, copy it to the machine without internet access, and then install it...

[BUG] export CUDA_VISIBLE_DEVICES=0,1,6,7 does not work

Hi @xu-song, I'll try to repro this and take a look.

[BUG] export CUDA_VISIBLE_DEVICES=0,1,6,7 does not work

At first glance, it looks like we should be handling that properly [here](https://github.com/microsoft/DeepSpeed/blob/8d53ac0cd3a708d1b9a4281b5e01044b5bab6d61/deepspeed/launcher/runner.py#L389). Could you try setting it this way and let me know if that works for you?...

[BUG] export CUDA_VISIBLE_DEVICES=0,1,6,7 does not work

@xu-song - thanks, I was able to repro and I'm taking a look at this.

[BUG] export CUDA_VISIBLE_DEVICES=0,1,6,7 does not work

@xu-song - if you are using the DeepSpeed launcher, this isn't supported, but you can specify the nodes this way: https://www.deepspeed.ai/getting-started/#resource-configuration-single-node ![image](https://user-images.githubusercontent.com/114770087/232091979-292ee07a-02f9-4b23-8f95-1022d0660ce4.png)
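
For illustration, the resource-configuration approach linked above can be sketched as a small Python helper that builds a launcher command line using the `--include` option described in the DeepSpeed getting-started guide. The helper names (`build_include_arg`, `build_launch_command`) and the script name `train.py` are hypothetical, and the sketch only assembles the command; it does not run it.

```python
def build_include_arg(gpu_ids, host="localhost"):
    """Format GPU indices as a value for the launcher's --include option.

    Hypothetical helper: produces a string such as "localhost:0,1,6,7".
    """
    if not gpu_ids:
        raise ValueError("at least one GPU index is required")
    return f"{host}:{','.join(str(i) for i in gpu_ids)}"


def build_launch_command(script, gpu_ids, host="localhost"):
    """Return the launcher invocation as an argument list (not executed here)."""
    return ["deepspeed", "--include", build_include_arg(gpu_ids, host), script]


# "train.py" is a placeholder for the user's own training script.
print(" ".join(build_launch_command("train.py", [0, 1, 6, 7])))
# prints: deepspeed --include localhost:0,1,6,7 train.py
```

The resulting argument list could be passed to `subprocess.run` or pasted into a shell in place of exporting `CUDA_VISIBLE_DEVICES`.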

Request to update flash_attention in deepspeed inference

Hi @bmedishe - there are two different triton versions in the requirements: one for sd, which is the one you note, and one for sparse_attn, which is 1.0.0. Could...

Request to update flash_attention in deepspeed inference

@bmedishe - thanks, still working on getting these updated, will update this thread when it is complete.

Request to update flash_attention in deepspeed inference

@bmedishe - unfortunately, we need this specific version for stable diffusion for now.

[BUG] cpu_adam warning

Hi @SkyAndCloud - the warning you are seeing comes from [here](https://github.com/microsoft/DeepSpeed/blob/master/op_builder/builder.py#L354). Specifically, it appears when the system-installed CUDA and the CUDA version torch was built with do not match each other. Are you running...
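
A quick way to check for this kind of mismatch by hand is to compare the release reported by `nvcc --version` with `torch.version.cuda`. The following is a minimal sketch, not the check DeepSpeed itself performs: the helper names are hypothetical, and it assumes the relevant `nvcc` is the one found on `PATH`, which may differ from the toolkit the build actually uses.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_version(version):
    """Turn a string such as "11.7" into (11, 7); None stays None."""
    if not version:
        return None
    parts = version.split(".")
    minor = int(parts[1]) if len(parts) > 1 else 0
    return int(parts[0]), minor


def system_cuda_version():
    """Assumption: the nvcc on PATH belongs to the system CUDA toolkit."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True,
                            text=True, check=False)
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version torch was built with; None if torch is missing or CPU-only."""
    try:
        import torch
    except ImportError:
        return None
    return parse_version(torch.version.cuda)


if __name__ == "__main__":
    system, torch_built = system_cuda_version(), torch_cuda_version()
    print(f"system CUDA: {system}, torch CUDA: {torch_built}")
    if system and torch_built and system != torch_built:
        print("Versions differ; this may be what triggers the warning.")
```

If the two tuples differ, or if more than one toolkit is installed, that is a reasonable first place to look.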

[BUG] cpu_adam warning

Are there other CUDA installations on the system? Something seems to be triggering that warning. Do you see either of the printouts from [this function](https://github.com/microsoft/DeepSpeed/blob/master/op_builder/builder.py#LL85C19-L85C19) in your code too?