Expansion proposal of `diffusers-cli env`

tolgacangoz opened this issue · 20 comments

Coming from this discussion: #7345. Do you want me to add more fields or remove some of them? Also, I thought of adding the GPU's model if available, but I guess that would violate privacy, right?

`transformers-cli env`:

```text
An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to CPU.

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `transformers` version: 4.38.2
- Platform: Linux-6.5.0-26-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.21.4
- Safetensors version: 0.4.2
- Accelerate version: 0.28.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.2.1+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): 0.8.2 (CPU)
- Jax version: 0.4.25
- JaxLib version: 0.4.25
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
```

`huggingface-cli env`:

```text
Copy-and-paste the text below in your GitHub issue.

- huggingface_hub version: 0.21.4
- Platform: Linux-6.5.0-26-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /home/user_name/.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: cache
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.2.1
- Jinja2: 3.1.3
- Graphviz: N/A
- Pydot: N/A
- Pillow: 10.2.0
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: N/A
- aiohttp: 3.9.3
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /home/user_name/.cache/huggingface/hub
- HF_ASSETS_CACHE: /home/user_name/.cache/huggingface/assets
- HF_TOKEN_PATH: /home/user_name/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
```

@sayakpaul @yiyixuxu @DN6 @asomoza

tolgacangoz · Mar 20 '24
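
*(Aside: for readers unfamiliar with how these `env` commands are built, below is a rough, illustrative sketch of the general shape such a subcommand takes, loosely modeled on the pattern `transformers-cli env` follows. The function and field names are assumptions for illustration, not the actual `diffusers` implementation.)*

```python
# Illustrative sketch only; not the actual diffusers code.
import platform

from diffusers import __version__ as diffusers_version


def format_env_info() -> str:
    """Collect basic environment facts and render them as the bullet
    list that users paste into GitHub issues."""
    info = {
        "`diffusers` version": diffusers_version,
        "Platform": platform.platform(),
        "Python version": platform.python_version(),
    }
    return "\n".join(f"- {key}: {value}" for key, value in info.items())


if __name__ == "__main__":
    print("\nCopy-and-paste the text below in your GitHub issue.\n")
    print(format_env_info())
```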

Should I also add the GPU's model, if available, to detect this kind of situation: #2153? However, I am unsure whether this creates potential privacy issues.

tolgacangoz · Mar 21 '24

> Should I also add the GPU's model, if available, to detect this kind of situation: https://github.com/huggingface/diffusers/issues/2153? However, I am unsure whether this creates potential privacy issues.

What would help a lot is the amount of VRAM and maybe the architecture; the VRAM in particular would be good to know for OOM issues.

I don't think it has privacy issues, since the user has to willingly run the command and then post the information themselves; they can even edit out the parts they don't want to share.

asomoza · Mar 21 '24
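
*(Aside: a minimal sketch of how the GPU model and total VRAM can be queried with PyTorch, assuming a CUDA build; the helper name is made up for illustration.)*

```python
# Sketch: query the GPU model and total VRAM via PyTorch (CUDA only).
import torch


def describe_accelerator() -> str:
    """Return e.g. 'NVIDIA GeForce GTX 1650, 4096 MiB VRAM',
    or 'NA' on CPU-only machines."""
    if not torch.cuda.is_available():
        return "NA"
    props = torch.cuda.get_device_properties(0)  # first visible GPU
    vram_mib = props.total_memory // (1024**2)  # bytes -> MiB
    return f"{props.name}, {vram_mib} MiB VRAM"


print(f"- Accelerator: {describe_accelerator()}")
```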

> Should I also add the GPU's model, if available, to detect this kind of situation: https://github.com/huggingface/diffusers/issues/2153? However, I am unsure whether this creates potential privacy issues.

Agree with @asomoza's comment (https://github.com/huggingface/diffusers/pull/7403#issuecomment-2011821072) :)

sayakpaul · Mar 21 '24

~~I am still testing the Windows part.~~ Could anybody test the macOS part, if possible?

tolgacangoz · Mar 22 '24

Tested on my Mac M2:

```text
GPU Model: Apple M2 Max

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `diffusers` version: 0.28.0.dev0
- Platform: macOS-13.3-arm64-arm-64bit
- Running on a notebook?: No
- Running on Google Colab?: No
- Python version: 3.9.17
- PyTorch version (GPU?): 2.2.1 (False)
- Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
- Jax version: 0.4.13
- JaxLib version: 0.4.13
- Huggingface_hub version: 0.20.3
- Transformers version: 4.36.2
- Accelerate version: 0.25.0
- Accelerate config: not found
- PEFT version: 0.9.1.dev0
- Safetensors version: 0.3.1
- xFormers version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
```

Could it make sense to make `GPU Model: Apple M2 Max` a part of the bullets? And call it "Accelerator"?

sayakpaul · Mar 22 '24
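
*(Aside: on Apple Silicon there is no CUDA device to query, so an "Accelerator" line needs a separate branch. Below is a hedged sketch of one way this could be done via PyTorch's MPS backend and `sysctl`; the branching is illustrative, not necessarily what the PR does.)*

```python
# Sketch: one possible accelerator label covering CUDA and Apple Silicon.
import subprocess

import torch


def accelerator_label() -> str:
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        return f"{props.name}, {props.total_memory // 1024**2} MiB VRAM"
    if torch.backends.mps.is_available():
        # On macOS, the chip name (e.g. "Apple M2 Max") is exposed by sysctl.
        chip = subprocess.run(
            ["sysctl", "-n", "machdep.cpu.brand_string"],
            capture_output=True,
            text=True,
        ).stdout.strip()
        return chip or "Apple MPS"
    return "NA"


print(f"- Accelerator: {accelerator_label()}")
```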

I resolved the merge conflict. It is now ready for review. @sayakpaul @yiyixuxu @asomoza

tolgacangoz · Mar 26 '24

Would like to hear @BenjaminBossan's thoughts too.

sayakpaul · Mar 26 '24

I added the references. I actually wanted to think more about what else could be added; sorry for the long wait. I remember finding one more, but I later forgot it :laughing:. Anyway, I will resolve the current conversations.

tolgacangoz · Apr 10 '24

@standardAI Is this ready for review?

DN6 · Apr 22 '24

It will be by the end of the day, yes.

tolgacangoz · Apr 22 '24

Nice work, I like all the info it gives. I have one issue with the accelerate config:

[screenshot: `diffusers-cli env` reporting the accelerate config as not found]

It doesn't detect the config file, even though I created it with the default command:

[screenshot: the accelerate config created with the default command]

On the other hand, even if it worked, this is what I have:

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: 'NO'
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```

IMO it'll just clutter the copy-and-paste in the issues with something that is not relevant most of the time. If an issue has something to do with accelerate, we can ask for the config file, but I don't think it's a good idea to include it all the time.

asomoza · Apr 23 '24
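
*(Aside: for reference, a sketch of how the default accelerate config can be located and loaded; this mirrors the pattern `transformers-cli env` uses, but treat the exact import path as an assumption if your accelerate version differs.)*

```python
# Sketch: report the default accelerate config, or "not found" if absent.
import os

from accelerate.commands.config import default_config_file, load_config_from_file

accelerate_config = "not found"
if os.path.isfile(default_config_file):
    # default_config_file usually points at
    # ~/.cache/huggingface/accelerate/default_config.yaml
    accelerate_config = load_config_from_file(default_config_file).to_dict()

print(f"- Accelerate config: {accelerate_config}")
```

A config saved anywhere other than the default path would not be picked up by a check like this, which is consistent with the `--accelerate-config_file` flag mentioned below.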

I think I agree with @asomoza. Perhaps we should exclude the accelerate information from the CLI command for now. We can revisit adding it back if there's a pressing need.

DN6 · Apr 24 '24

Thank you for the feedback. Sorry, I didn't mention that one needed to pass the `--accelerate-config_file` flag with the location of the config file, or just "default_loc" for the default location. Anyway, I removed it.

tolgacangoz · Apr 24 '24

Thanks so much for all your invaluable feedback. This PR is ready for a final review now.

tolgacangoz · Apr 24 '24

Thanks for the approvals. Should we also add bitsandbytes and quanto?

tolgacangoz · May 06 '24

IMO bitsandbytes would be good. For all the new models that use a T5 text encoder, most people will need to run it in 8-bit or 4-bit, so some issues can come from this library, especially on Windows.

quanto in diffusers is still experimental, so probably not yet.

asomoza · May 06 '24
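
*(Aside: a minimal sketch of the usual pattern for reporting an optional dependency such as bitsandbytes without making it a hard requirement; standard library only, helper name made up for illustration.)*

```python
# Sketch: report an optional dependency's version, or a placeholder if absent.
from importlib.metadata import PackageNotFoundError, version


def optional_version(package: str) -> str:
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


print(f"- Bitsandbytes version: {optional_version('bitsandbytes')}")
```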

@standardAI Can we do a final review and merge? I think this is ready.

DN6 · May 07 '24

Yes, this is ready. I've only added bitsandbytes since the last approvals.

tolgacangoz · May 07 '24

The latest output:

```text
❯ diffusers-cli env
An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- 🤗 Diffusers version: 0.28.0.dev0
- Platform: Ubuntu 24.04 LTS - Linux-6.8.0-31-generic-x86_64-with-glibc2.39
- Running on a notebook?: No
- Running on Google Colab?: No
- Python version: 3.10.14
- PyTorch version (GPU?): 2.3.0+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): 0.8.3 (cpu)
- Jax version: 0.4.26
- JaxLib version: 0.4.26
- Huggingface_hub version: 0.23.0
- Transformers version: 4.40.2
- Accelerate version: 0.30.0
- PEFT version: 0.10.0
- Bitsandbytes version: 0.43.1
- Safetensors version: 0.4.3
- xFormers version: not installed
- Accelerator: NVIDIA GeForce GTX 1650, 4096 MiB VRAM
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
```

tolgacangoz · May 09 '24

@sayakpaul I think this is ready to merge now too, no?

yiyixuxu · May 13 '24

Thanks for merging!

tolgacangoz · May 14 '24