Google Colab - from_pretrained hang

Open asutermo opened this issue 1 year ago • 0 comments

Issue:

Colab always exits around the 'HEAD' request, even though I've pre-cached the models. I haven't debugged further yet, but the 'HEAD' line is the last thing logged; the process seems to end as soon as this request is sent.

[Dataset 0]
loading image sizes.
make buckets
number of images (including repeats) / 各bucketの画像枚数（繰り返し回数を含む）
bucket 0: resolution (320, 704), count: 60
bucket 1: resolution (384, 640), count: 190
bucket 2: resolution (448, 576), count: 420
bucket 3: resolution (512, 512), count: 70
bucket 4: resolution (576, 448), count: 55
bucket 5: resolution (640, 384), count: 195
bucket 6: resolution (704, 320), count: 10
mean ar error (without repeats): 0.06102849154163728
clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません
preparing accelerator
loading model for process 0/1
load Diffusers pretrained models: stabilityai/stable-diffusion-xl-base-1.0, variant=fp16
start sdxl
https://huggingface.co:443 "GET /api/models/stabilityai/stable-diffusion-xl-base-1.0 HTTP/1.1" 200 6426
https://huggingface.co:443 "HEAD /stabilityai/stable-diffusion-xl-base-1.0/resolve/main/model_index.json HTTP/1.1" 200 0

I added extra logging; it fails inside the 'from_pretrained' call ('start sdxl' is logged, 'finish sdxl' never is).

        logger.info(f"load Diffusers pretrained models: {name_or_path}, variant={variant}")
        try:
            try:
                logger.info('start sdxl')
                pipe = StableDiffusionXLPipeline.from_pretrained(
                    name_or_path, torch_dtype=model_dtype, variant=variant, tokenizer=None
                )
                logger.info('finish sdxl')
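
To narrow down where inside the call the process dies, the standard-library `faulthandler` module could be wrapped around it. The following is only a sketch: `call_with_diagnostics`, `dump_path`, and `timeout_s` are hypothetical names, not part of sd-scripts or diffusers.

```python
import faulthandler


def call_with_diagnostics(fn, *args, dump_path="faulthandler.log", timeout_s=300, **kwargs):
    """Call fn(*args, **kwargs) while faulthandler watches for crashes and stalls.

    Hypothetical helper for debugging; tracebacks are written to dump_path.
    """
    with open(dump_path, "w") as dump_file:
        # Write Python tracebacks for all threads on fatal signals (SIGSEGV, SIGABRT, ...).
        faulthandler.enable(file=dump_file, all_threads=True)
        # If the call is still running after timeout_s seconds, dump the stacks once
        # without terminating the process.
        faulthandler.dump_traceback_later(timeout_s, exit=False, file=dump_file)
        try:
            return fn(*args, **kwargs)
        finally:
            # Stop the watchdog and release the file before it is closed.
            faulthandler.cancel_dump_traceback_later()
            faulthandler.disable()
```

It would be used as `pipe = call_with_diagnostics(StableDiffusionXLPipeline.from_pretrained, name_or_path, torch_dtype=model_dtype, variant=variant, tokenizer=None)`. Note that SIGKILL cannot be intercepted, so if the process disappears and the dump file stays empty, that points toward an external kill (for example by the kernel's out-of-memory killer) rather than a Python-level error.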

What I've tried

No caching and just using sd-scripts
Pre-caching using diffusers
Pre-caching using huggingface-cli
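
To confirm that the pre-cached files are where the loader would look, the cache directory can be inspected directly. This is a sketch under assumptions: it relies on the Hugging Face hub cache layout `models--{org}--{name}/snapshots/{revision}/...`, it approximates the default cache location (it ignores `XDG_CACHE_HOME`), and `default_hub_cache` and `find_cached_files` are hypothetical helper names.

```python
import os
from pathlib import Path


def default_hub_cache() -> Path:
    """Approximate the hub cache location: HF_HUB_CACHE, else HF_HOME/hub,
    else ~/.cache/huggingface/hub (assumed default)."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(hf_home) / "hub"


def find_cached_files(repo_id, filename, cache_dir=None):
    """Return every cached copy of filename for repo_id, one per snapshot revision."""
    cache = Path(cache_dir) if cache_dir else default_hub_cache()
    repo_dir = cache / ("models--" + repo_id.replace("/", "--"))
    # Each downloaded revision has its own directory under snapshots/.
    return sorted(repo_dir.glob(f"snapshots/*/{filename}"))


if __name__ == "__main__":
    hits = find_cached_files("stabilityai/stable-diffusion-xl-base-1.0", "model_index.json")
    print(hits if hits else "model_index.json not found in the cache")
```

If the file is found, setting the environment variable `HF_HUB_OFFLINE=1` before launching should make the hub client skip network requests such as the 'HEAD' above; whether that changes the behaviour reported here is untested.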

Details:

Google Colab T4
Everything running in Conda
Pip: pip install diffusers==0.27.2 transformers==4.37.0 torch==2.3.0 torchvision torchaudio accelerate einops opencv-python voluptuous
Source Tag: v0.8.7

nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |

Jun 20 '24 22:06 asutermo