
NameError: name 'loaded_model' is not defined, and FileNotFoundError: [Errno 2] No such file or directory: 'onnx/clip.onnx'

Open · dtheb opened this issue 2 years ago

Hi,

Trying to run accelerated SD 1.5 models, I get the error below. Running on Windows 11 under WSL, with an RTX 3070 8GB.

CMD:

docker run --gpus=all -v C:\voltaml\engine/engine:/workspace/voltaML-fast-stable-diffusion/engine -v C:\voltaml\output/engine:/workspace/voltaML-fast-stable-diffusion/static/output -p 5003:5003 -it voltaml/volta_diffusion_webui:v0.2
172.17.0.1 - - [18/Dec/2022 13:15:21] "POST /voltaml/job HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 661, in infer_trt
    if loaded_model!=args.model_path:
NameError: name 'loaded_model' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2548, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2528, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1822, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/workspace/voltaML-fast-stable-diffusion/app.py", line 88, in upload_file
    pipeline_time = infer_trt(saving_path=saving_path,
  File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 664, in infer_trt
    load_trt(saving_path, model, prompt, img_height, img_width, num_inference_steps)
  File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 599, in load_trt
    trt_model.loadEngines(engine_dir, onnx_dir, args.onnx_opset,
  File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 279, in loadEngines
    torch.onnx.export(model,
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/__init__.py", line 350, in export
    return utils.export(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 163, in export
    _export(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 1148, in _export
    with torch.serialization._open_file_like(f, "wb") as opened_file:
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'onnx/clip.onnx'

dtheb avatar Dec 18 '22 13:12 dtheb

You need to create a folder called "onnx"
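A hedged note on where: judging by the relative path in the traceback (`'onnx/clip.onnx'`), the folder is resolved against the working directory of the script, which appears to be `/workspace/voltaML-fast-stable-diffusion` inside the container (an assumption based on the file paths shown in the traceback, not something the thread confirms). A minimal sketch of creating it before the export runs:

```python
import os

# Sketch, assuming the process working directory is the repo root
# (/workspace/voltaML-fast-stable-diffusion in the container), since the
# traceback shows torch.onnx.export failing on the relative path
# 'onnx/clip.onnx'. Creating the folder first avoids the FileNotFoundError.
os.makedirs("onnx", exist_ok=True)  # no-op if the folder already exists
```

The same effect can be had from a shell inside the container with `mkdir -p onnx`, run from the repository root.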

VoltaML avatar Dec 19 '22 12:12 VoltaML

where?

veonua avatar Dec 28 '22 07:12 veonua

Please pull the new docker and try it again. The issue should be fixed.

harishprabhala avatar Dec 29 '22 09:12 harishprabhala

I'm on v0.3, what is new?

veonua avatar Dec 29 '22 14:12 veonua

I'm on v0.3, what is new?

Fixes: CFG now works with TensorRT, -1 seed is supported, and inference requests are now handled smoothly.

VoltaML avatar Dec 29 '22 15:12 VoltaML

Sorry for my English. I am on Docker v0.3 and get the same error. What path needs to be created to fix this issue?

veonua avatar Dec 29 '22 15:12 veonua

I get the same error trying voltaML for the first time. Is this a dead repo?

aifartist avatar Jan 28 '23 03:01 aifartist

I get the same error trying voltaML for the first time. Is this a dead repo?

No, quite the opposite. You can take a look at the draft in the Pull Requests tab; I will be merging it into the experimental branch today. TRT is still in the works for the new WebUI, but you can expect it soon.

Stax124 avatar Jan 28 '23 08:01 Stax124

@Stax124 I've been doing perf analysis of SD in the context of A1111 and found a 3x perf boost for 4090s doing basic one-image batches at 512x512, euler_a, SD v2.1. I had 13.5 it/s before and now get 39.5 it/s. This is on a fast i9-13900 with a 4090 on Ubuntu, and it came from upgrading cuDNN from v8.5 to v8.7. I brought this up with the PyTorch team on GitHub; they created a PR and will be fixing it soon, if it hasn't already been merged into PyTorch 2.0.0.

The best I got when I tried VoltaML was only 18 it/s, and that was even after I upgraded cuDNN to v8.7 and PyTorch to the nightly build.

I also found the cause of the error above: the undefined variable occurs only if you try a TRT generation before clicking accelerate and letting it compile (or whatever it does). But even then it still doesn't work. I haven't bothered debugging it myself, and given the poor SD perf I'm not sure TRT is really going to be significantly better than what I can already do.
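The ordering bug described here can be sketched in a few lines (a minimal reconstruction with hypothetical names, modeled on the traceback, not the actual volta_accelerate.py code): the cache variable is only ever assigned inside the load path, so the first TRT request hits `NameError` at the comparison. Giving it a module-level default removes the error and makes the first-request case fall through to loading.

```python
# Hypothetical stand-ins for infer_trt/load_trt from the traceback.
loaded_model = None  # module-level default; without this line, the first
                     # request raises NameError at the comparison below

def load_trt(model_path):
    global loaded_model
    # ... engine build / ONNX export would happen here in the real code ...
    loaded_model = model_path

def infer_trt(model_path):
    if loaded_model != model_path:  # safely compares against None on the first call
        load_trt(model_path)
    return loaded_model
```

On the first call the comparison sees `None`, triggers the load, and caches the model path; subsequent calls with the same path skip the load.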

FYI, people on Windows report something closer to a 2x improvement with my changes and can't seem to reproduce my easy-to-repro 39.5 number. I do have a dual-boot setup but haven't bothered to find out what's wrong on Windows; I'm happy with my Ubuntu perf. FYI2, I've also discovered, surprisingly, that single-thread CPU perf makes a huge difference for what I expected was mostly GPU processing: a 128-thread Threadripper can't do a serial stream of one image at a time anywhere near as fast as a 5.8 GHz Raptor Lake. I've posted the details elsewhere. I'd get a Threadripper if and only if I were doing SD on the CPU. But I'm not.

If VoltaML does have some magic that provides a good perf speedup, I'd be happy to help test it. Currently I can build the nightly Torch 2.0 and can use CUDA 12.0 if needed. You might consider adding Euler_a support, as I find it faster than most (all?) of the others, although I haven't tested all of them.

aifartist avatar Jan 28 '23 21:01 aifartist

@aifartist We already found a patch for the 4090 and are aware of the problems with the Ada Lovelace architecture. Documentation for this bug will be available in our own docs; for now we will help people who ask (or I might display a clear warning). With the patch, we got 49 it/s (approximate, with xFormers).

Also, this bug is already patched on the experimental branch; it just needed a folder created before the export runs. I forgot to close the issue (I have a lot of work on my plate), so thanks for bringing my attention back to it.

On the last note: yes, I believe in TRT if we can make it work with lower VRAM, because it really brings a big performance uplift. If you want to test the new things we are adding, please check the experimental branch. I would like more people to help me, or at least give me their honest opinions.

Last: the PyTorch path supports most of the K-Diffusion samplers (Euler Ancestral included) with Karras sigmas, while TRT supports only the diffusers samplers without Karras sigmas (we still have an Euler A there, just a worse one).

Thanks for your interest in this project and have a nice rest of your day.

(Also if you want to chat with me or other devs, come to our discord: https://discord.com/invite/pY5SVyHmWm, I will happily chat about this topic with you)

Stax124 avatar Jan 28 '23 21:01 Stax124