voltaML-fast-stable-diffusion
TRT Inference Not Working [volta_trt_flash]
[E] 3: [executionContext.cpp::validateInputBindings::1831] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::validateInputBindings::1831, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [2,4,64,96] for bindings[0] exceed min ~ max range at index 3, maximum dimension in profile is 64, minimum dimension in profile is 64, but supplied dimension is 96.
Exception in thread Thread-87:
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 544, in infer_trt
images = demo.infer(prompt, negative_prompt, args.height, args.width, verbose=args.verbose, seed=args.seed)
File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 404, in infer
noise_pred = self.runEngine(self.unet_model_key, {"sample": sample_inp, "timestep": timestep_inp, "encoder_hidden_states": embeddings_inp})['latent']
File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 271, in runEngine
return engine.infer(feed_dict, self.stream)
File "/workspace/voltaML-fast-stable-diffusion/utilities.py", line 108, in infer
raise ValueError(f"ERROR: inference failed.")
ValueError: ERROR: inference failed.
RTX 4090; used the original Dockerfile from the volta_trt_flash branch.
Please wait for some time. We are updating the branch and pushing a new Docker image.
Please try out our new Docker image.
It initially worked fine at 512x512, but when I tried to generate an image at 512x768 it fell apart with the same error.
[E] 3: [executionContext.cpp::validateInputBindings::1831] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::validateInputBindings::1831, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [1,4,96,64] for bindings[0] exceed min ~ max range at index 2, maximum dimension in profile is 64, minimum dimension in profile is 64, but supplied dimension is 96.
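For reference, the 96 in that message is just the requested resolution divided by 8: Stable Diffusion's VAE downsamples by a factor of 8, so the UNet "sample" binding has spatial dims height // 8 by width // 8, and the static engine profile only covers 64x64 latents, i.e. 512x512 pixels. A quick check in plain Python:

# The SD VAE downsamples by 8, so the UNet latent ("sample") input is (batch, 4, H // 8, W // 8).
for height, width in [(512, 512), (512, 768), (768, 768)]:
    print(f"{height}x{width} px -> latent {height // 8}x{width // 8}")
# 512x512 -> 64x64 fits the profile; 512x768 -> 64x96 and 768x768 -> 96x96 exceed the max of 64.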
After this error, I went back to 512x512 and got the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2548, in __call__
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2528, in wsgi_app
response = self.handle_exception(e)
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2525, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1822, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1820, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1796, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/workspace/voltaML-fast-stable-diffusion/app.py", line 88, in upload_file
pipeline_time = infer_trt(saving_path=saving_path,
File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 541, in infer_trt
pipeline_time = demo.infer(prompt, negative_prompt, args.height, args.width, verbose=args.verbose, seed=args.seed)
File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 401, in infer
noise_pred = self.runEngine(self.unet_model_key, {"sample": sample_inp, "timestep": timestep_inp, "encoder_hidden_states": embeddings_inp})['latent']
File "/workspace/voltaML-fast-stable-diffusion/volta_accelerate.py", line 269, in runEngine
return engine.infer(feed_dict, self.stream)
File "/workspace/voltaML-fast-stable-diffusion/utilities.py", line 108, in infer
raise ValueError(f"ERROR: inference failed.")
ValueError: ERROR: inference failed.
It is not working anymore, even at 512x512.
The engine files have not been compiled with dynamic shapes, which is why you might be getting this error. Are you accelerating it through the UI or through the CLI?
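For context, "compiled with dynamic shapes" means the engine was built with a TensorRT optimization profile whose min/max ranges cover more than one latent size; a static build pins the "sample" input to exactly 64x64. A dynamic-shape build registers a profile roughly like the sketch below (a minimal illustration with the TensorRT Python API; the input names match the feed_dict in the traceback above, but the exact ranges and the ONNX parsing step are assumptions, not voltaML's actual build code):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()
# `network` would normally come from parsing the exported UNet ONNX graph (omitted here);
# the relevant part is the dynamic-shape profile attached to the builder config.
profile = builder.create_optimization_profile()
# Latent ranges covering e.g. 256x256 up to 1024x1024 pixels (32..128 latents), batch 2 for CFG.
profile.set_shape("sample", min=(2, 4, 32, 32), opt=(2, 4, 64, 64), max=(2, 4, 128, 128))
# CLIP text embeddings for SD 1.5: 77 tokens, hidden size 768 (fixed).
profile.set_shape("encoder_hidden_states", min=(2, 77, 768), opt=(2, 77, 768), max=(2, 77, 768))
config.add_optimization_profile(profile)
# engine_bytes = builder.build_serialized_network(network, config)

With only a static profile (min == max == 64), any request whose latent dims differ from 64 fails the validateInputBindings check shown above.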
Built an image using the Dockerfile and accelerated through the web UI.
We have enabled dynamic shapes. Please try now.
Same error. I built an engine locally using the following command:
python volta_accelerate.py \
  --prompt 'a highly detailed matte painting of a man on a hill watching a rocket launch in the distance by studio ghibli, makoto shinkai, by artgerm, by wlop, by greg rutkowski, volumetric lighting, octane render, 4 k resolution, trending on artstation, masterpiece' \
  --height 512 --width 512 \
  --model-path 'runwayml/stable-diffusion-v1-5' \
  --hf-token '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' \
  --backend 'TRT' \
  --output-dir 'static/output' \
  -v --build-dynamic-shape
I got the following errors when I tried to generate images at different sizes, for example 768x768 or 512x768:
[E] 3: [executionContext.cpp::validateInputBindings::1831] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::validateInputBindings::1831, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [2,4,96,96] for bindings[0] exceed min ~ max range at index 2, maximum dimension in profile is 64, minimum dimension in profile is 64, but supplied dimension is 96. )
Traceback (most recent call last):
File "/media/vyro/vyro/MachineLearning/Volta-ML/voltaML-fast-stable-diffusion/volta_accelerate.py", line 688, in infer_trt
pipeline_time = trt_model.infer(prompt, negative_prompt, args.height, args.width, guidance_scale=args.guidance_scale, verbose=args.verbose, seed=args.seed, output_dir=args.output_dir)
File "/media/vyro/vyro/MachineLearning/Volta-ML/voltaML-fast-stable-diffusion/volta_accelerate.py", line 460, in infer
noise_pred = self.runEngine(self.unet_model_key, {"sample": sample_inp, "timestep": timestep_inp, "encoder_hidden_states": embeddings_inp})['latent']
File "/media/vyro/vyro/MachineLearning/Volta-ML/voltaML-fast-stable-diffusion/volta_accelerate.py", line 324, in runEngine
return engine.infer(feed_dict, self.stream)
File "/media/vyro/vyro/MachineLearning/Volta-ML/voltaML-fast-stable-diffusion/utilities.py", line 108, in infer
raise ValueError(f"ERROR: inference failed.")
ValueError: ERROR: inference failed.
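Since the profile max is still reported as 64 even after rebuilding with --build-dynamic-shape, one way to confirm whether dynamic ranges were actually baked into the engine is to deserialize it and print the optimization-profile ranges for each input; if min and max for the "sample" spatial dims are both 64, the flag did not take effect for that engine file. A rough sketch against the TRT 8.x Python API (the engine path is a placeholder, not the repo's actual output location):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("engine/unet.plan", "rb") as f:  # placeholder path to the built UNet engine
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    if engine.binding_is_input(i):
        name = engine.get_binding_name(i)
        # (min, opt, max) shapes of optimization profile 0 for this input
        print(name, engine.get_profile_shape(0, name))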