openvino_notebooks
[235-controlnet] Fail to prepare calibration datasets
Describe the bug
Section: Prepare calibration datasets
Code block:
import datasets

num_inference_steps = 20
subset_size = 200
dataset = datasets.load_dataset("jschoormans/humanpose_densepose", split="train", streaming=True).shuffle(seed=42)

input_data = []
for batch in dataset:
    caption = batch["caption"]
    if len(caption) > tokenizer.model_max_length:
        continue
    img = batch["file_name"]
    input_data.append((caption, pose_estimator(img)))
    if len(input_data) >= subset_size // num_inference_steps:
        break
Error output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[32], line 15
13 continue
14 img = batch["file_name"]
---> 15 input_data.append((caption, pose_estimator(img)))
16 if len(input_data) >= subset_size // num_inference_steps:
17 break
File /opt/conda/lib/python3.10/site-packages/controlnet_aux/open_pose/__init__.py:214, in OpenposeDetector.__call__(self, input_image, detect_resolution, image_resolution, include_body, include_hand, include_face, hand_and_face, output_type, **kwargs)
211 output_type = "pil"
213 if not isinstance(input_image, np.ndarray):
--> 214 input_image = np.array(input_image, dtype=np.uint8)
216 input_image = HWC3(input_image)
217 input_image = resize_image(input_image, detect_resolution)
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'dict'
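For reference, the traceback suggests that in streaming mode batch["file_name"] comes back as a dict rather than a decoded image, and OpenposeDetector cannot convert a dict to an ndarray. Below is a minimal sketch of how one could inspect a sample and coerce it to a PIL image before calling pose_estimator (the dict layout with a "bytes" key is an assumption, not confirmed for this dataset):

import io
from PIL import Image

# Inspect what the streaming dataset actually yields for the image column.
sample = next(iter(dataset))
print(type(sample["file_name"]))

# If the column arrives as a dict of raw bytes (assumption), decode it first.
img_field = sample["file_name"]
if isinstance(img_field, dict) and "bytes" in img_field:
    img = Image.open(io.BytesIO(img_field["bytes"])).convert("RGB")
else:
    img = img_field  # already a PIL image or ndarray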
Installation instructions (Please mark the checkbox)
[*] I followed the installation guide at https://github.com/openvinotoolkit/openvino_notebooks#-installation-guide to install the notebooks.
I am running the notebook on Kaggle.
Additional context
Also, on Kaggle the %load_ext skip_kernel_extension line fails to execute, so I had to delete it and all the %%skip lines that skip cells when the quantization checkbox is not checked.
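One possible workaround (just a sketch of the approach, not the notebook's actual extension) is to drop the %%skip magic and guard those cells with a plain Python if-statement instead:

# Instead of starting a cell with:
#   %%skip not $to_quantize.value
# wrap the cell body in an ordinary condition check:
if to_quantize.value:
    # ... original quantization code from the cell goes here ...
    pass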
@l-bat Can you check, please?
@MarkYav, thanks for reporting the problem. Unfortunately, I'm not able to reproduce the provided error. Could you share some additional details?
Which datasets, controlnet-aux, pytorch, transformers and diffusers versions do you use?
@l-bat Thank you for your prompt reply!
On Kaggle:
datasets 2.17.1
controlnet_aux 0.0.7
pytorch-ignite 0.4.13
pytorch-lightning 2.1.3
transformers 4.37.0
diffusers 0.26.3
The full list can be found here: https://pastebin.com/WGGLrYT5
Also, it runs on Colab Enterprise with the installed libs (see the list below). But the problem is that after quantization the generated image is bad (see photo).
It extracts the pose correctly, though:
Installed libs Colab Enterprise: https://pastebin.com/U0GtqC4r
@MarkYav, what inference device do you use? It looks like a problem with the calibration data or with the device.
@l-bat I use Colab Enterprise on Google Cloud. Here are the details:
@MarkYav, do you use CPU here?
@l-bat Yes
@MarkYav, could you please try downloading unet_calibration_data.pkl and loading it into the notebook instead of preparing a calibration dataset? This will help me understand whether there is a problem with the data preparation.
import pickle

with open('unet_calibration_data.pkl', 'rb') as f:
    unet_calibration_data = pickle.load(f)
@l-bat I tried to run the notebook using the code provided.
I changed:
%%skip not $to_quantize.value

CONTROLNET_INT8_OV_PATH = Path("controlnet-pose_int8.xml")
UNET_INT8_OV_PATH = Path("unet_controlnet_int8.xml")

# These are the commented-out lines:
# if not (CONTROLNET_INT8_OV_PATH.exists() and UNET_INT8_OV_PATH.exists()):
#     unet_calibration_data = collect_calibration_data(ov_pipe, subset_size=subset_size)

# These are the newly inserted lines:
import pickle
with open('unet_calibration_data.pkl', 'rb') as f:
    unet_calibration_data = pickle.load(f)
And I got an error in the next cell:
I checked the code and the length of the downloaded unet_calibration_data is 20:
In this cell prev_idx is initially set to 0, so the inner loop goes from 0 to 19. But then we set prev_idx += num_inference_steps, so prev_idx becomes 20 and the next pass tries to index from 20 to 39 -- and here we get a list index out of range exception. See photos:
Thus, I assume there is a bug in the calibration dataset preparation OR usage.
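To make the failure mode concrete, here is a simplified sketch of the indexing pattern described above (the loop structure is paraphrased from the notebook cell, not copied verbatim):

# Simplified illustration only, not the exact notebook code.
num_inference_steps = 20
unet_calibration_data = list(range(20))    # the downloaded subset has 20 entries
input_data = [("caption", "pose")] * 10    # subset_size=200 -> 200 // 20 = 10 pairs

prev_idx = 0
for _ in input_data:
    for i in range(prev_idx, prev_idx + num_inference_steps):
        _ = unet_calibration_data[i]       # second outer pass asks for index 20..39 -> IndexError
    prev_idx += num_inference_steps

So the usage cell only works when len(unet_calibration_data) == len(input_data) * num_inference_steps.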
@MarkYav, I uploaded a small subset of unet_calibration_data, which contains 20 samples.
I forgot to mention that you should also change subset_size to 20 (instead of 200) to align input_data with unet_calibration_data.
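For clarity, the alignment works out as follows (my own arithmetic, assuming the data-collection loop shown earlier):

# With the 20-sample pickle:
subset_size = 20
num_inference_steps = 20
pairs_collected = subset_size // num_inference_steps   # 1 (caption, pose) pair in input_data
unet_samples = pairs_collected * num_inference_steps   # 20 UNet inputs -> matches the pickle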
@l-bat I tried to run it but got the same output:
Right after that I generated a picture using the NOT quantized model and got this:
I was using ov_pipe(prompt, pose, 20, negative_prompt=negative_prompt). Also, I am surprised that the generated image is square.
@MarkYav, if you can provide the output of cat /proc/cpuinfo, maybe I can find a suitable configuration to reproduce the error.
@l-bat
I tried another machine type on Google Cloud: before I was using e2-highmem-8, but after I changed to n2-highmem-8 everything worked. Also, I used the latest version of the tutorial notebook.
@MarkYav could we close the issue?
@l-bat Yes, please.