
Alibi Explainer on Object detection model (say YOLO V4 model)

Open aniketzz opened this issue 2 years ago • 11 comments

I am trying to explain the objects detected by an object detection model, treating it like an image classification model, using the Seldon Alibi AnchorImage algorithm. I modified my prediction function so that its output is the image and the image class.

However, I am getting the following error.

explainer = AnchorImage(predict_fn, image_shape, segmentation_fn=segmentation_fn, segmentation_kwargs=kwargs, images_background=None)

  File "~/python3.7/site-packages/alibi/explainers/anchor_image.py", line 348, in __init__
    self.predictor = self._transform_predictor(predictor)

  File "~/python3.7/site-packages/alibi/explainers/anchor_image.py", line 605, in _transform_predictor
    if np.argmax(predictor(np.zeros((1,) + self.image_shape)).shape) == 0:

AttributeError: 'list' object has no attribute 'shape'

Can anyone guide me on how to make it right? I think the explainer is looking for a list of the predicted classes instead of just a single class prediction.

aniketzz avatar Sep 06 '21 04:09 aniketzz

The output of the prediction function should be either the prediction probabilities for every possible class or the predicted class of the image. I see you have modified the prediction function to output 2 things - the image detected and the predicted class. This will not work as the explainer expects just the predicted class and so fails because it finds a list of 2 things in the output. Can you modify the prediction function so that it only outputs the predicted class?
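To illustrate the failure, here is a minimal sketch that mimics the probe line visible in the traceback (the predictor is called on a zero batch and `.shape` is read from the result). The helper `probe`, both toy predictors, and the small `image_shape` are hypothetical and only stand in for the real model; this is not the library's code.

```python
import numpy as np

image_shape = (8, 8, 3)  # hypothetical small image, only for the demo


def probe(predictor, image_shape):
    """Call the predictor on a zero batch and read .shape, as the traceback line does."""
    return predictor(np.zeros((1,) + image_shape)).shape


def list_predictor(batch):
    # Hypothetical predictor returning [image, class_id], like the one in the question.
    return [batch[0], 3]


def array_predictor(batch):
    # Hypothetical predictor returning one class id per image as a numpy array.
    return np.asarray([3] * len(batch))


try:
    probe(list_predictor, image_shape)
    list_error = None
except AttributeError as err:
    # A Python list has no .shape attribute, so the probe fails here.
    list_error = str(err)

array_shape = probe(array_predictor, image_shape)
print(list_error)
print(array_shape)
```

Returning a numpy array instead of a list is what lets the probe read `.shape` at all.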

jklaise avatar Sep 06 '21 08:09 jklaise

I changed the prediction function to output the predicted class_id only. But the error remains the same. It is expecting a list so I guess this approach won't work.

aniketzz avatar Sep 06 '21 09:09 aniketzz

Can you give us the exact type of both the input and output of your prediction function? They should both be numpy arrays; what are their dimensions?

The prediction function is also assumed to work on batches, so any input to the explainer should have leading batch dimension 1.

jklaise avatar Sep 06 '21 09:09 jklaise

The input is an image as numpy array and the output is an integer class_id. Input dimension is (486, 729, 3) and output is an integer value between 0 and 79

aniketzz avatar Sep 06 '21 09:09 aniketzz

Does your prediction function work on batches? The prediction function for one image should take in an array of shape (1, 486, 729, 3) and return an array of shape (1, 1).
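One way to answer this question is a quick shape check before building the explainer. The sketch below is only an assumption about how such a check could look: `check_batched_predictor` and `dummy_predict_fn` are hypothetical names, and the dummy stands in for the real model wrapper.

```python
import numpy as np


def check_batched_predictor(predict_fn, image_shape, batch_sizes=(1, 3)):
    """Feed zero batches of several sizes and verify the output's leading dimension."""
    shapes = []
    for n in batch_sizes:
        batch = np.zeros((n,) + tuple(image_shape), dtype=np.float32)
        out = predict_fn(batch)
        if not isinstance(out, np.ndarray):
            raise TypeError(f"expected np.ndarray, got {type(out).__name__}")
        if out.ndim == 0 or out.shape[0] != n:
            raise ValueError(f"batch of {n} images gave output of shape {out.shape}")
        shapes.append(out.shape)
    return shapes


def dummy_predict_fn(batch):
    # Hypothetical stand-in for the real model: predicts class 0 for every image.
    return np.zeros((len(batch), 1), dtype=np.int64)


shapes = check_batched_predictor(dummy_predict_fn, (486, 729, 3))
print(shapes)
```

If the real prediction function raises here, the explainer will most likely fail on it as well.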

jklaise avatar Sep 06 '21 09:09 jklaise

It just works on 1 batch size.

Here is the code snippet:

`def do_detect(model, img, conf_thresh, nms_thresh, use_cuda=1):
    model.eval()
    t0 = time.time()

    if type(img) == np.ndarray and len(img.shape) == 3:  # cv2 image
        img = torch.from_numpy(img.transpose(2, 0, 1)).float().div(255.0).unsqueeze(0)
    elif type(img) == np.ndarray and len(img.shape) == 4:
        img = torch.from_numpy(img.transpose(0, 3, 1, 2)).float().div(255.0)
    else:
        print("unknow image type")
        exit(-1)

    if use_cuda:
        img = img.cuda()
    img = torch.autograd.Variable(img)

    t1 = time.time()

    output = model(img)
    # print(output)
    # print(img.shape)
    image_shape = img.shape
    pd_fn = lambda x: model(x)
    # plt.imshow(img)

    segmentation_fn = 'slic'
    kwargs = {'n_segments': 15, 'compactness': 20, 'sigma': .5}
    explainer = AnchorImage(model, image_shape, segmentation_fn=segmentation_fn,
                            segmentation_kwargs=kwargs, images_background=None)
    explanation = explainer.explain(img, threshold=.95, p_sample=.8, tau=0.50)
    # plt.imshow(explanation.anchor)
    # plt.show()
    t2 = time.time()

    print('-----------------------------------')
    print('           Preprocess : %f' % (t1 - t0))
    print('      Model Inference : %f' % (t2 - t1))
    print('-----------------------------------')

    return utils.post_processing(img, conf_thresh, nms_thresh, output),img`

This time I got the following error:

TypeError: conv2d() received an invalid combination of arguments - got (numpy.ndarray, Parameter, NoneType, tuple, tuple, tuple, int), but expected one of:

  • (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups) didn't match because some of the arguments have invalid types: (numpy.ndarray, Parameter, NoneType, tuple, tuple, tuple, int)
  • (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups) didn't match because some of the arguments have invalid types: (numpy.ndarray, Parameter, NoneType, tuple, tuple, tuple, int)

I am using this git pytorch model as an example.

aniketzz avatar Sep 06 '21 09:09 aniketzz

Seems like a few things are going on here:

  • I'm not sure what you mean by "works on 1 batch size"? The prediction function must accept an arbitrary batch size, i.e. (n, 486, 729, 3) for any n>=1, because internally the image in question is perturbed in many ways and the whole batch of perturbed images is sent to the model.
  • The predictor must take as input a numpy array, not a torch tensor.
  • The explainer must take as input a numpy array not a torch tensor (it looks like you are feeding a torch tensor to the explainer here?)
  • You are feeding model to the explainer instead of pd_fn? The prediction function that goes into the explainer must be of the form Callable[[np.ndarray], np.ndarray], i.e. a Python function taking in a numpy array (denoting a batch of images) and returning a numpy array denoting a batch of predicted classes/probabilities, although if model is a standard torch model it should already be of the required form (note that your pd_fn is equivalent to model).
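The points above can be sketched as a small wrapper that turns a per-image detector into a batched numpy-in, numpy-out function. This is a sketch under assumptions, not the model's real API: `make_batch_predictor` and `fake_detect_one` are hypothetical names, the 80 classes come from the class ids 0 to 79 mentioned earlier in the thread, and the real `detect_one` would hold the torch conversion and keep the class id of the most confident box.

```python
from typing import Callable

import numpy as np

N_CLASSES = 80  # class ids 0..79, as mentioned earlier in the thread


def make_batch_predictor(
    detect_one: Callable[[np.ndarray], int], n_classes: int = N_CLASSES
) -> Callable[[np.ndarray], np.ndarray]:
    """Wrap a per-image detector into a function mapping a batch to class scores."""

    def predict_fn(batch: np.ndarray) -> np.ndarray:
        batch = np.asarray(batch)
        if batch.ndim == 3:  # a single (H, W, C) image: add the batch dimension
            batch = batch[np.newaxis]
        # One row per image, one column per class; one-hot scores as a simple choice.
        scores = np.zeros((len(batch), n_classes), dtype=np.float32)
        for i, image in enumerate(batch):
            scores[i, detect_one(image)] = 1.0
        return scores

    return predict_fn


def fake_detect_one(image: np.ndarray) -> int:
    # Hypothetical stand-in for "run the detector on one image and return a class id".
    return int(image.mean() > 0.5)


predict_fn = make_batch_predictor(fake_detect_one)
out = predict_fn(np.ones((4, 486, 729, 3), dtype=np.float32))
print(out.shape)
```

Looping over the batch is slow but keeps a detector that only handles one image at a time usable; a detector that already accepts batches could skip the loop.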

jklaise avatar Sep 06 '21 09:09 jklaise

I made a change in my prediction function. It takes an image NumPy array along with the bounding box. Here is the prediction function.

`def perd_fn(img, boxes, class_names=None):
    import cv2
    img = np.copy(img)

    width = img.shape[1]
    height = img.shape[0]
    for i in range(len(boxes)):
        box = boxes[i]
        x1 = int(box[0] * width)
        y1 = int(box[1] * height)
        x2 = int(box[2] * width)
        y2 = int(box[3] * height)

        if len(box) >= 7 and class_names:
            cls_conf = box[5]
            cls_id = box[6]
        else:
            cls_conf = 0
            cls_id = "Unknown"

    resu = (cls_id,0.00001)
    return np.asarray(resu)`

This resolved the error, but calling explanation = explainer.explain(img, threshold=.95, p_sample=.8, tau=0.50) raises a new one:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 100 but corresponding boolean dimension is 2

When I make the resu list have length 100 (for testing), the process simply gets killed without any error.

aniketzz avatar Sep 06 '21 10:09 aniketzz

Your prediction function should take only one array, the (batched) input image, otherwise the method will not work...

jklaise avatar Sep 06 '21 10:09 jklaise

I have changed the prediction function such that it takes only images as input. But I am still getting the same error.

explanation = explainer.explain(img, threshold=.95, p_sample=.8, tau=0.50)

  File "python3.7/site-packages/alibi/explainers/anchor_image.py", line 516, in explain
    **kwargs,

  File "python3.7/site-packages/alibi/explainers/anchor_base.py", line 667, in anchor_beam
    (pos,), (total,) = self.draw_samples([()], min_samples_start)

  File "python3.7/site-packages/alibi/explainers/anchor_base.py", line 356, in draw_samples
    for i, anchor in enumerate(anchors)]

  File "python3.7/site-packages/alibi/explainers/anchor_base.py", line 356, in <listcomp>
    for i, anchor in enumerate(anchors)]

  File "python3.7/site-packages/alibi/explainers/anchor_image.py", line 129, in __call__
    covered_true = raw_data[labels][: self.n_covered_ex]

**IndexError: boolean index did not match indexed array along dimension 0; dimension is 100 but corresponding boolean dimension is 2**

aniketzz avatar Sep 06 '21 11:09 aniketzz

Something still seems to be wrong: the error says that we have a batch of 100 images internally, but the output of the predictor has 2 as its leading dimension.
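The mismatch can be reproduced with plain numpy, independent of the explainer. In this sketch the array names and the 15 columns are hypothetical; only the sizes 100 and 2 come from the error message.

```python
import numpy as np

raw_data = np.zeros((100, 15))          # 100 sampled rows, as in the error message
bad_labels = np.array([True, False])    # predictor output with leading dimension 2
good_labels = np.ones(100, dtype=bool)  # one label per sampled image

try:
    raw_data[bad_labels]
    mismatch_error = None
except IndexError as err:
    # A boolean mask must have the same length as the axis it indexes.
    mismatch_error = err

covered = raw_data[good_labels]  # works: mask length matches the 100 rows
print(type(mismatch_error).__name__, covered.shape)
```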

Can you confirm that you can feed a batch of (n, 486, 729, 3) images as input to the prediction function and receive an array of shape (n, 1) as output?

jklaise avatar Sep 06 '21 11:09 jklaise