tvm icon indicating copy to clipboard operation
tvm copied to clipboard

[Bug] Tensor advanced indexing with boolean mask not working

Open NVukobrat opened this issue 3 years ago • 4 comments

Hi,

I’m trying to index the tensor in the following fashion:

x[0, mask]

where x is the input tensor (e.g. (2, 3) shape), while the mask is a boolean array (e.g. (3,) shape). For a better example, here are example values:

x = [ 
  [1, 2, 3], 
  [4, 5, 6] 
] 
mask = [True, True, False]

Therefore, when using NumPy as a reference, the results of x[0, mask] should be [1, 2] tensor. However, as TVM output I’m getting something like [2, 2, 1]. Can someone provide me with more information on this topic?

Also, for easier reproduction, below is TVM test similar to the example below (test is added and run from the following test file):

tests/python/relay/test_op_level3.py

Unit test:

def test_adv_index_with_mask(target, dev, executor_kind):
    def verify_adv_index(data_shape, index, bool_mask_shape):
        dtype = "float32"
        inputs = [relay.var("data", relay.TensorType(data_shape, dtype))]
        np_data = np.random.uniform(size=data_shape).astype(dtype)
        np_indices = np.ones(bool_mask_shape, dtype=bool)
        np_indices[-1] = False
        inputs.append(relay.var("index_{}".format(index), shape=(), dtype="int"))
        inputs.append(relay.var("mask_{}".format(index), relay.TensorType(bool_mask_shape, "bool")))
        np_out = np_data[index, np_indices]
        np_args = [np_data] + [index, np_indices]
        out = relay.op.adv_index(inputs)

        func = relay.Function(inputs, out)
        op_res = relay.create_executor(executor_kind, device=dev, target=target).evaluate(func)(
            *np_args
        )
        tvm.testing.assert_allclose(op_res.numpy(), np_out, rtol=1e-5)

    verify_adv_index((2, 3), 1, (3,))

Lastly, this issue is also posted yesterday on the discussion board

NVukobrat avatar Aug 03 '22 13:08 NVukobrat

There were related discussion on this topic. https://github.com/apache/tvm/issues/9237#issuecomment-992428813 https://github.com/apache/tvm/pull/7373

masahi avatar Aug 04 '22 01:08 masahi

I see. Thanks for the references. I'll try to find a workaround for this issue for now.

However, I agree that adv_index should also support boolean dtype as it'll be more like the NumPy way of indexing.

Thanks again for your help @masahi 😄

NVukobrat avatar Aug 04 '22 10:08 NVukobrat

Hey @masahi,

Are there any thoughts/plans on how this will be supported?

I tried the approach by using argwhere in order to determine indices and afterwards feed those results to the adv_index function. However, dynamic shapes from argwhere and problematic to handle.

Therefore, pinging you here if you have more info/insights regarding the proper approach to handling this kind of problem.

NVukobrat avatar Aug 08 '22 09:08 NVukobrat

No plan from my side at least. Based on the conversation in https://github.com/apache/tvm/pull/7373, it seems that the proper way is to improve adv_index to support boolean indices.

masahi avatar Aug 08 '22 10:08 masahi

Hi @NVukobrat, I finally took a look at this issue in detail and came up with a simple solution, https://github.com/apache/tvm/pull/13306. Your example is used as the test case.

UPDATE: Oh I just noticed that you also tried using argwhere... Yes, argwhere introduces dynamic shape, but it is inevitable since indexing by mask, when the mask is not a constant tensor, is inherently a dynamic operation.

masahi avatar Nov 07 '22 10:11 masahi

Reopened since the PR hasn't been merged yet

masahi avatar Nov 07 '22 10:11 masahi