
Tensor labels cause unexpected behaviour in storage policies

Mattdl opened this issue • 3 comments

Hi, I don't have time for a PR atm, but I'm quickly reporting a bug in the target iteration in storage_policy.py. Iterating over the targets in ClassBalancedBuffer and ParametricBuffer will treat each sample as a new class when the targets are tensors.

For example

for idx, target in enumerate(new_data.targets):
    # If target is a tensor, each sample is seen as a separate class,
    # because equal-valued tensors are still distinct objects (different ids).
    # Quick fix: convert the target to a plain int before using it as a key.
    target = int(target)
    ...
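To illustrate the mechanism (a minimal pure-Python sketch, not actual Avalanche or PyTorch code): torch.Tensor hashes by object identity rather than by value, so two tensors holding the same label land in different dictionary buckets. The FakeTensor class below is a hypothetical stand-in that mimics that behaviour.

```python
# Sketch only: FakeTensor mimics torch.Tensor's identity-based hashing;
# it is NOT real PyTorch code.
class FakeTensor:
    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value

    # Like torch.Tensor, hash by object identity, not by value.
    __hash__ = object.__hash__


targets = [FakeTensor(3), FakeTensor(3)]

# Grouping by the raw "tensor": equal labels end up in separate buckets.
by_tensor = {}
for t in targets:
    by_tensor.setdefault(t, []).append(t)

# Grouping after int() conversion: the labels collapse into one class.
by_int = {}
for t in targets:
    by_int.setdefault(int(t), []).append(t)

print(len(by_tensor))  # 2 -- each tensor counted as its own class
print(len(by_int))     # 1 -- the quick fix from above
```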

For me this happened when creating a dataset_benchmark: applying wrap_with_task_labels first would return tensor labels (even if the original dataset returned ints).

The quick fix above worked, but the solution below, which explicitly passes a target transform, did not:

target_to_int = transforms.Lambda(lambda x: int(x))

return dataset_benchmark(
    train_datasets=wrap_with_task_labels(train_sets, target_transform=target_to_int),
    ...
)

def wrap_with_task_labels(datasets, target_transform):
    return [
        AvalancheDataset(ds, task_labels=idx, target_transform=target_transform)
        for idx, ds in enumerate(datasets)
    ]
          

Hope it helps!

Mattdl avatar Apr 21 '22 08:04 Mattdl

Thanks. This is definitely a bug. Do you have a small script that we can use to reproduce it?

AntonioCarta avatar Apr 21 '22 14:04 AntonioCarta

@lrzpellegrini

AntonioCarta avatar Jul 26 '23 09:07 AntonioCarta

Both ClassBalancedBuffer and ParametricBuffer now convert each target to int explicitly, so this is no longer an issue. Some benchmarks may still return targets as tensors instead of ints (which is an error). @Mattdl do you remember which benchmark you were using?
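The explicit conversion amounts to something like the following (a sketch of the defensive pattern, not the actual storage_policy.py code; _group_by_class is a hypothetical helper name):

```python
def _group_by_class(targets):
    # Sketch of the defensive pattern: normalize every target to a plain
    # int before using it as a grouping key, so tensor targets with equal
    # values collapse into the same class.
    groups = {}
    for idx, target in enumerate(targets):
        groups.setdefault(int(target), []).append(idx)
    return groups

# Works the same whether targets are ints or 0-dim tensors:
print(_group_by_class([0, 1, 0, 1, 1]))  # {0: [0, 2], 1: [1, 3, 4]}
```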

lrzpellegrini avatar Jul 26 '23 09:07 lrzpellegrini