Tensor labels: unexpected behaviour in storage policies
Hi, I don't have time for a PR at the moment, but I'm quickly reporting a bug in how targets are iterated in storage_policy.py.
Iterating over the targets in ClassBalancedBuffer and ParametricBuffer treats each sample as a new class when targets is a tensor.
For example:
for idx, target in enumerate(new_data.targets):
    # if target is a tensor, each one is seen as a separate class
    # (different tensor objects have different ids, even for equal values)
    # Quick fix:
    target = int(target)
    ...
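To make the failure mode concrete: tensors hash by object identity, so two tensors holding the same value land in different buckets when used as grouping keys. The sketch below reproduces this without requiring PyTorch, using a hypothetical FakeTensor stand-in that mimics identity hashing; the int() conversion collapses the groups as intended.

```python
class FakeTensor:
    """Stand-in for a 0-dim torch.Tensor: hashes by object identity,
    which is the behaviour that breaks class grouping here."""
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return self.value
    def __hash__(self):
        return id(self)           # identity-based, like torch.Tensor
    def __eq__(self, other):
        return self is other      # simplified identity equality

targets = [FakeTensor(3), FakeTensor(3), FakeTensor(5)]

# Grouping by the raw tensor-like targets: every sample is its own "class".
seen = {}
for t in targets:
    seen.setdefault(t, []).append(t)
print(len(seen))   # 3 -- two samples of class 3 are split apart

# Grouping after int() conversion: classes merge correctly.
fixed = {}
for t in targets:
    fixed.setdefault(int(t), []).append(t)
print(len(fixed))  # 2 -- classes 3 and 5, as intended
```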
For me this happened when creating a dataset_benchmark: after applying wrap_with_task_labels, the datasets returned tensor labels (even though the original datasets returned ints).
The quick fix above worked, but the alternative below, explicitly passing a target transform, did not:
target_to_int = transforms.Lambda(lambda x: int(x))
return dataset_benchmark(
    train_datasets=wrap_with_task_labels(train_sets, target_transform=target_to_int),
    ...
)

def wrap_with_task_labels(datasets, target_transform):
    return [AvalancheDataset(ds, task_labels=idx, target_transform=target_transform)
            for idx, ds in enumerate(datasets)]
Hope it helps!
Thanks. This is definitely a bug. Do you have a small script that we can use to reproduce it?
@lrzpellegrini
Both ClassBalancedBuffer and ParametricBuffer now convert each target to int explicitly, so this is no longer an issue. It's possible that some benchmarks still return targets as tensors instead of ints (which is an error). @Mattdl do you remember which benchmark you were using?