avalanche
GenericCLScenario does not support data for regression tasks: TypeError: unhashable type: 'list'
I am working on a regression task, and I found that none of the generators (filelist_benchmark, dataset_benchmark, tensors_benchmark, paths_benchmark, nc_benchmark, ni_benchmark) support regression. The cause seems to be the function origin_stream.benchmark.get_classes_timeline
: I guess this function tries to collect the unique classes in each experience, while the labels of a regression task are not discrete.
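That guess can be checked in plain Python: with a TensorDataset, each regression target ends up as a per-sample list of floats, and calling set() on a sequence of lists (which get_classes_timeline effectively does to count unique classes) raises exactly this error:

```python
# Continuous targets as the classification machinery sees them:
# one list of floats per sample.
targets = [[0.5], [1.2], [0.5]]

try:
    set(targets)  # counting "unique classes" over list targets
except TypeError as err:
    print(err)  # -> unhashable type: 'list'

# Discrete classification targets are plain ints, so the same call works.
print(set([0, 1, 0]))  # -> {0, 1}
```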
Here is a minimal working example:
import torch
from torch.utils.data import TensorDataset

from avalanche.benchmarks.generators import (
    filelist_benchmark, dataset_benchmark, tensors_benchmark,
    paths_benchmark, nc_benchmark, ni_benchmark,
)

train_datasets = (
    TensorDataset(torch.randn(100, 10), torch.randn(100, 1)),
    TensorDataset(torch.randn(100, 10), torch.randn(100, 1)),
)
test_datasets = (
    TensorDataset(torch.randn(10, 10), torch.randn(10, 1)),
    TensorDataset(torch.randn(10, 10), torch.randn(10, 1)),
)

# Create the continual learning scenario
scenario = dataset_benchmark(train_datasets=train_datasets, test_datasets=test_datasets)
for experience in scenario.train_stream:  # TypeError: unhashable type: 'list'
    print("task ", experience.task_label)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[10], line 19
17 # Create the continual learning scenario
18 scenario = dataset_benchmark(train_datasets=train_datasets, test_datasets=test_datasets)
---> 19 for experience in scenario.train_stream:
20 print("task ", experience.task_label)
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\generic_scenario.py:593, in SequenceCLStream.__iter__(self)
591 exp: TCLExperience
592 for i in range(len(self)):
--> 593 exp = self[i]
594 yield exp
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\generic_scenario.py:616, in SequenceCLStream.__getitem__(self, item)
612 raise IndexError("Experience index out of bounds" + str(int(item)))
614 curr_exp = item if self.slice_ids is None else self.slice_ids[item]
--> 616 exp = self._make_experience(curr_exp)
617 if self.set_stream_info:
618 exp.current_experience = curr_exp
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\dataset_scenario.py:663, in FactoryBasedStream._make_experience(self, experience_idx)
662 def _make_experience(self, experience_idx: int) -> TDatasetExperience:
--> 663 a = self.benchmark.experience_factory(self, experience_idx) # type: ignore
664 return a
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\classification_scenario.py:67, in _default_classification_experience_factory(stream, experience_idx)
64 def _default_classification_experience_factory(
65 stream: "ClassificationStream", experience_idx: int
66 ):
---> 67 return ClassificationExperience(
68 origin_stream=stream, current_experience=experience_idx
69 )
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\classification_scenario.py:184, in ClassificationExperience.__init__(self, origin_stream, current_experience)
173 self._benchmark: ClassificationScenario = origin_stream.benchmark
175 dataset: TClassificationDataset = origin_stream.benchmark.stream_definitions[
176 origin_stream.name
177 ].exps_data[current_experience]
179 (
180 classes_in_this_exp,
181 previous_classes,
182 classes_seen_so_far,
183 future_classes,
--> 184 ) = origin_stream.benchmark.get_classes_timeline(
185 current_experience, stream=origin_stream.name
186 )
188 super().__init__(
189 origin_stream,
190 dataset,
(...)
195 future_classes,
196 )
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\dataset_scenario.py:531, in ClassesTimelineCLScenario.get_classes_timeline(self, current_experience, stream)
501 def get_classes_timeline(
502 self, current_experience: int, stream: str = "train"
503 ) -> Tuple[
(...)
507 Optional[List[int]],
508 ]:
509 """
510 Returns the classes timeline given the ID of a experience.
511
(...)
529 the benchmark is initialized by using a lazy generator.
530 """
--> 531 class_set_current_exp = self.classes_in_experience[stream][current_experience]
533 if class_set_current_exp is not None:
534 # May be None in lazy benchmarks
535 classes_in_this_exp = list(class_set_current_exp)
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\classification_scenario.py:268, in _LazyClassesInClassificationExps.__getitem__(self, exp_id)
266 def __getitem__(self, exp_id: Union[int, slice]) -> LazyClassesInExpsRet:
267 indexing_collate = _LazyClassesInClassificationExps._slice_collate
--> 268 result = manage_advanced_indexing(
269 exp_id, self._get_single_exp_classes, len(self), indexing_collate
270 )
271 return result
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\utils\dataset_utils.py:335, in manage_advanced_indexing(idx, single_element_getter, max_length, collate_fn)
333 elements: List[X] = []
334 for single_idx in indexes_iterator:
--> 335 single_element = single_element_getter(int(single_idx))
336 elements.append(single_element)
338 if len(elements) == 1:
File d:\anaconda3\envs\py39\lib\site-packages\avalanche\benchmarks\scenarios\classification_scenario.py:284, in _LazyClassesInClassificationExps._get_single_exp_classes(self, exp_id)
281 if targets is None:
282 return None
--> 284 return set(targets)
TypeError: unhashable type: 'list'
I think a generator for regression tasks is necessary.
Hi, I agree with you. This feature is on the roadmap and planned for the next release. For now, you can add a fake target attribute to your data by assigning a list of zeros with the same length as the dataset to dataset.targets.
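A minimal sketch of that workaround, assuming Avalanche only needs the targets attribute to be a sequence of hashable, discrete labels (the helper name here is mine, not part of the library):

```python
import torch
from torch.utils.data import TensorDataset

def with_fake_targets(dataset):
    # Avalanche's classification machinery calls set(dataset.targets),
    # so attach one dummy discrete label per sample.
    dataset.targets = [0] * len(dataset)
    return dataset

train_datasets = tuple(
    with_fake_targets(TensorDataset(torch.randn(100, 10), torch.randn(100, 1)))
    for _ in range(2)
)

# set(targets) is now well-defined: a single fake class 0.
print(set(train_datasets[0].targets))  # -> {0}
```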
I met the same problem while trying to use Avalanche; I'm doing physics-related research.
This is fixed in the latest version. The new benchmark generators don't require class or task labels (example).
The old ones are still available for backward compatibility.
Thanks, this problem is now solved using 0.5.0
. But when I directly use benchmark_from_datasets
to generate data streams and train the streams with Naive
, a new error, no attribute 'targets_task_labels'
, is thrown. So I suggest adding a simple example like this to ease onboarding for new researchers in the physics field, who heavily use the library for regression tasks.
Yes, we need to do some work on the strategy side (I'm working on that). However, keep in mind that most methods are designed for classification, which means that we can easily remove the task labels (when unused) but they would not make sense for regression tasks.
Avalanche strategies are designed to be general (apart from these minor fixes that we have to do), so if a method can work with regression tasks then you should be able to use it without any issues by changing the loss function. However, the user needs to understand the method to know if it supports regression, which may be difficult for non-expert users. We don't really have a good solution for this because each method and task requires different considerations that are hard to generalize. For example, many methods are not directly applicable but very easy to generalize by changing few lines of code if you understand what they are doing.
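The "changing the loss function" idea can be sketched without Avalanche at all: a plain PyTorch loop standing in for a strategy's training step, where swapping nn.CrossEntropyLoss for nn.MSELoss is the only regression-specific change.

```python
import torch
from torch import nn

model = nn.Linear(10, 1)   # regression head: one continuous output
criterion = nn.MSELoss()   # instead of nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(100, 10)
y = torch.randn(100, 1)    # continuous targets, no class labels

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

print(loss.item())  # MSE after training; lower than at initialization
```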
Thanks, I'll dive into the code then~