
OutOfMemoryError with PrototypicalCalibrationBlock

Open gladdduck opened this issue 10 months ago • 5 comments

Hello, while training on my dataset with DeFRCN I encountered an issue. The base training runs smoothly, but when I attempt K-shot fine-tuning I keep getting an OutOfMemoryError.

I tried to debug it and found that the error does not occur when PCB_ENABLE is set to False.

However, with PCB_ENABLE set to True, I still hit the OutOfMemoryError on an A100-40G even after reducing IMS_PER_BATCH to 1.

Has anyone else experienced a similar issue? How was it resolved?

gladdduck avatar Apr 10 '24 13:04 gladdduck

Solution: locate the PCB module at /path/defrcn/defrcn/evaluation/calibration_layer.py. In the build_prototypes function, right after the line `all_features.append(features.cpu().data)`, add the line `features = None`.

cnjhh avatar Apr 11 '24 08:04 cnjhh
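The fix works because dropping the Python reference right after the CPU copy lets PyTorch's caching allocator reuse that memory on the next image. A minimal sketch, assuming a loop shaped like build_prototypes (everything except the two lines quoted in the thread is illustrative, not DeFRCN's actual code):

```python
import torch

def build_prototypes_sketch(model, images):
    """Hedged reconstruction of the loop around the fix; `model` stands in
    for the extract_roi_features call in DeFRCN's PCB module."""
    all_features = []
    with torch.no_grad():                  # keep no autograd graph alive
        for img in images:
            features = model(img)          # stands in for extract_roi_features
            all_features.append(features.cpu().data)
            features = None                # the fix: drop the GPU reference
    return torch.cat(all_features, dim=0)

# toy usage on CPU, with a linear layer as the feature extractor
model = torch.nn.Linear(4, 2)
imgs = [torch.randn(3, 4) for _ in range(5)]
protos = build_prototypes_sketch(model, imgs)
print(protos.shape)  # torch.Size([15, 2])
```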


thanks for your reply! this works!

gladdduck avatar Apr 12 '24 05:04 gladdduck


However, this error still occurs from time to time. The relevant code is in the build_prototypes function of calibration_layer.py, at `features = self.extract_roi_features(img, boxes)`, which in turn calls `conv_feature = self.imagenet_model(images.tensor[:, [2, 1, 0]])` inside extract_roi_features. I'm very confused by this, even though I used gc.collect() and torch.cuda.empty_cache().

gladdduck avatar Apr 19 '24 05:04 gladdduck
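This behavior is expected: gc.collect() and torch.cuda.empty_cache() can only release memory that nothing references any more, so a feature tensor that is still reachable from Python keeps its allocation. A minimal CPU-safe sketch of that rule (illustrative; empty_cache() is a no-op when CUDA is not initialized):

```python
import gc
import torch

def retained_numel():
    """Show why gc.collect()/empty_cache() did not help: they cannot
    free a tensor that the code still holds a reference to."""
    x = torch.randn(1024, 1024)   # stand-in for a retained feature tensor
    gc.collect()
    torch.cuda.empty_cache()      # returns only *unreferenced* cached blocks
    n = x.numel()                 # x is still alive: its memory stays allocated
    del x                         # drop the last reference...
    gc.collect()
    torch.cuda.empty_cache()      # ...only now can its block be released
    return n

print(retained_numel())  # 1048576
```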

features = self.extract_roi_features(img, boxes)
boxes = None
img = None
all_features.append(features.cpu().data)
features = None

The features are built from your custom dataset for the novel classes. You can solve this by reducing the number of novel classes, or by generating the features offline: instead of loading the novel data every time the model is validated, save the prototypes with the pickle module, then modify the code to load the offline ones directly during validation.

cnjhh avatar Apr 19 '24 08:04 cnjhh
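The offline workflow suggested above can be sketched as follows; the cache path and helper name are assumptions, not part of DeFRCN:

```python
import os
import pickle
import tempfile
import torch

# hypothetical cache path; in practice this would live next to the checkpoints
PROTO_CACHE = os.path.join(tempfile.gettempdir(), "pcb_prototypes.pkl")

def get_prototypes(build_fn, cache_path=PROTO_CACHE):
    """Build class prototypes once and reuse them from disk afterwards,
    so validation never re-runs the expensive GPU extraction pass."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    protos = build_fn()                 # expensive pass runs only once
    with open(cache_path, "wb") as f:
        pickle.dump(protos, f)
    return protos

# usage with a dummy builder standing in for PCB's build_prototypes
if os.path.exists(PROTO_CACHE):
    os.remove(PROTO_CACHE)
protos = get_prototypes(lambda: {c: torch.zeros(128) for c in range(13)})
print(len(protos))  # 13
```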

The device I use is an A800 80G, and the novel data I set is 10-shot with 13 classes. When the model is loaded with the PCB module, GPU memory usage reaches 53G; before the modification, even 80G was not enough.

cnjhh avatar Apr 19 '24 08:04 cnjhh