
[BUG] Memory leak when using `add_coco_labels` for instance segmentation with coco_id_field set

Open h-fernand opened this issue 9 months ago • 3 comments

Describe the problem

When I add COCO-format instance segmentation predictions to my dataset with add_coco_labels, the program's RAM usage climbs rapidly until it runs out of memory and crashes. This only happens when I set coco_id_field to coco_id so that the annotations are matched to the correct samples. If I omit coco_id_field and let the function use its default behavior, the annotations get mismatched, but the program uses far less RAM and does finish running. The same memory growth occurs if I pass add_coco_labels a view containing only the test split instead of the whole dataset.
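Since the report hinges on matching predictions to samples by COCO ID, a quick sanity check of how the two sets of IDs overlap may help when triaging. The sketch below is only illustrative: `summarize_id_overlap` is a hypothetical helper, not part of FiftyOne, and it assumes each prediction is a COCO-style dict with an `image_id` key.

```python
def summarize_id_overlap(predictions, dataset_coco_ids):
    """Compare image IDs used by the predictions with COCO IDs stored on samples.

    `predictions` is assumed to be a list of COCO-style annotation dicts,
    and `dataset_coco_ids` an iterable of the IDs stored on the dataset.
    """
    pred_ids = {ann["image_id"] for ann in predictions}
    known_ids = set(dataset_coco_ids)
    return {
        # Image IDs that appear in both the predictions and the dataset
        "matched": len(pred_ids & known_ids),
        # Image IDs the predictions refer to that no sample carries
        "missing_from_dataset": sorted(pred_ids - known_ids),
        # Samples that would receive no predictions at all
        "samples_without_predictions": len(known_ids - pred_ids),
    }
```

With the reproduction script's variables, this could be called as `summarize_id_overlap(predictions, combined_dataset.values("coco_id"))`, assuming the `coco_id` field was populated by `include_id=True` at import time.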

Code to reproduce issue

import json

import fiftyone as fo
import fiftyone.utils.coco as fouc

dataset_name = "dataset"
splits = ['train', 'val', 'test']
dataset_root = '/path/to/dataset/root'
annotations_dir = 'annotations'
annfile_template = 'instances_{split}.json'

predictions_file = '/path/to/predictions/file.json'

combined_dataset = fo.Dataset(name=dataset_name, persistent=True)

for split in splits:
    print(f"Loading: {split} dataset")

    annfile = f"{dataset_root}/{annotations_dir}/{annfile_template.format(split=split)}"
    data_path = f"{dataset_root}/{split}"
    split_dataset_name = f"ground_truth_{split}"

    split_dataset = fo.Dataset.from_dir(
        data_path=data_path,
        labels_path=annfile,
        dataset_type=fo.types.COCODetectionDataset,
        name=split_dataset_name,
        include_id=True,
        persistent=True
    )
    split_dataset.tag_samples(split)
    combined_dataset.merge_samples(split_dataset)

with open(predictions_file, 'r') as f:
    prediction_data = json.load(f)

predictions = prediction_data['annotations']
classes = prediction_data['categories']
classes = [x['name'] for x in classes]

fouc.add_coco_labels(
    combined_dataset,
    "predictions",
    predictions,
    classes,
    label_type="segmentations",
    coco_id_field="coco_id",
)
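To narrow down where the memory goes, one option is to feed the predictions in smaller chunks and record peak memory per call. The following is a diagnostic sketch, not a fix: `chunk_by_image` and `peak_memory_mib` are hypothetical helpers, and `tracemalloc` only tracks Python-level allocations in the current process, so it may under-report memory held by native code or by the database service.

```python
import tracemalloc
from collections import defaultdict
from itertools import islice


def chunk_by_image(annotations, images_per_chunk):
    """Yield lists of COCO annotation dicts, keeping each image's annotations together."""
    by_image = defaultdict(list)
    for ann in annotations:
        by_image[ann["image_id"]].append(ann)

    image_ids = iter(by_image)
    while True:
        # Take the next group of image IDs; stop when none are left
        batch_ids = list(islice(image_ids, images_per_chunk))
        if not batch_ids:
            return
        yield [ann for image_id in batch_ids for ann in by_image[image_id]]


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak traced Python memory in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / 2**20


# Possible usage with the script above (assumes each call only touches the
# samples whose IDs appear in the chunk; the chunk size of 100 is arbitrary):
#
# for chunk in chunk_by_image(predictions, 100):
#     _, peak = peak_memory_mib(
#         fouc.add_coco_labels, combined_dataset, "predictions", chunk, classes,
#         label_type="segmentations", coco_id_field="coco_id",
#     )
#     print(f"{len(chunk)} annotations, peak {peak:.1f} MiB")
```

If peak memory scales with chunk size only when coco_id_field is set, that would point at the ID-matching path rather than at mask decoding in general.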

System information

  • OS Platform and Distribution: Linux Ubuntu 22.04
  • Python version: Python 3.10.12
  • FiftyOne version (fiftyone --version): v0.23.8
  • FiftyOne installed from (pip or source): pip

Willingness to contribute

The FiftyOne Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the FiftyOne codebase?

  • [ ] Yes. I can contribute a fix for this bug independently
  • [x] Yes. I would be willing to contribute a fix for this bug with guidance from the FiftyOne community
  • [ ] No. I cannot contribute a bug fix at this time

h-fernand, May 22 '24 13:05