label-studio-converter

feat: LSDV-4831: Export BrushLabels to COCO

Open cdpath opened this issue 2 years ago • 16 comments

cdpath avatar Dec 19 '22 07:12 cdpath

@cdpath I wanted to check in on this and see if you've had a chance to work on the patch update.

hogepodge avatar Jan 10 '23 23:01 hogepodge

Kinda busy at work recently. Will update soon.

cdpath avatar Jan 16 '23 16:01 cdpath

@cdpath hi! do you have any updates?

makseq avatar Feb 08 '23 02:02 makseq

@hogepodge please keep tracking this PR.

makseq avatar Feb 08 '23 02:02 makseq

@cdpath we're trying to make the process for merging community feature requests easier. One thing that would help me a lot in moving this forward would be what we call "acceptance criteria." Essentially, when we hand this off to QA to determine if we can merge it, what is the expected behavior that we can test?

This is a much-requested feature, and we're very grateful for the patch. I want to help us move this along as best as I can.

hogepodge avatar Feb 23 '23 15:02 hogepodge

@hogepodge Sorry for the delay. I just pushed a small update that adds a workaround for when pycocotools is not available.

cdpath avatar Feb 24 '23 03:02 cdpath
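A workaround of this kind is often written as a guarded import. The sketch below shows the general pattern only and is not the code from this PR: the flag name `pycocotools_imported` is taken from the converter.py snippet later in this thread, while `try_import` is a hypothetical helper introduced for illustration.

```python
import importlib


def try_import(module_name):
    """Return (module, True) if the import succeeds, else (None, False)."""
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        # Covers ModuleNotFoundError too, e.g. when pycocotools is not installed.
        return None, False


# pycocotools is treated as optional: callers check the flag before using it.
pycocotools_mask, pycocotools_imported = try_import("pycocotools.mask")
```

Code that needs the library can then branch on `pycocotools_imported` instead of failing at import time.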

Codecov Report

No coverage uploaded for pull request base (master@fc5eb78). Patch has no changes to coverable lines.

Additional details and impacted files
@@            Coverage Diff            @@
##             master     #175   +/-   ##
=========================================
  Coverage          ?   45.93%           
=========================================
  Files             ?       21           
  Lines             ?     1822           
  Branches          ?        0           
=========================================
  Hits              ?      837           
  Misses            ?      985           
  Partials          ?        0           

codecov-commenter avatar Mar 22 '23 04:03 codecov-commenter

@cdpath Our QA team tried to set up pycocotools on Windows and ran into this problem: image

What are your thoughts here? Any ideas? Maybe we can make pycocotools an optional dependency? (something like pip install label-studio-converter[pycocotools])

makseq avatar Mar 28 '23 22:03 makseq

Yeah, that's an option. Another approach may be: create another fork of pycocotools, which includes wheels for Windows

cdpath avatar Mar 29 '23 10:03 cdpath

Do you know how to make it an extra (optional) package in pip? We have no bandwidth to support forks of pycocotools.

makseq avatar Mar 29 '23 16:03 makseq
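For reference, setuptools declares optional dependencies through `extras_require`. The fragment below is a minimal sketch, not the change that was made in this PR: the extra name `pycocotools` mirrors the install command suggested above, and `EXTRAS_REQUIRE` and `SETUP_KWARGS` are illustrative names.

```python
# Hypothetical fragment of setup.py; names other than the extra itself are
# illustrative and not taken from the repository.
EXTRAS_REQUIRE = {
    # Installed only when the user asks for label-studio-converter[pycocotools]
    "pycocotools": ["pycocotools"],
}

SETUP_KWARGS = {
    "name": "label-studio-converter",
    "extras_require": EXTRAS_REQUIRE,
}

# In a real setup.py these would be passed on as:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

With such a declaration, `pip install "label-studio-converter[pycocotools]"` would pull in the extra, while a plain install would skip it.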

Any updates on this feature?

kriap139 avatar Apr 08 '23 14:04 kriap139

@makseq I've added an extra package, but I'm not certain whether I've updated _get_supported_formats correctly.

cdpath avatar Apr 13 '23 02:04 cdpath

I've looked through the patch, and assuming that we've resolved the Windows issue by making it an optional install, I'd like to move forward with merging this.

hogepodge avatar May 10 '23 16:05 hogepodge

Hello,

I'm using the ML backend with the SAM integration, and I need to export BrushLabels, RectangleLabels, and KeyPointLabels to COCO as well. So I took your PR code and made a few small changes, only in converter.py and brush.py.

This is working well for me. There's still a problem when an annotation has multiple label types, for instance when the user labels things with BrushLabels and PolygonLabels at the same time...

Here's my code:

converter.py

...
    def convert_to_coco(
        self, input_data, output_dir, output_image_dir=None, is_dir=True
    ):
        def add_image(images, width, height, image_id, image_path):
            images.append(
                {
                    'width': width,
                    'height': height,
                    'id': image_id,
                    'file_name': image_path,
                }
            )
            return images

        self._check_format(Format.COCO)
        ensure_dir(output_dir)
        output_file = os.path.join(output_dir, 'result.json')
        if output_image_dir is not None:
            ensure_dir(output_image_dir)
        else:
            output_image_dir = os.path.join(output_dir, 'images')
            os.makedirs(output_image_dir, exist_ok=True)
        images, categories, annotations = [], [], []
        categories, category_name_to_id = self._get_labels()
        data_key = self._data_keys[0]
        item_iterator = (
            self.iter_from_dir(input_data)
            if is_dir
            else self.iter_from_json_file(input_data)
        )
        for item_idx, item in enumerate(item_iterator):
            image_path = item['input'][data_key]
            image_id = len(images)
            width = None
            height = None
            # download all images of the dataset, including the ones without annotations
            if not os.path.exists(image_path):
                try:
                    image_path = download(
                        image_path,
                        output_image_dir,
                        project_dir=self.project_dir,
                        return_relative_path=True,
                        upload_dir=self.upload_dir,
                        download_resources=self.download_resources,
                    )
                except Exception:
                    logger.info(
                        'Unable to download {image_path}. The image of {item} will be skipped'.format(
                            image_path=image_path, item=item
                        ),
                        exc_info=True,
                    )
            # add image to final images list
            try:
                with Image.open(os.path.join(output_dir, image_path)) as img:
                    width, height = img.size
                images = add_image(images, width, height, image_id, image_path)
            except Exception:
                logger.info(
                    "Unable to open {image_path}, can't extract width and height for COCO export".format(
                        image_path=image_path, item=item
                    ),
                    exc_info=True,
                )

            # skip tasks without annotations
            if not item['output']:
                # image wasn't loaded and there are no labels
                if not width:
                    images = add_image(images, width, height, image_id, image_path)

                logger.warning('No annotations found for item #' + str(item_idx))
                continue

            # concatenate results over all tag names
            labels = []
            for key in item['output']:
                labels += item['output'][key]

            if len(labels) == 0:
                logger.debug(f'Empty bboxes for {item["output"]}')
                continue

            for label in labels:
                category_name = None
                for key in [
                    'rectanglelabels',
                    'polygonlabels',
                    'brushlabels',
                    'keypointlabels',
                    'labels',
                ]:
                    if key in label and len(label[key]) > 0:
                        category_name = label[key][0]
                        break

                if category_name is None:
                    logger.warning("Unknown label type or labels are empty")
                    continue

                if not height or not width:
                    if 'original_width' not in label or 'original_height' not in label:
                        logger.debug(
                            f'original_width or original_height not found in {image_path}'
                        )
                        continue

                    width, height = label['original_width'], label['original_height']
                    images = add_image(images, width, height, image_id, image_path)

                category_id = category_name_to_id[category_name]

                annotation_id = len(annotations)

                if "polygonlabels" in label:
                    if "points" not in label:
                        # without points the polygon cannot be converted; skip it
                        logger.warning(label)
                        continue
                    points_abs = [
                        (x / 100 * width, y / 100 * height) for x, y in label["points"]
                    ]
                    x, y = zip(*points_abs)

                    annotations.append(
                        {
                            'id': annotation_id,
                            'image_id': image_id,
                            'category_id': category_id,
                            'segmentation': [
                                [coord for point in points_abs for coord in point]
                            ],
                            'bbox': get_polygon_bounding_box(x, y),
                            'ignore': 0,
                            'iscrowd': 0,
                            'area': get_polygon_area(x, y),
                        }
                    )
                elif 'brushlabels' in label and brush.pycocotools_imported:
                    if "rle" not in label:
                        # without an RLE mask the brush region cannot be converted; skip it
                        logger.warning(label)
                        continue
                    coco_rle = brush.ls_rle_to_coco_rle(label["rle"], height, width)
                    segmentation = brush.ls_rle_to_polygon(label["rle"], height, width)
                    bbox = brush.get_cocomask_bounding_box(coco_rle)
                    area = brush.get_cocomask_area(coco_rle)
                    annotations.append(
                        {
                            "id": annotation_id,
                            "image_id": image_id,
                            "category_id": category_id,
                            "segmentation": segmentation,
                            "bbox": bbox,
                            'ignore': 0,
                            "iscrowd": 0,
                            "area": area,
                        }
                    )
                elif 'rectanglelabels' in label or 'keypointlabels' in label:
                    # rectangle and keypoint results share this branch; the code
                    # assumes both carry an "rle" mask
                    if "rle" not in label:
                        logger.warning(label)
                        continue
                    coco_rle = brush.ls_rle_to_coco_rle(label["rle"], height, width)
                    segmentation = brush.ls_rle_to_polygon(label["rle"], height, width)
                    bbox = brush.get_cocomask_bounding_box(coco_rle)
                    area = brush.get_cocomask_area(coco_rle)
                    annotations.append(
                        {
                            'id': annotation_id,
                            'image_id': image_id,
                            'category_id': category_id,
                            'segmentation': segmentation,
                            'bbox': bbox,
                            'ignore': 0,
                            'iscrowd': 0,
                            'area': area,
                        }
                    )
                else:
                    raise ValueError("Unknown label type")

                if os.getenv('LABEL_STUDIO_FORCE_ANNOTATOR_EXPORT'):
                    annotations[-1].update({'annotator': get_annotator(item)})

        with io.open(output_file, mode='w', encoding='utf8') as fout:
            json.dump(
                {
                    'images': images,
                    'categories': categories,
                    'annotations': annotations,
                    'info': {
                        'year': datetime.now().year,
                        'version': '1.0',
                        'description': '',
                        'contributor': 'Label Studio',
                        'url': '',
                        'date_created': str(datetime.now()),
                    },
                },
                fout,
                indent=2,
            )
...
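The polygon branch above relies on `get_polygon_bounding_box` and `get_polygon_area`, whose bodies are not shown in this thread. As a rough, standalone illustration of what that step computes, the sketch below converts percent coordinates to pixels and derives the COCO `bbox` and `area` with the shoelace formula. `polygon_to_coco` is a hypothetical name, and the library helpers may be implemented differently.

```python
def polygon_to_coco(points_pct, width, height):
    """Convert polygon points given in percent (0-100) to COCO-style fields."""
    # Percent of image size -> absolute pixel coordinates.
    points_abs = [(x / 100 * width, y / 100 * height) for x, y in points_pct]
    xs, ys = zip(*points_abs)

    # COCO bbox is [x_min, y_min, width, height] in pixels.
    bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]

    # Shoelace formula for the area of a simple (non self-intersecting) polygon.
    n = len(points_abs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    area = abs(twice_area) / 2

    # COCO polygons are flat lists: [x1, y1, x2, y2, ...], one list per polygon.
    segmentation = [[coord for point in points_abs for coord in point]]
    return {"segmentation": segmentation, "bbox": bbox, "area": area}
```

For example, a polygon covering the top-left quarter of the width and half of the height of a 200x100 image would be passed as `polygon_to_coco([(0, 0), (50, 0), (50, 50), (0, 50)], 200, 100)`.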

brush.py

...
def ls_rle_to_coco_rle(ls_rle, height, width):
    """from LS rle to compressed coco rle"""
    ls_mask = decode_rle(ls_rle)
    ls_mask = np.reshape(ls_mask, [height, width, 4])[:, :, 3]
    ls_mask = np.where(ls_mask > 0, 1, 0)
    binary_mask = np.asfortranarray(ls_mask)
    coco_rle = binary_mask_to_rle(binary_mask)
    result = pycocotools.mask.frPyObjects(coco_rle, *coco_rle.get('size'))
    result["counts"] = result["counts"].decode()
    return result

def ls_rle_to_polygon(ls_rle, height, width):
    """from LS rle to polygons"""
    ls_mask = decode_rle(ls_rle)
    ls_mask = np.reshape(ls_mask, [height, width, 4])[:, :, 3]
    ls_mask = np.where(ls_mask > 0, 1, 0)

    # Find contours from the binary mask
    contours = measure.find_contours(ls_mask, 0.5)
    segmentation = []

    for contour in contours:
        # Flip dimensions then ravel and cast to list
        contour = np.flip(contour, axis=1)
        contour = contour.ravel().tolist()
        segmentation.append(contour)
    return segmentation
...
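The `binary_mask_to_rle` helper called in `ls_rle_to_coco_rle` is not shown in the snippet. As a hedged illustration of the intermediate format, the sketch below builds an uncompressed COCO-style RLE from a binary mask in plain Python and derives a bounding box and area directly from the mask. Both function names are hypothetical, and this is not a replacement for the pycocotools calls used above.

```python
from itertools import groupby


def binary_mask_to_uncompressed_rle(mask):
    """Encode a binary mask (list of rows of 0/1) as uncompressed COCO-style RLE."""
    height, width = len(mask), len(mask[0])
    # COCO RLE walks the mask in column-major order.
    column_major = [mask[y][x] for x in range(width) for y in range(height)]
    counts = []
    for i, (value, run) in enumerate(groupby(column_major)):
        # Counts always start with the number of zeros, which may be 0.
        if i == 0 and value == 1:
            counts.append(0)
        counts.append(sum(1 for _ in run))
    return {"size": [height, width], "counts": counts}


def mask_bbox_and_area(mask):
    """Return ([x, y, w, h], area) computed from the foreground pixels."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return [0, 0, 0, 0], 0
    xs, ys = zip(*coords)
    bbox = [min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1]
    return bbox, len(coords)
```

A dictionary of this shape (`size` plus a list of integer `counts`) is the kind of input the snippet passes to `pycocotools.mask.frPyObjects` to obtain the compressed form.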

There is still the issue when an annotation has multiple labels... There is no direct way to find them using the filter section.

Therefore I used:

Filter -> annotationResults contains {label}

to find problematic annotations...

ODAncona avatar Jun 30 '23 14:06 ODAncona
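An alternative to filtering in the UI is to scan an exported JSON file for annotations that mix region types. The sketch below assumes the usual Label Studio JSON export layout (a list of tasks, each with `annotations`, each annotation with a `result` list whose items have a `type`); `find_mixed_type_tasks` is a hypothetical helper and not part of the converter.

```python
def find_mixed_type_tasks(tasks):
    """Return ids of tasks that have an annotation mixing several region types."""
    mixed = []
    for task in tasks:
        for annotation in task.get("annotations", []):
            # Collect the distinct result types, e.g. "brushlabels", "polygonlabels".
            types = {r.get("type") for r in annotation.get("result", [])}
            types.discard(None)
            if len(types) > 1:
                mixed.append(task.get("id"))
                break  # one offending annotation is enough to flag the task
    return mixed
```

The input would typically come from `json.load` on the exported file; tasks returned here are the ones likely to hit the multi-label problem described above.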

@hogepodge let's try to take into account the last comment: https://github.com/heartexlabs/label-studio-converter/pull/175#issuecomment-1614720231

Let's talk with @nehalecky about how we can add these changes and eventually deliver this PR.

makseq avatar Jul 17 '23 15:07 makseq

Any updates on this feature?

Mat198 avatar May 15 '24 12:05 Mat198

After careful consideration, we’ve determined that this is more of an improvement than a critical bug. Additionally, it seems to be an outdated request and hasn’t garnered much interest from the community. For these reasons, we will be closing this issue. We will continue developing the converter library as a part of Label Studio SDK.

We appreciate your understanding and encourage you to submit your feedback, questions and suggestions here: https://github.com/HumanSignal/label-studio-sdk/issues

makseq avatar Jun 04 '24 14:06 makseq