
How to convert from COCO instance segmentation format to YOLOv5 instance segmentation format without Roboflow?

Open ichsan2895 opened this issue 2 years ago • 5 comments

Search before asking

  • [X] I have searched the YOLOv5 issues and discussions and found no similar questions.

Question

Hello, is it possible to convert a custom COCO instance segmentation dataset to a YOLOv5 instance segmentation dataset (without Roboflow), or maybe to create one from scratch?

I already checked the Train On Custom Data tutorial and the Format of YOLO annotations tutorial.

Most tutorials only describe the BBOX format and don't explain how to convert COCO to YOLO.

But I can't find any tutorial for converting COCO to YOLOv5 without Roboflow.

Can somebody help me?

Thanks for sharing

Additional

No response

ichsan2895 avatar Dec 29 '22 03:12 ichsan2895

👋 Hello @ichsan2895, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a ๐Ÿ› Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email [email protected].

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

github-actions[bot] avatar Dec 29 '22 03:12 github-actions[bot]

You can code it yourself in Python; just keep in mind that in COCO the box origin is the top-left corner, while YOLO uses the box center.
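As a minimal sketch of that coordinate difference (the helper name `coco_bbox_to_yolo` is illustrative, not from this thread):

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    # COCO: [x_min, y_min, width, height] in pixels, origin at top-left.
    # YOLO: [x_center, y_center, width, height], normalized to 0..1.
    x_min, y_min, bw, bh = bbox
    return [(x_min + bw / 2) / img_w,
            (y_min + bh / 2) / img_h,
            bw / img_w,
            bh / img_h]
```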

ExtReMLapin avatar Dec 29 '22 07:12 ExtReMLapin

Using this script, you can convert the COCO segmentation format to the YOLO segmentation format. https://github.com/ultralytics/JSON2YOLO

RectLabel is an offline image annotation tool for object detection and segmentation. Although this is not an open source program, with RectLabel you can import the COCO segmentation format and export to the YOLO segmentation format. https://rectlabel.com/help#xml_to_yolo

class_index x1 y1 x2 y2 x3 y3 ...
0 0.180027 0.287930 0.181324 0.280698 0.183726 0.270573 ...
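A line in that format can be produced from a COCO pixel polygon by normalizing x values by image width and y values by image height; a minimal sketch (the function name is hypothetical):

```python
def coco_polygon_to_yolo_line(class_index, polygon, img_w, img_h):
    # polygon is a flat COCO list [x1, y1, x2, y2, ...] in pixels;
    # normalize x by image width and y by image height.
    coords = [v / img_w if k % 2 == 0 else v / img_h
              for k, v in enumerate(polygon)]
    return ' '.join([str(class_index)] + ['%.6f' % c for c in coords])
```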


ryouchinsa avatar Dec 29 '22 10:12 ryouchinsa

Thanks, I will check out the JSON2YOLO script. I will report back if I see any trouble.

ichsan2895 avatar Jan 03 '23 03:01 ichsan2895

Did you see any trouble? @ichsan2895

Edohvin avatar Jan 09 '23 13:01 Edohvin

Sorry for the slow response.

Yes, JSON2YOLO (https://github.com/ultralytics/JSON2YOLO) failed to work on my computer.

The log looks successful, but the label/annotation txt files were not found. I'm not sure what happened.

The COCO dataset was made with the labelme annotator, so its directory layout is:

COCO_Project
|-> JPEGImages\
     |-> img_01.jpg
     |-> img_02.jpg
|-> annotations.txt

Fortunately, after one week of debugging, I created a Jupyter notebook that mixes code from JSON2YOLO and Stack Overflow to convert COCO to YOLO. You can download it here: https://drive.google.com/file/d/1xhBiWv_Y0HBZQHoWBwF7yjpRrDZhrk4f/view?usp=sharing

Just change the last cell to the desired output_path and json_file path. If you want bbox annotations, add use_segment=False. Then run all cells from start to finish.

# the annotations will be formatted as bboxes, ideal for an object detection task
convert_coco_json_to_yolo_txt("yolo_from_Project_1st", "COCO_Project_1st/annotations.json", use_segment=False)

# the annotations will be formatted as polygons, ideal for an instance segmentation task
convert_coco_json_to_yolo_txt("yolo_from_Project_1st", "COCO_Project_1st/annotations.json", use_segment=True)
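For readers without access to the notebook, a minimal self-contained sketch of such a converter might look like this. This is an illustrative re-implementation, not the notebook's actual code (the name `coco_json_to_yolo_txt` is chosen here to make that explicit); it handles only the first polygon per annotation and assumes a standard COCO JSON with `images`, `categories`, and `annotations` keys:

```python
import json
import os

def coco_json_to_yolo_txt(output_path, json_file, use_segment=True):
    # One .txt file per image, one line per annotation,
    # all coordinates normalized to 0..1.
    os.makedirs(output_path, exist_ok=True)
    with open(json_file) as f:
        coco = json.load(f)
    images = {img['id']: img for img in coco['images']}
    # COCO category ids may start at 1 and be sparse; remap to 0-based.
    cat_map = {c['id']: i for i, c in enumerate(coco['categories'])}
    for ann in coco['annotations']:
        img = images[ann['image_id']]
        w, h = img['width'], img['height']
        if use_segment:
            seg = ann['segmentation'][0]  # first polygon only, for brevity
            coords = [seg[k] / w if k % 2 == 0 else seg[k] / h
                      for k in range(len(seg))]
        else:
            # COCO bbox is [x_min, y_min, width, height]; YOLO wants
            # [x_center, y_center, width, height].
            x, y, bw, bh = ann['bbox']
            coords = [(x + bw / 2) / w, (y + bh / 2) / h, bw / w, bh / h]
        line = ' '.join([str(cat_map[ann['category_id']])]
                        + ['%.6f' % c for c in coords])
        stem = os.path.splitext(os.path.basename(img['file_name']))[0]
        with open(os.path.join(output_path, stem + '.txt'), 'a') as out:
            out.write(line + '\n')
```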

ichsan2895 avatar Jan 13 '23 18:01 ichsan2895

The Python notebook worked perfectly for me, thank you!

iagorrr avatar Jan 18 '23 21:01 iagorrr

Can you share a sample COCO JSON file? This code didn't work:

---> 85     line = *(segments[last_iter] if use_segments else bboxes[last_iter]),  # cls, box or segments
     86     f.write(('%g ' * len(line)).rstrip() % line + '\n')
     87 print("that images contains class:",len(bboxes),"objects")

IndexError: list index out of range

kadirnar avatar Mar 25 '23 20:03 kadirnar

Sure, I made this sample COCO JSON with the labelme library.

Please take a look: typical coco json to yolo segmentation.zip

ichsan2895 avatar Mar 26 '23 12:03 ichsan2895

Why are there negative values?

kadirnar avatar Mar 28 '23 10:03 kadirnar

Can you share your entire COCO dataset (images + coco_annot.json)? That has never happened to me (negative values). For good results, I recommend creating the COCO annotations with labelme, then converting them to YOLO format with my notebook.

ichsan2895 avatar Mar 31 '23 15:03 ichsan2895

Hi @ichsan2895, you can do it in just a couple of clicks using apps from the Supervisely ecosystem:

  1. First, upload your COCO format data to Supervisely using the Import COCO app. You can also upload data in another format with one of the import applications.

  2. Next, export the data from Supervisely in the YOLO v5/v8 format:

    • For polygons and masks (without internal cutouts), use the "Export to YOLOv8" app;
    class x1 y1 x2 y2 x3 y3 ...
    0 0.100417 0.654604 0.089646 0.662646 0.087561 0.666667 ...
    
    class x_center y_center width height
    0 0.16713 0.787696 0.207783 0.287495
    

I'm sure there are many apps in the Supervisely ecosystem that can help with your tasks.
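The two label-line layouts shown above (polygon vs. bbox) can be told apart by field count; here is a small illustrative parser, an assumption-laden sketch not tied to Supervisely or YOLOv5 code (a 4-value line is ambiguous in principle and is treated as a bbox here):

```python
def parse_yolo_label_line(line):
    # A bbox line has class + 4 values (x_center, y_center, width, height);
    # a segmentation line has class + an even number (>= 6) of values
    # forming x/y pairs of a polygon.
    parts = line.split()
    cls, vals = int(parts[0]), [float(v) for v in parts[1:]]
    if len(vals) == 4:
        return {'class': cls, 'type': 'bbox', 'values': vals}
    if len(vals) >= 6 and len(vals) % 2 == 0:
        return {'class': cls, 'type': 'polygon',
                'values': list(zip(vals[::2], vals[1::2]))}
    raise ValueError('unrecognized YOLO label line')
```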

almazgimaev avatar May 08 '23 06:05 almazgimaev

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

  • Docs: https://docs.ultralytics.com
  • HUB: https://hub.ultralytics.com
  • Community: https://community.ultralytics.com

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

github-actions[bot] avatar Jun 08 '23 00:06 github-actions[bot]

Hi ๐Ÿ‘‹๐Ÿป I'm probably late to the party, but you can convert between formats with supervision.

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path='...',
    annotations_path='...',
    force_masks=True
).as_yolo(
    images_directory_path='...',
    annotations_directory_path='...',
    data_yaml_path='...'
)

SkalskiP avatar Jul 24 '23 22:07 SkalskiP

@SkalskiP thanks for sharing your solution! We appreciate your input and contribution to the YOLOv5 community. Your code snippet using the supervision library seems like a handy tool for converting between formats. It's great to see different approaches to tackle the problem. Keep up the good work!

glenn-jocher avatar Jul 24 '23 22:07 glenn-jocher

Hi ๐Ÿ‘‹๐Ÿป I'm probably late to the party, but you can convert between formats with supervision.

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path='...',
    annotations_path='...',
    force_masks=True
).as_yolo(
    images_directory_path='...',
    annotations_directory_path='...',
    data_yaml_path='...'
)

Does it support segmentation?

lonngxiang avatar Nov 21 '23 11:11 lonngxiang

@lonngxiang yes, the supervision utility provides support for converting segmentation annotations in addition to detection annotations. You can use the force_masks=True argument in the from_coco method to ensure that masks are enforced during the conversion process. This enables seamless conversion between different annotation formats, allowing you to work with segmentation annotations as well.

glenn-jocher avatar Nov 21 '23 14:11 glenn-jocher

We updated our general_json2yolo.py script so that RLE masks with holes can be converted to the YOLO segmentation format correctly. https://github.com/ultralytics/ultralytics/issues/917#issuecomment-1821375321

ryouchinsa avatar Nov 21 '23 17:11 ryouchinsa

@ryouchinsa thank you for sharing the update to the general_json2yolo.py script! 💻 It's great to see the community working together to improve the conversion process for RLE masks with holes to the YOLO segmentation format. Your contribution will definitely benefit others who are facing similar challenges. Keep up the great work! If you have any further improvements or insights, feel free to share them.

glenn-jocher avatar Nov 21 '23 18:11 glenn-jocher

Thanks. I tried this dataset, which has 1 label: https://universe.roboflow.com/naumov-igor-segmentation/car-segmetarion

But when I use this COCO-to-YOLO script, I get 2 labels, and I don't know why:

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path= r"C:\Users\loong\Downloads\Car\valid",
    annotations_path=r"C:\Users\loong\Downloads\Car\valid\_annotations.coco.json",
    force_masks=True
).as_yolo(
    images_directory_path=r"C:\Users\loong\Downloads\Car_yolo\val\images",
    annotations_directory_path=r"C:\Users\loong\Downloads\Car_yolo\val\labels",
    data_yaml_path=r"C:\Users\loong\Downloads\Car_yolo\data.yaml"
)

lonngxiang avatar Nov 22 '23 02:11 lonngxiang

@lonngxiang it looks like the issue might be related to the conversion process. One possibility is that the COCO dataset includes multiple categories, leading to the creation of multiple labels during the conversion. You may want to review the original COCO annotations and ensure that only the desired category (in this case, "car") is included in the annotations. Double-checking the original COCO annotations to ensure that only the "car" category is present could help resolve the issue.

Additionally, you might want to inspect the annotations.coco.json file to confirm the structure and contents of the annotations. This can help identify any unexpected data that might be causing the extra labels to appear during the conversion process.

Feel free to reach out if you have further questions or need additional assistance!

glenn-jocher avatar Nov 22 '23 02:11 glenn-jocher

Thanks, but how do I use this script with this downloaded dataset: https://universe.roboflow.com/naumov-igor-segmentation/car-segmetarion

lonngxiang avatar Nov 22 '23 05:11 lonngxiang

@lonngxiang i understand your question, but as an open-source contributor, I am unable to guide you on using specific third-party datasets, such as the one from Roboflow, as I am not associated with them. I recommend referencing the documentation or support resources provided by Roboflow for guidance on using their datasets with the supervision library for conversion. If you encounter any specific issues related to YOLOv5 or general conversion processes, I am here to assist. Additionally, feel free to consult the YOLOv5 documentation for further insights on dataset conversion.

glenn-jocher avatar Nov 22 '23 05:11 glenn-jocher

Thanks. I used the Supervision library to convert from the COCO segmentation format to the YOLO format. However, when I ran the Ultralytics training command, the results were not as expected.

yolo segment train model=yolov8m-seg.yaml data=/mnt/data/loong/segmetarion/Car_yolo/data.yaml epochs=100

lonngxiang avatar Nov 22 '23 05:11 lonngxiang

Hi @lonngxiang ๐Ÿ‘‹๐Ÿป, I'm the creator of Supervision. Have you been able to solve your conversion problem?

SkalskiP avatar Nov 22 '23 08:11 SkalskiP

Hi @lonngxiang ๐Ÿ‘‹๐Ÿป, I'm the creator of Supervision. Have you been able to solve your conversion problem?

Yes, but sv.DetectionDataset.from_coco().as_yolo() did not work for the YOLO segment format; I finally fixed it by using this method: https://github.com/ultralytics/JSON2YOLO/blob/master/general_json2yolo.py

lonngxiang avatar Nov 26 '23 02:11 lonngxiang

@lonngxiang it's great to hear that you found a solution! If you have any other questions or encounter more issues in the future, feel free to ask. We're here to help. Good luck with your project!

glenn-jocher avatar Nov 26 '23 10:11 glenn-jocher

Hi @SkalskiP, Can supervision convert multiple polygons in the COCO format to YOLO segmentation format?

"annotations": [
{
    "area": 594425,
    "bbox": [328, 834, 780, 2250],
    "category_id": 1,
    "id": 1,
    "image_id": 1,
    "iscrowd": 0,
    "segmentation": [
        [495, 987, 497, 984, 501, 983, 500, 978, 498, 962, 503, 937, 503, 926, 532, 877, 569, 849, 620, 834, 701, 838, 767, 860, 790, 931, 803, 963, 802, 972, 846, 970, 896, 969, 896, 977, 875, 982, 847, 984, 793, 987, 791, 1001, 783, 1009, 785, 1022, 791, 1024, 787, 1027, 795, 1041, 804, 1059, 811, 1072, 810, 1081, 800, 1089, 788, 1092, 783, 1098, 784, 1115, 780, 1120, 774, 1123, 778, 1126, 778, 1136, 775, 1140, 767, 1140, 763, 1146, 767, 1164, 754, 1181, 759, 1212, 751, 1264, 815, 1283, 839, 1303, 865, 1362, 880, 1442, 902, 1525, 930, 1602, 953, 1640, 996, 1699, 1021, 1773, 1039, 1863, 1060, 1920, 1073, 1963, 1089, 1982, 1102, 2013, 1107, 2037, 1107, 2043, 1099, 2046, 1097, 2094, 1089, 2123, 1074, 2137, 1066, 2153, 1033, 2172, 1024, 2166, 1024, 2166, 1023, 2129, 1019, 2093, 1004, 2057, 996, 2016, 1000, 1979, 903, 1814, 860, 1727, 820, 1647, 772, 1547, 695, 1637, 625, 1736, 556, 1854, 495, 1986, 459, 2110, 446, 1998, 449, 1913, 401, 1819, 362, 1720, 342, 1575, 328, 1440, 335, 1382, 348, 1330, 366, 1294, 422, 1248, 437, 1222, 450, 1190, 466, 1147, 482, 1107, 495, 1076, 506, 1019, 497, 1016],
        [878, 2293, 868, 2335, 855, 2372, 843, 2413, 838, 2445, 820, 2497, 806, 2556, 805, 2589, 809, 2622, 810, 2663, 807, 2704, 793, 2785, 772, 2866, 742, 2956, 725, 3000, 724, 3013, 740, 3024, 757, 3029, 778, 3033, 795, 3033, 812, 3032, 812, 3046, 803, 3052, 791, 3063, 771, 3069, 745, 3070, 733, 3074, 719, 3077, 702, 3075, 680, 3083, 664, 3082, 631, 3072, 601, 3061, 558, 3058, 553, 3039, 558, 3023, 566, 3001, 568, 2983, 566, 2960, 572, 2912, 571, 2859, 567, 2781, 572, 2698, 576, 2643, 583, 2613, 604, 2568, 628, 2527, 637, 2500, 636, 2468, 629, 2445, 621, 2423, 673, 2409, 726, 2388, 807, 2344, 878, 2293]
    ]
}],

ryouchinsa avatar Nov 30 '23 10:11 ryouchinsa

@ryouchinsa I analyzed the json2yolo conversion code and am curious about the principle. Why is the loop split into k=0 and k=1 passes? In the end, the first and last points of each unit polygon are the same, so we can check which instance it is, but can you tell me why we need to go forward and then backward over a unit polygon?

for k in range(2):
    # forward connection
    if k == 0:
        # idx_list: [[5], [12, 0], [7]]
        for i, idx in enumerate(idx_list):
            # Middle segments have two indexes; reverse the index order
            # of a middle segment if needed. Segments other than the
            # first and last each carry two indexes:
            # idx_list = [[p], [p, q], [p, q], ..., [q]]
            if len(idx) == 2 and idx[0] > idx[1]:
                idx = idx[::-1]
                # segments[i]: (N, 2)
                segments[i] = segments[i][::-1, :]

            segments[i] = np.roll(segments[i], -idx[0], axis=0)
            segments[i] = np.concatenate([segments[i], segments[i][:1]])
            # deal with the first segment and the last one
            if i in [0, len(idx_list) - 1]:
                s.append(segments[i])
            else:
                idx = [0, idx[1] - idx[0]]
                s.append(segments[i][idx[0] : idx[1] + 1])
    # backward connection
    else:
        for i in range(len(idx_list) - 1, -1, -1):
            if i not in [0, len(idx_list) - 1]:
                idx = idx_list[i]
                nidx = abs(idx[1] - idx[0])
                s.append(segments[i][nidx:])
return s
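The forward-then-backward traversal appears to exist because the converter joins multiple polygons into one single closed path: the forward pass walks into each middle segment along a bridge between closest points, and the backward pass walks back out along the same bridge so the outline returns to where it entered. A toy sketch of the same idea for just the two-polygon case (illustrative only, not the library's code; `merge_two_segments` is a hypothetical name):

```python
import numpy as np

def merge_two_segments(seg_a, seg_b):
    # Find the closest pair of vertices between the two polygons.
    a = np.array(seg_a, dtype=float)
    b = np.array(seg_b, dtype=float)
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    i, j = np.unravel_index(d.argmin(), d.shape)
    # Rotate each polygon so it starts at its bridge vertex.
    a = np.roll(a, -i, axis=0)
    b = np.roll(b, -j, axis=0)
    # Single closed path: around A, back to A's bridge vertex, across
    # to B, around B, back to B's bridge vertex, and back across to A.
    return np.concatenate([a, a[:1], b, b[:1], a[:1]])
```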

YoungjaeDev avatar Feb 15 '24 06:02 YoungjaeDev

@ryouchinsa

Well, in the end it depends on how the training code parses it, but I'm curious whether that method is efficient.

YoungjaeDev avatar Feb 15 '24 06:02 YoungjaeDev