mmocr
How to make training data for key information extraction
Hello guys, how can I make a training set to train my own custom key information extraction model? Are there any available tools to do that?
Hi, MMOCR currently does not provide a data converter for the KIE task. If you would like to train with your custom data, you may refer to the data preparation steps of the supported WildReceipt dataset and convert your own data to the same format. In addition, please feel free to raise a PR if you would like to add such a data converter to MMOCR.
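A minimal sketch of such a conversion, assuming each line of the annotation file is one JSON record with `file_name`, `height`, `width` and a list of `annotations` (each with an 8-number `box`, a `text` and a `label`). The helper names `rect_to_box` and `to_kie_line`, and the `(x, y, w, h, text, label)` input tuples, are hypothetical and stand in for whatever your own labeling tool exports. Writing the label as an integer is an assumption about the WildReceipt files; please check it against the dataset you download.

```python
import json


def rect_to_box(x, y, w, h):
    """Turn an axis-aligned rectangle into an 8-number polygon:
    top-left, top-right, bottom-right, bottom-left."""
    return [x, y, x + w, y, x + w, y + h, x, y + h]


def to_kie_line(file_name, height, width, items):
    """Build one annotation line for one image.

    `items` is a hypothetical list of (x, y, w, h, text, label) tuples.
    """
    annotations = [
        # Assumption: labels are stored as integers, not strings.
        {"box": rect_to_box(x, y, w, h), "text": text, "label": int(label)}
        for x, y, w, h, text, label in items
    ]
    record = {
        "file_name": file_name,
        "height": height,
        "width": width,
        "annotations": annotations,
    }
    return json.dumps(record, ensure_ascii=False)


# One image with a single labeled text box.
line = to_kie_line(
    "image_files/example.png", 1168, 1653, [(840, 400, 40, 20, "star", 1)]
)
print(line)
```

Writing one such line per image into a text file gives a train or test annotation file in this layout.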
OK, thank you very much. However, I didn't understand the notion of key/value. When I'm annotating my data, where should I mention the key and where should I select the value?
Hi, you may refer to this tutorial to learn about the key and value in KIE, which provides some examples of key-value pairs.
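As a rough illustration of the idea: the key is the field name printed on the document (for example "Tel:"), and the value is the content that belongs to it (the phone number itself). Each text box gets its own class, so a key and its value are two separate boxes with two different labels. The class names and sample texts below are hypothetical; the `_key` / `_value` suffix convention is an assumption modeled on the WildReceipt class list.

```python
# Hypothetical class list: one "key" class and one "value" class per field.
CLASSES = {
    0: "Ignore",
    1: "Tel_key",
    2: "Tel_value",
    3: "Date_key",
    4: "Date_value",
}

# (text, label) pairs as they would appear in the annotations of one image.
boxes = [
    ("Tel:", 1),
    ("0537-000000", 2),
    ("Date:", 3),
    ("2021-05-01", 4),
    ("Thank you", 0),
]


def pair_keys_and_values(boxes, classes):
    """Group labeled text boxes by field, split into key text and value text."""
    fields = {}
    for text, label in boxes:
        name = classes[label]
        if name == "Ignore":
            continue  # text that belongs to no field
        field, role = name.rsplit("_", 1)  # "Tel_key" -> ("Tel", "key")
        fields.setdefault(field, {})[role] = text
    return fields


pairs = pair_keys_and_values(boxes, CLASSES)
print(pairs)
```

If your documents only contain values without printed field names, you would simply have no boxes for the key classes.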
I made this annotation. Does it look correct?
{"file_name": "image_files/2d21105d-template19319.png", "height": 1168, "width": 1653, "annotations": [{"box": [835.8682269716757, 401.69202586323775, 864.5974563514809, 401.69202586323775, 864.5974563514809, 420.4222971786585, 835.8682269716757, 420.4222971786585], "text": "gt", "label": "1"}, {"box": [868.344747140151, 399.1946563545149, 893.3266857312859, 399.1946563545149, 893.3266857312859, 420.4222971786585, 868.344747140151, 420.4222971786585], "text": "loc", "label": "1"}, {"box": [842.1137116194595, 429.1630904591881, 1040.720123418983, 429.1630904591881, 1040.720123418983, 450.39073128333166, 842.1137116194595, 450.39073128333166], "text": "10,rueazrou,quartier", "label": "2"}, {"box": [839.6155177603459, 461.62889407258416, 957.0306291386806, 461.62889407258416, 957.0306291386806, 484.1052196510891, 839.6155177603459, 484.1052196510891], "text": "10020,rabat", "label": "2"}, {"box": [840.8646146899025, 495.3433824403416, 1054.460189644107, 495.3433824403416, 1054.460189644107, 529.057870808099, 840.8646146899025, 529.057870808099], "text": "8560451551218746063", "label": "3"}, {"box": [838.3664208307896, 366.7288527411188, 924.5541089702053, 366.7288527411188, 924.5541089702053, 385.4591240565397, 838.3664208307896, 385.4591240565397], "text": "b810-01", "label": "0"}, {"box": [1009.4864197530864, 366.72932038834944, 1029.4630041152263, 366.72932038834944, 1029.4630041152263, 385.4626796116504, 1009.4864197530864, 385.4626796116504], "text": "r", "label": "0"}]}
{"file_name": "image_files/36c100a6-template19318.png", "height": 1168, "width": 1653, "annotations": [{"box": [840.8532098765431, 402.69918446601946, 868.0630864197531, 402.69918446601946, 868.0630864197531, 423.8592621359224, 840.8532098765431, 423.8592621359224], "text": "ya", "label": "1"}, {"box": [872.5980658436213, 396.64372815533983, 967.8099588477365, 396.64372815533983, 967.8099588477365, 420.8201941747573, 872.5980658436213, 420.8201941747573], "text": "restaurant", "label": "1"}, {"box": [837.8374485596707, 426.87565048543695, 1106.8751028806585, 426.87565048543695, 1106.8751028806585, 454.068504854369, 837.8374485596707, 454.068504854369], "text": "7,avanueabdelkarimbenjelloun", "label": "2"}, {"box": [843.8916460905351, 461.6208155339806, 958.7626748971195, 461.6208155339806, 958.7626748971195, 482.7808932038835, 843.8916460905351, 482.7808932038835], "text": "10020,rabat", "label": "2"}, {"box": [839.3566666666667, 499.3823689320389, 1050.9588065843623, 499.3823689320389, 1050.9588065843623, 520.5424466019418, 839.3566666666667, 520.5424466019418], "text": "5420565926897256814", "label": "3"}, {"box": [837.8374485596707, 364.9149514563107, 922.4828395061728, 364.9149514563107, 922.4828395061728, 386.0750291262136, 837.8374485596707, 386.0750291262136], "text": "c810-01", "label": "0"}, {"box": [1008.6247736625514, 366.4344854368932, 1026.7646913580247, 366.4344854368932, 1026.7646913580247, 386.0750291262136, 1008.6247736625514, 386.0750291262136], "text": "r", "label": "0"}]}
{"file_name": "image_files/eca06b46-template19316.png", "height": 1168, "width": 1653, "annotations": [{"box": [840.4450617283951, 401.33840776699026, 879.6726337448561, 401.33840776699026, 879.6726337448561, 419.91300970873783, 840.4450617283951, 419.91300970873783], "text": "star", "label": "1"}, {"box": [881.736049382716, 401.33840776699026, 919.9205761316872, 401.33840776699026, 919.9205761316872, 418.892427184466, 881.736049382716, 418.892427184466], "text": "craft", "label": "1"}, {"box": [841.4654320987654, 430.23223300970875, 1028.3065843621398, 430.23223300970875, 1028.3065843621398, 453.977786407767, 841.4654320987654, 453.977786407767], "text": "avenuetarikibnziad", "label": "2"}, {"box": [843.5288477366256, 462.2331650485437, 959.1481481481483, 462.2331650485437, 959.1481481481483, 482.8716116504854, 843.5288477366256, 482.8716116504854], "text": "10020,rabat", "label": "2"}, {"box": [840.4450617283951, 495.25467961165043, 1051.0268312757203, 495.25467961165043, 1051.0268312757203, 524.1485048543689, 840.4450617283951, 524.1485048543689], "text": "5638984273905984178", "label": "3"}, {"box": [840.4450617283951, 365.2324660194175, 927.1538683127573, 365.2324660194175, 927.1538683127573, 388.9780194174757, 840.4450617283951, 388.9780194174757], "text": "c810-01", "label": "0"}, {"box": [1010.7788888888889, 367.2963106796116, 1027.2862139917695, 367.2963106796116, 1027.2862139917695, 385.8709126213592, 1010.7788888888889, 385.8709126213592], "text": "r", "label": "0"}]}
{"file_name": "image_files/51ecae9f-template19317.png", "height": 1168, "width": 1653, "annotations": [{"box": [843.3628085490162, 397.9459716001536, 898.3230734495133, 397.9459716001536, 898.3230734495133, 424.1683514417427, 843.3628085490162, 424.1683514417427], "text": "setrap", "label": "1"}, {"box": [900.8212673086267, 399.1946563545149, 962.0270168569076, 399.1946563545149, 962.0270168569076, 424.16835144174263, 900.8212673086267, 424.16835144174263], "text": "traveaux", "label": "1"}, {"box": [965.7743076455781, 399.1946563545149, 1016.987281757405, 399.1946563545149, 1016.987281757405, 419.17361242429706, 965.7743076455781, 419.17361242429706], "text": "divers", "label": "1"}, {"box": [842.1137116194595, 429.1630904591881, 999.4999247436102, 429.1630904591881, 999.4999247436102, 452.8881007920544, 842.1137116194595, 452.8881007920544], "text": "2,rueidrissel", "label": "2"}, {"box": [842.1137116194595, 461.62889407258416, 947.0378537022266, 461.62889407258416, 947.0378537022266, 479.1104806336436, 842.1137116194595, 479.1104806336436], "text": "111,rabat", "label": "2"}, {"box": [842.1137116194595, 497.84075194906427, 1053.2110927145504, 497.84075194906427, 1053.2110927145504, 521.5657622819306, 842.1137116194595, 521.5657622819306], "text": "8764814411031892864", "label": "3"}, {"box": [838.3589711934156, 367.9766990291262, 929.5347325102881, 367.9766990291262, 929.5347325102881, 387.95743689320386, 838.3589711934156, 387.95743689320386], "text": "c810-01", "label": "0"}, {"box": [1004.4979423868313, 366.72932038834944, 1029.4856790123456, 366.72932038834944, 1029.4856790123456, 386.7100582524271, 1004.4979423868313, 386.7100582524271], "text": "r", "label": "0"}]}
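One way to sanity-check annotation lines like these before training is a small script such as the sketch below. It is not part of MMOCR; `check_record` is a hypothetical helper, and the rules it enforces (one JSON record per line, 8 coordinates per box inside the image, non-empty text, integer labels) are assumptions based on the fields shown above and on the WildReceipt files. In particular, the integer-label rule would flag the quoted labels such as `"1"` in the data above, so please confirm against the WildReceipt annotation files whether that matters for your setup.

```python
import json

REQUIRED_FIELDS = ("file_name", "height", "width", "annotations")


def check_record(line):
    """Return a list of problems found in one annotation line (empty if none)."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    problems = [f"missing field {k!r}" for k in REQUIRED_FIELDS if k not in rec]
    if problems:
        return problems

    for i, ann in enumerate(rec["annotations"]):
        box = ann.get("box", [])
        if len(box) != 8:
            problems.append(f"annotation {i}: box has {len(box)} numbers, expected 8")
        else:
            xs, ys = box[0::2], box[1::2]
            inside = (0 <= min(xs) and max(xs) <= rec["width"]
                      and 0 <= min(ys) and max(ys) <= rec["height"])
            if not inside:
                problems.append(f"annotation {i}: box lies outside the image")
        if not ann.get("text"):
            problems.append(f"annotation {i}: empty text")
        # Assumption: labels should be integers rather than strings.
        if not isinstance(ann.get("label"), int):
            problems.append(f"annotation {i}: label {ann.get('label')!r} is not an integer")
    return problems


# A shortened record in the same shape as the data above, with a string label.
sample = json.dumps({
    "file_name": "image_files/example.png", "height": 1168, "width": 1653,
    "annotations": [{"box": [840, 400, 880, 400, 880, 420, 840, 420],
                     "text": "star", "label": "1"}],
})
for problem in check_record(sample):
    print(problem)
```

To check a whole file, call `check_record` on every non-empty line and print the line number together with any problems it returns.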