
How to annotate and train donut for extracting all dates (unknown number of dates)

Open Anas-Khayata opened this issue 2 years ago • 0 comments

Hi, thank you for the great work you shared. I am wondering what the annotation in the jsonl file should look like if I want to extract all dates from a document (parsing), given that the dates have different labels and I don't know how many dates each image contains (some images have 1 date, others have 9).

Should the shape be like: "ground_truth": "{ "gt_parse": { "date1": "11/5/2020", "date2": "1/5/2024", "date3": "5/5/1999", "date4": "8/9/1955" } }"

or all in one node like: "ground_truth": "{ "gt_parse": { "dates" : [{"date": "11/5/2020"}, {"date": "1/5/2024"}, {"date": "5/5/1999"}, {"date": "8/9/1955"}] } }"

or just a list like: "ground_truth": "{ "gt_parse": { "dates" : ["11/5/2020", "1/5/2024", "5/5/1999", "8/9/1955"] } }"
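To make the question concrete, here is a minimal sketch of how one such jsonl line could be generated for an image with any number of dates. It uses the variant with a single `dates` node holding a list of one-key objects. The helper name `make_metadata_line`, the image name `doc_001.png`, and the `file_name` key are assumptions for illustration; whether this is the shape Donut expects is exactly what I am asking. Since `ground_truth` is a JSON string nested inside the jsonl record, it is serialized twice so the inner quotes get escaped.

```python
import json


def make_metadata_line(file_name, dates):
    """Build one jsonl line for an image with a variable number of dates.

    Hypothetical helper: the key names follow the examples in this issue
    and are assumptions, not a confirmed Donut schema.
    """
    # One node ("dates") holding a list, so the length can vary per image.
    gt_parse = {"dates": [{"date": d} for d in dates]}
    # ground_truth is a JSON string inside the record, so dump it first.
    ground_truth = json.dumps({"gt_parse": gt_parse})
    return json.dumps({"file_name": file_name, "ground_truth": ground_truth})


line = make_metadata_line("doc_001.png", ["11/5/2020", "1/5/2024"])

# Round-trip check: decode the record, then decode the nested string.
record = json.loads(line)
parsed = json.loads(record["ground_truth"])
print(parsed["gt_parse"]["dates"])
```

An image with a single date or with nine dates would go through the same helper, only the list length changes.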

Thanks

Anas-Khayata, Oct 18 '23 11:10