How to annotate and train Donut to extract all dates (unknown number of dates per document)
Hi, thank you for sharing this great work. I'm wondering what the annotation in the JSONL file should look like if I want to extract all dates from a document (parsing). The dates appear under different labels, and I don't know in advance how many dates each image contains (some images have 1 date, others have 9).
Should it be one key per date, like: "ground_truth": "{ "gt_parse": { "date1": "11/5/2020", "date2": "1/5/2024", "date3": "5/5/1999", "date4": "8/9/1955" } }"
or all in one node, as a list of "date" entries, like: "ground_truth": "{ "gt_parse": { "dates" : [{"date": "11/5/2020"}, {"date": "1/5/2024"}, {"date": "5/5/1999"}, {"date": "8/9/1955"}] } }"
or just a plain list, like: "ground_truth": "{ "gt_parse": { "dates" : ["11/5/2020", "1/5/2024", "5/5/1999", "8/9/1955"] } }"
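To make the plain-list variant concrete, here is a minimal sketch of how one such line could be generated for any number of dates. It assumes the metadata file expects one JSON object per line with a `file_name` key and a `ground_truth` key whose value is itself a JSON-encoded string; the helper name `make_metadata_line` and the file name `doc_001.png` are hypothetical and only for illustration. It does not settle which of the three shapes trains best, it only shows how to get the nested quoting right.

```python
import json


def make_metadata_line(file_name, dates):
    """Build one JSONL line for a document with any number of dates.

    Assumption: the loader expects "ground_truth" to be a JSON string
    that wraps a "gt_parse" object, as in the snippets above.
    """
    gt = {"gt_parse": {"dates": list(dates)}}
    record = {
        "file_name": file_name,          # hypothetical image name
        "ground_truth": json.dumps(gt),  # inner JSON, escaped as a string
    }
    return json.dumps(record)


# Works the same for 1 date or 9 dates.
line = make_metadata_line("doc_001.png", ["11/5/2020", "1/5/2024"])
print(line)
```

Using `json.dumps` twice escapes the inner quotes automatically, so the hand-written `"{ "gt_parse": ...}"` form above never has to be typed manually.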
Thanks