surya

About the training dataset

Open wonders7796 opened this issue 1 year ago • 2 comments

Thank you very much for the open source project. I tried it and it worked very well. Could you please share some details about your training dataset?

wonders7796 avatar Mar 29 '24 07:03 wonders7796

It looks like the DocLayNet dataset was used for text line and layout detection. (I'm not sure about OCR, but DocLayNet contains machine-generated OCR annotations.)

sralvins avatar Apr 02 '24 10:04 sralvins

How about the ordering model?

vbonnivardprobayes avatar Apr 23 '24 09:04 vbonnivardprobayes