Size of the dataset

Open rrajp opened this issue 5 years ago • 3 comments

Can anyone suggest an ideal or bare-minimum dataset size to start training with? I understand that this depends heavily on the variety of formats we handle, but I tried with about 100 docs and didn't get any success on any field, not even close. If we had some idea of the numbers, we could plan accordingly.

rrajp avatar Feb 17 '21 13:02 rrajp

@rrajp you need a set of at least 500-800 docs to train.

janhavisawal avatar Mar 05 '21 07:03 janhavisawal

Do I have to annotate manually to get the JSON format for 800 docs? @janhavisawal

wijayawilly avatar Sep 15 '21 15:09 wijayawilly

Do I have to annotate manually to get the JSON format for 800 docs? @janhavisawal

Yes, you need to. You can use makesense.ai for that.

janhavisawal avatar Oct 02 '21 17:10 janhavisawal