unilm

DiT for Text Detection

Open senthil-r-10 opened this issue 1 year ago • 3 comments

Is it possible for the model to detect curved text? If so, how? Neither the documentation nor the published paper explains it. Has anyone tried using the pre-trained model to run predictions on a scene text dataset?

senthil-r-10, Aug 03 '23 09:08

@senthil-r-10, we did not include curved text in the DiT training data, so it is not currently supported. However, you can continue training this model to support curved text detection. As for scene text, what do you mean by a scene text document? Can you give some examples?

wolfshow, Aug 10 '23 01:08

I mean curved text detection only. I plan to use this approach for OCR text detection on documents such as receipts and invoices. Could you update the Data Preparation link in the help document? https://mmocr.readthedocs.io/en/v0.6.0/datasets/det.html#funsd

senthil-r-10, Aug 15 '23 15:08

@wolfshow I've previously trained a DONUT model to adapt to a new language using synthetic data. Do you think the same strategy might work with DiT for text detection in a foreign language?

Secondly, I can't access the model checkpoints or weights listed in /dit/text_detection. I get a PublicAccessNotPermitted error for all links.

rm-asif-amin, Feb 15 '24 18:02
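
For anyone hitting the same PublicAccessNotPermitted error: a quick way to confirm that the download is being refused by the storage service, rather than the URL being wrong, is to read the error code out of the response body. The sketch below is only an illustration. It assumes the body is XML of the form `<Error><Code>...</Code><Message>...</Message></Error>`; the sample body, `parse_storage_error`, and `is_access_blocked` are hypothetical names made up for this example and are not part of the unilm repository.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical response body, assumed to follow an <Error><Code>/<Message> layout.
SAMPLE_BODY = """<Error>
  <Code>PublicAccessNotPermitted</Code>
  <Message>Public access is not permitted on this storage account.</Message>
</Error>"""


def parse_storage_error(body: str) -> Tuple[str, str]:
    """Return (code, message) from an XML error body; missing fields become ''."""
    root = ET.fromstring(body)
    code = (root.findtext("Code") or "").strip()
    message = (root.findtext("Message") or "").strip()
    return code, message


def is_access_blocked(body: str) -> bool:
    """True when the service reports that anonymous (public) access is disabled."""
    code, _ = parse_storage_error(body)
    return code == "PublicAccessNotPermitted"


if __name__ == "__main__":
    code, message = parse_storage_error(SAMPLE_BODY)
    print(code, "-", message)
    print("blocked:", is_access_blocked(SAMPLE_BODY))
```

If the code matches, the links themselves are probably fine and public access has been turned off on the hosting side, so retrying or changing the download client is unlikely to help.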