amazon-textract-transformer-pipeline

[Enhancement] Joint entity recognition and page/document classification

Open • athewsey opened this issue 2 years ago • 0 comments

Today we demonstrate annotation and training for entity extraction only. For many use cases, page- or document-level classification is also important, and it should be pretty straightforward to support this too.

Sequence classification is already supported in open-source libraries (e.g. LayoutLMv2ForSequenceClassification in Hugging Face Transformers), but joint cls+ner with a single model might perform well and be more economical for users than running two separate models, and shouldn't require too much extra effort.
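As a rough illustration of what "joint" could mean here, one option is to combine the per-token NER loss with a page-level classification loss in a single training objective, with both sets of logits coming from heads on one shared encoder. The sketch below is plain Python, not code from this repository: the names `joint_loss`, `cls_weight` and `IGNORE_INDEX` are hypothetical, and the `-100` ignore value is just an assumed convention for tokens that should not count towards the NER loss.

```python
import math
from typing import List, Sequence

IGNORE_INDEX = -100  # assumption: label value for tokens excluded from the NER loss


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-likelihood of `target` under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def joint_loss(
    token_logits: List[List[float]],
    token_labels: List[int],
    page_logits: List[float],
    page_label: int,
    cls_weight: float = 1.0,
) -> float:
    """Mean token-level (NER) loss plus a weighted page-level (cls) loss."""
    token_losses = [
        cross_entropy(logits, label)
        for logits, label in zip(token_logits, token_labels)
        if label != IGNORE_INDEX
    ]
    # A page with no labelled tokens contributes only the classification term.
    ner_loss = sum(token_losses) / len(token_losses) if token_losses else 0.0
    cls_loss = cross_entropy(page_logits, page_label)
    return ner_loss + cls_weight * cls_loss
```

Here `cls_weight` is a hypothetical knob for balancing the two tasks; setting it to `0.0` recovers entity-recognition-only training. Whether a single shared model actually matches two dedicated models on accuracy would need to be checked experimentally.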

athewsey • Mar 03 '23 07:03