amazon-textract-transformer-pipeline
LILT
I wanted to ask if this solution would currently support the Language-Independent Layout Transformer - RoBERTa model (LiLT)?
If not, I wanted to request that the inference code be updated to support a LiLT model.
Hi - thanks for raising this & sorry for the delayed response!
It's likely that the sample doesn't quite support LiLT out of the box yet, but the LiltForTokenClassification.forward() interface is very similar to LayoutLM's/etc., so I hope it wouldn't be too difficult to add... And I'd certainly be interested in adding it if I find time, or if anybody wants to raise a PR!
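For reference, here's a minimal sketch (not part of the sample) of what a LiltForTokenClassification call looks like with dummy tensors: the same input_ids / attention_mask / bbox trio that LayoutLM's token-classification head takes, with boxes on a 0-1000 scale. The checkpoint name and label count are placeholders:

```python
import torch
from transformers import LiltForTokenClassification

# Assumed checkpoint name and an arbitrary label count for illustration:
model = LiltForTokenClassification.from_pretrained(
    "SCUT-DLVCLab/lilt-roberta-en-base", num_labels=5
)

# Dummy batch: token ids, an attention mask, and one (x0, y0, x1, y1) box per
# token on a 0-1000 scale - the same tensors LayoutLM(v1) token classification takes.
input_ids = torch.tensor([[0, 9226, 16, 10, 1296, 2]])  # RoBERTa-style ids
attention_mask = torch.ones_like(input_ids)
bbox = torch.tensor([[[0, 0, 0, 0],
                      [74, 63, 160, 81],
                      [165, 63, 250, 81],
                      [260, 63, 330, 81],
                      [340, 63, 400, 81],
                      [0, 0, 0, 0]]])

outputs = model(input_ids=input_ids, attention_mask=attention_mask, bbox=bbox)
print(outputs.logits.shape)  # (1, 6, num_labels)
```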
I'm not quite sure from the LiLT bbox doc yet whether the "normalized" (x0, y0, x1, y1) coordinates would need to be handled differently from our current 0-1000 "vocabulary" handling for LayoutLM.
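For context, here's a hypothetical helper along the lines of what the pipeline currently does for LayoutLM (the actual helper in the repo may differ in name and detail): Amazon Textract returns geometry as 0-1 ratios, which get scaled to integer coordinates in [0, 1000].

```python
def textract_box_to_layoutlm(bbox: dict) -> list:
    """Convert a Textract BoundingBox ({Left, Top, Width, Height} as 0-1 ratios)
    to integer (x0, y0, x1, y1) coordinates on a 0-1000 scale.
    Illustrative sketch only - not the sample's actual implementation.
    """
    x0 = round(bbox["Left"] * 1000)
    y0 = round(bbox["Top"] * 1000)
    x1 = round((bbox["Left"] + bbox["Width"]) * 1000)
    y1 = round((bbox["Top"] + bbox["Height"]) * 1000)
    # Clamp to the valid range in case of slight overflows:
    return [max(0, min(1000, v)) for v in (x0, y0, x1, y1)]
```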
We mostly use AutoTokenizer/AutoModel/etc. in the training script, but there are some departures from this to deal with LayoutLMv2 vs LayoutXLM (which share a lot of tokenizer/processor logic, but not quite everything)... So it may be that some tweaks are needed here to correctly handle LiLT as well.
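As a rough sketch of that Auto-class pattern (assuming a transformers version recent enough to include LiLT in the Auto mappings, and the same assumed checkpoint as above), if the config, tokenizer and model all resolve cleanly like this, the departures might turn out to be small:

```python
from transformers import AutoConfig, AutoTokenizer, AutoModelForTokenClassification

model_name_or_path = "SCUT-DLVCLab/lilt-roberta-en-base"  # assumed checkpoint

config = AutoConfig.from_pretrained(model_name_or_path, num_labels=5)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoModelForTokenClassification.from_pretrained(model_name_or_path, config=config)

# In recent transformers versions this should resolve to LiltForTokenClassification:
print(type(model).__name__)
```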
...But as a first pass, it's probably well worth just setting the model_name_or_path hyperparameter to SCUT-DLVCLab/lilt-roberta-en-base to see how close it is to working already?
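i.e. something like the below as a first-pass experiment, with every other hyperparameter left as it is for LayoutLM:

```python
# Hypothetical first pass: point the existing training job at a LiLT checkpoint
# and see what (if anything) breaks.
hyperparameters = {
    "model_name_or_path": "SCUT-DLVCLab/lilt-roberta-en-base",
    # ...other hyperparameters unchanged from the LayoutLM configuration...
}
```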