
LayoutLMv3 | Domain adaptation on the base model

Open louisdeneve opened this issue 2 years ago • 1 comment

I'm using the base model from LayoutLMv3 and trying to adapt it to my own local data. This data is unlabeled, so I'm trying to continue pre-training the base model on my own data. I'm having trouble figuring out how to mask the data and which collator to pass to the Trainer. Currently my data has this structure:

from datasets import Features, Sequence, Value, Array2D, Array3D

features = Features({
    'input_ids': Sequence(feature=Value(dtype='int64')),
    'attention_mask': Sequence(feature=Value(dtype='int64')),
    'bbox': Array2D(dtype="int64", shape=(512, 4)),
    'pixel_values': Array3D(dtype="float32", shape=(3, 224, 224)),
})

To mask the text part I'm using DataCollatorForLanguageModeling, but this only masks the text and doesn't include the image information. Does anyone know how to do this? For the text side, something like the rough sketch below is what I have in mind (not a full pretraining setup; it only covers the MLM objective, and bbox/pixel_values are just stacked and passed through unchanged):
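import torch
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("microsoft/layoutlmv3-base")
mlm_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

def collate_fn(examples):
    # Stack the non-text features manually so they reach the model untouched.
    batch = {
        "bbox": torch.tensor([ex["bbox"] for ex in examples], dtype=torch.long),
        "pixel_values": torch.tensor([ex["pixel_values"] for ex in examples], dtype=torch.float),
        "attention_mask": torch.tensor([ex["attention_mask"] for ex in examples], dtype=torch.long),
    }
    # Apply standard MLM masking to the text tokens only.
    input_ids = torch.tensor([ex["input_ids"] for ex in examples], dtype=torch.long)
    batch["input_ids"], batch["labels"] = mlm_collator.torch_mask_tokens(input_ids)
    return batch

This collate_fn can be passed to the Trainer via its data_collator argument, but it clearly doesn't do anything about masking the image patches.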

louisdeneve avatar Jul 25 '22 08:07 louisdeneve

You may refer to LayoutLMv3's paper and BEiT's code for image masking (the Masked Image Modeling objective).
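As a rough illustration of the idea (this is not the actual BEiT code; the numbers and names below are only illustrative), MIM masks a subset of image patch positions and asks the model to predict the visual tokens of the masked patches:

import numpy as np

def random_patch_mask(num_patches=196, num_masked=75):
    # Randomly choose patch positions to mask. BEiT itself uses blockwise
    # masking rather than uniform random masking; see its MaskingGenerator.
    mask = np.zeros(num_patches, dtype=bool)
    mask[np.random.choice(num_patches, num_masked, replace=False)] = True
    return mask  # True = patch is masked and its visual token must be predicted

The prediction targets for the masked patches are visual tokens produced by an image tokenizer (BEiT uses the DALL-E dVAE), so you also need such a tokenizer to compute the MIM labels.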

HYPJUDY avatar Jul 31 '22 06:07 HYPJUDY