Add LayoutLMForQuestionAnswering model
What does this PR do?
This PR adds a `LayoutLMForQuestionAnswering` class that follows the implementations of `LayoutLMv2ForQuestionAnswering` and `LayoutLMv3ForQuestionAnswering`, so that LayoutLM can be fine-tuned for the question answering task.
Fixes #18380
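Like its v2/v3 counterparts, the new class is essentially the LayoutLM encoder with a token-level span-classification head on top. As background, here is a minimal PyTorch sketch of such a head (illustrative only; the class name and shapes are assumptions, not the PR's actual code):

```python
import torch
from torch import nn


class QAHeadSketch(nn.Module):
    """Illustrative extractive-QA head: hidden states -> start/end logits."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # One linear layer producing two scores per token (start and end).
        self.qa_outputs = nn.Linear(hidden_size, 2)

    def forward(self, sequence_output: torch.Tensor):
        logits = self.qa_outputs(sequence_output)           # (batch, seq_len, 2)
        start_logits, end_logits = logits.split(1, dim=-1)  # two (batch, seq_len, 1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)


# In the real model, `sequence_output` would come from the LayoutLM encoder.
hidden = torch.randn(2, 16, 768)
start_logits, end_logits = QAHeadSketch(768)(hidden)
print(start_logits.shape, end_logits.shape)
```

The answer span is then decoded by taking the argmax of the start and end logits over the sequence dimension.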
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case: https://github.com/huggingface/transformers/issues/18380
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
@Narsil
@Narsil I've left a few TODOs -- (1) supporting tensorflow, (2) filling in docs, (3) filling in tests -- which I'll gladly do. I just wanted to post sooner than later to start getting feedback on the approach.
The documentation is not available anymore as the PR was closed or merged.
Ok, for this part I will let @NielsRogge comment as I am not the best person to answer how it should be done.
@NielsRogge @Narsil gentle nudge on this PR. I plan to fix the tests + write docs as a next step but wanted to get some quick feedback about whether this approach is acceptable for including `LayoutLMForQuestionAnswering`. Appreciate your consideration!
Thanks @NielsRogge!
We're discussing the pipeline part in pull request 18414. Would love your feedback there too!
@NielsRogge @Narsil I just updated it to include tests+documentation. If it's okay, I'd like to defer the tensorflow implementation for now (due to some personal lack of familiarity). I am failing a consistency check, however, as a result:
```
  File "/Users/ankur/projects/transformers/transformers/utils/check_inits.py", line 298, in <module>
    check_all_inits()
  File "/Users/ankur/projects/transformers/transformers/utils/check_inits.py", line 238, in check_all_inits
    raise ValueError("\n\n".join(failures))
ValueError: Problem in src/transformers/models/layoutlm/__init__.py, both halves do not define the same objects.
Differences for tf backend:
  LayoutLMForQuestionAnswering in _import_structure but not in TYPE_HINT.
```
Could you help me resolve this?
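For context, the failing check compares the two halves of the model's `__init__.py`: objects registered in the lazy `_import_structure` dict must also appear under the `if TYPE_CHECKING:` block. A stdlib-only sketch of the kind of mismatch it flags (module and class lists below are simplified stand-ins, not the real file contents):

```python
# Simplified stand-in for the two halves of a transformers __init__.py.
import_structure = {
    "modeling_layoutlm": ["LayoutLMModel", "LayoutLMForQuestionAnswering"],
}
type_checking_imports = {
    "modeling_layoutlm": ["LayoutLMModel"],  # the new class is missing here
}


def find_mismatches(structure, type_hints):
    """Return objects present in one half but not the other, per module."""
    failures = []
    for module, names in structure.items():
        hinted = set(type_hints.get(module, []))
        for name in set(names) - hinted:
            failures.append(f"{name} in _import_structure but not in TYPE_HINT.")
        for name in hinted - set(names):
            failures.append(f"{name} in TYPE_HINT but not in _import_structure.")
    return failures


print(find_mismatches(import_structure, type_checking_imports))
# The fix is to add the new class to the TYPE_CHECKING half as well.
```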
@NielsRogge @Narsil, I went ahead and implemented support for TensorFlow and the checks are now passing. Would appreciate a re-review.
@NielsRogge gentle nudge on this PR :)
Thanks @NielsRogge! I just updated with your comments, added to the list of doc tests, and verified locally that they are (now) passing.
Up to you guys on that one!
@NielsRogge @Narsil I did some thinking over the weekend and think it makes sense to include them in `AutoModelForQuestionAnswering` to be consistent with LayoutLMv2 and v3. We can move around the auto mapping in PR #18414.
Let me know if you have any concerns with that thinking. If not, I'll proceed with merging the change in.
@Narsil @NielsRogge did you have any further questions on this PR, or is it ready to merge in?
Also happy to hold off, since we have some traction with PR #18414, and just wait to include it in the `AutoModelForDocumentQuestionAnswering` there?
Hi @Narsil @NielsRogge just wanted to bump on this -- based on the most recent round of comments on PR #18414, we removed `LayoutLMv2ForQuestionAnswering` and `LayoutLMv3ForQuestionAnswering` from `AutoModelForQuestionAnswering`, so I think it makes sense to not add `LayoutLMForQuestionAnswering` to the auto mapping, if we are about to remove it.
I will go ahead and remove it and update the PR. Please let me know if it's ready to move forward. It would be very helpful to rebase PR #18414 against it for testing purposes.
This PR seems almost ready, I'd just update:
- all code examples to use either `LayoutLMTokenizer` or `AutoTokenizer`
- add a working code example of `LayoutLMForQuestionAnswering`/`TFLayoutLMForQuestionAnswering`, with an expected output
I actually don't have a pre-trained `TFLayoutLMForQuestionAnswering` (i.e. one with TensorFlow weights), but I could use the same code and just reference the base model?
I'll make the other updates now.
> I actually don't have a pre-trained TFLayoutLMForQuestionAnswering (i.e. one with TensorFlow weights), but I could use the same code and just reference the base model?
The Transformers library makes sure that any PyTorch model also works in the other framework, and vice versa, due to the same variable names being used. So you can just do:
```python
from transformers import TFLayoutLMForQuestionAnswering

model = TFLayoutLMForQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", from_pt=True)
```
and it should work (this should also normally be tested with the PT-TF cross equivalence test). You can then perhaps do `model.push_to_hub("impira/layoutlm-document-qa")` to upload the TF weights to the same repo. This way, you can remove the `from_pt` argument.
Wow, that is super cool! Okay let me give it a try.
Ok @NielsRogge I've made all of these changes. It was a really nice idea to put a fully working example in there. I've also pushed the TF weights to the hub.
@NielsRogge @Narsil the test failures are now occurring because `LayoutLMForQuestionAnswering` is not in any auto mapping (for example, `tests/test_modeling_tf_common.py:_prepare_for_class` uses the auto mapping to determine what the expected output labels are). I'm not sure what the best way to proceed with this is. Perhaps we include it in the QuestionAnswering mapping just to keep the commit (a) consistent with LayoutLMv2-3 and (b) passing tests, and then solve the auto mapping issue properly in PR #18414?
@ankrgyl normally if you run `make fixup` and it complains about a model not being in any auto mapping, you can add it to `utils/check_repo.py` in the `IGNORE_NON_AUTO_CONFIGURED` mapping.
Then, in #18414, you can remove it from this mapping and add it to the auto mapping instead.
> @ankrgyl normally if you run `make fixup` and it complains about a model not being in any auto mapping, you can add it to `utils/check_repo.py` in the `IGNORE_NON_AUTO_CONFIGURED` mapping.
@NielsRogge I actually have already added it there, and it still fails the tests :(. The reason is that I've included it in `tests/models/layoutlm/test_modeling_layoutlm.py:LayoutLMModelTest.all_model_classes`. I feel like there's a tradeoff here: I can either exclude it from all tests, or put it into the QuestionAnswering auto class and then remove it shortly in PR #18414. Let me know what you think is best.
Following up on this @NielsRogge @Narsil @sgugger, could you please advise on how to proceed? It seems that if something has tests then it must be in an Auto model list (the failing tests are due to `LayoutLMForQuestionAnswering` not being part of any Auto model).
Please correct me if I'm wrong, but my understanding is that we have the following options for how to proceed:
- Add `LayoutLMForQuestionAnswering` to the `AutoModelForQuestionAnswering` pipeline, which will make the tests pass. I'll remove it shortly after in https://github.com/huggingface/transformers/pull/18414.
- Remove all tests about `LayoutLMForQuestionAnswering` and add them in https://github.com/huggingface/transformers/pull/18414.
- Add `AutoModelForDocumentQuestionAnswering` in this PR, and then simply extend/use it in PR #18414.
To make all tests pass, you need to overwrite the `_prepare_for_class` method defined in `test_modeling_common.py`, to make sure the targets are prepared correctly for `LayoutLMForQuestionAnswering`. It can be defined as follows in `test_modeling_layoutlm.py`:
```python
def _prepare_for_class(self, inputs_dict, model_class, return_labels=False):
    inputs_dict = copy.deepcopy(inputs_dict)
    if return_labels:
        if model_class in get_values(MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING):
            inputs_dict["labels"] = torch.zeros(
                self.model_tester.batch_size, dtype=torch.long, device=torch_device
            )
        elif model_class in [
            *get_values(MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING),
            *get_values(MODEL_FOR_MASKED_LM_MAPPING),
        ]:
            inputs_dict["labels"] = torch.zeros(
                (self.model_tester.batch_size, self.model_tester.seq_length), dtype=torch.long, device=torch_device
            )
        elif model_class.__name__ == "LayoutLMForQuestionAnswering":
            inputs_dict["start_positions"] = torch.zeros(
                self.model_tester.batch_size, dtype=torch.long, device=torch_device
            )
            inputs_dict["end_positions"] = torch.zeros(
                self.model_tester.batch_size, dtype=torch.long, device=torch_device
            )
    return inputs_dict
```
This can then be removed once the model is added to an Auto mapping.
The same needs to happen for the TF model.
Regarding the failing tests - you might need to rebase with the main branch.
Also note that sometimes, tests which are totally unrelated to your PR fail, in which case you can ignore them.
Thanks @NielsRogge, just rebased.
@NielsRogge I believe all outstanding comments have been addressed. Are we ready to merge this in?
I've pinged @sgugger for a final review, however he's off this week, so it will be merged next week :)
Thank you for merging it in! @LysandreJik or @NielsRogge are you planning to do any sort of announcement? I'm asking because we're going to publicly announce the project we've been working on (https://github.com/impira/docquery) in the next few days, and it would be great to collaborate.
I'd like to communicate on that once the pipeline is merged, because the Space above is using that, right?
Also, the doc tests don't seem to pass:
```
_ [doctest] transformers.models.layoutlm.modeling_layoutlm.LayoutLMForQuestionAnswering.forward _
1328 ... bbox.append([0] * 4)
1329 >>> encoding["bbox"] = torch.tensor([bbox])
1330
1331 >>> word_ids = encoding.word_ids(0)
1332 >>> outputs = model(**encoding)
1333 >>> loss = outputs.loss
1334 >>> start_scores = outputs.start_logits
1335 >>> end_scores = outputs.end_logits
1336 >>> start, end = word_ids[start_scores.argmax(-1)], word_ids[end_scores.argmax(-1)]
1337 >>> print(" ".join(words[start : end + 1]))
Expected:
    M. Hamann P. Harper, P. Martinez
Got:
    J. S. Wigand

/__w/transformers/transformers/src/transformers/models/layoutlm/modeling_layoutlm.py:1337: DocTestFailure

_ [doctest] transformers.models.layoutlm.modeling_tf_layoutlm.TFLayoutLMForQuestionAnswering.call _
1553 ... bbox.append([0] * 4)
1554 >>> encoding["bbox"] = tf.convert_to_tensor([bbox])
1555
1556 >>> word_ids = encoding.word_ids(0)
1557 >>> outputs = model(**encoding)
1558 >>> loss = outputs.loss
1559 >>> start_scores = outputs.start_logits
1560 >>> end_scores = outputs.end_logits
1561 >>> start, end = word_ids[tf.math.argmax(start_scores, -1)[0]], word_ids[tf.math.argmax(end_scores, -1)[0]]
1562 >>> print(" ".join(words[start : end + 1]))
Expected:
    M. Hamann P. Harper, P. Martinez
Got:
    <BLANKLINE>
```
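The decoding step these doctests exercise can be illustrated in isolation. A minimal, stdlib-only sketch with made-up scores and word ids (no model involved):

```python
# Sketch of the doctest's answer-decoding step: take the argmax of the
# start/end scores over tokens, map token indices back to word indices via
# the tokenizer's word_ids, and join those words. All values are made up.
words = ["what", "is", "the", "name", "M.", "Hamann", "P.", "Harper"]
# One score per token; here tokens map one-to-one onto words for simplicity.
start_scores = [0.1, 0.0, 0.2, 0.1, 3.2, 0.3, 0.1, 0.2]
end_scores = [0.0, 0.1, 0.1, 0.2, 0.1, 0.2, 0.3, 2.9]
word_ids = [0, 1, 2, 3, 4, 5, 6, 7]  # token index -> word index


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


start, end = word_ids[argmax(start_scores)], word_ids[argmax(end_scores)]
print(" ".join(words[start : end + 1]))  # -> M. Hamann P. Harper
```

This also shows why a different checkpoint can change the output: only the scores differ, the decoding is deterministic.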
Hi @ankrgyl Thanks a lot for adding `(TF)LayoutLMForQuestionAnswering`!
For the doctest:
- `TFLayoutLMForQuestionAnswering` seems to have an issue loading the weights for `qa_outputs`. Could you check if the TF checkpoint in `impira/layoutlm-document-qa` has weights for this part, or see if you can find what goes wrong? The warning message is `Some layers of TFLayoutLMForQuestionAnswering were not initialized from the model checkpoint at impira/layoutlm-document-qa and are newly initialized: ['qa_outputs']` and I actually got some random results for this test.
- `LayoutLMForQuestionAnswering` weight loading looks fine, but the output is different from the expected value. Could you take a look here?
Here is how you can run the doctest. First:
```shell
python utils/prepare_for_doc_test.py src/transformers/utils/doc.py
```
Then for `LayoutLMForQuestionAnswering`:
```shell
python utils/prepare_for_doc_test.py src/transformers/models/layoutlm/modeling_layoutlm.py
pytest --doctest-modules src/transformers/models/layoutlm/modeling_layoutlm.py -sv --doctest-continue-on-failure
```
For `TFLayoutLMForQuestionAnswering`:
```shell
python utils/prepare_for_doc_test.py src/transformers/models/layoutlm/modeling_tf_layoutlm.py
pytest --doctest-modules src/transformers/models/layoutlm/modeling_tf_layoutlm.py -sv --doctest-continue-on-failure
```
Thank you again! If you have trouble debugging this, let me know :-)
Hi @NielsRogge @ydshieh I'm very sorry about that -- what happened is that we've updated the weights on the underlying model and it's returning a different name from the same document (the question itself is slightly ambiguous).
I've confirmed that if I pin the revision in the tests, they pass. I've just submitted https://github.com/huggingface/transformers/pull/18854 to resolve that.
I'll investigate the weights in `impira/layoutlm-document-qa` in parallel.
> I'd like to communicate on that once the pipeline is merged, because the Space above is using that right?
@NielsRogge the Space is indeed using the pipeline (and incorporates Donut too). It makes sense to do the announcement after that lands. We'll still do ours today but simply mention that we are working to upstream changes. Let me know if y'all have any concerns about that.
> Hi @NielsRogge @ydshieh I'm very sorry about that -- what happened is that we've updated the weights on the underlying model and it's returning a different name from the same document (the question itself is slightly ambiguous).
No problem, thanks for the fix.
> I've confirmed that if I pin the revision in the tests, they pass. I've just submitted #18854 to resolve that.
Great!