
Add DocVQA

Open • NielsRogge opened this issue 1 year ago • 1 comment

Adding a Dataset

  • Name: DocVQA
  • Description: Document Visual Question Answering (DocVQA) seeks to inspire a “purpose-driven” point of view in Document Analysis and Recognition research, where the document content is extracted and used to respond to high-level tasks defined by the human consumers of this information.
  • Paper: https://arxiv.org/abs/2007.00398
  • Data: https://www.docvqa.org/datasets/docvqa
  • Motivation: Models like LayoutLM and Donut in the Transformers library are fine-tuned on DocVQA. It would be very handy to load this dataset directly from the Hub.

Instructions to add a new dataset can be found here.

NielsRogge, Aug 04 '22 13:08

Thanks for proposing, @NielsRogge.

Please note that this dataset requires registering on their website, and their Terms and Conditions state that we cannot distribute the download URLs:

1. You will NOT distribute the download URLs
...

albertvillanova, Aug 08 '22 05:08