Ankur Goyal comments

Results: 106 comments by Ankur Goyal

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

@Narsil I think I've implemented all of what we talked about (and apologies in advance if I missed anything). To summarize:

- Padding/truncation are gone. I've left them commented out,...

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

@Narsil gentle nudge in case this slipped from your queue :)

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

> So rather than extending the VQA pipeline, it seems that the design has been updated to create a separate DocumentQuestionAnswering pipeline?

Yes, that's correct.

> Also, I'd like to...

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

@NielsRogge congrats on pushing Donut -- I just saw it come through. I've integrated it into the pipeline, and it works! The code gets a bit splintered _inside_ the pipeline...

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

Hi @Narsil, thanks for the feedback. I will address your comments. I appreciate your willingness to pull down the code and get your hands dirty. Please let me know if...

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

> In general, `processor` should be extremely shallow, and the real logic should actually be in `feature_extractor`. Leveraging it is not only encouraged but extremely welcome as they can contain...

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

> @ankrgyl Here are some tests we can integrate If you're ok (feel free to modify, the important part is to have exact values in the asserts everywhere except `run_pipeline_test`....

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

> I did remove some vision layers (including `p5`) if something is failing I would consider it a bug, but I am not super familiar with this model's internals.

Yes...

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

@Narsil I _think_ I was able to update the mini model (see [PR](https://huggingface.co/hf-internal-testing/tiny-random-layoutlmv2/discussions/1)). I verified locally that with this update + some expected test changes, the test passes.

[WIP] Extend VisualQuestionAnsweringPipeline to support QuestionAnwering models (e.g. LayoutLM)

BTW, I'm working on a space, which illustrates the pipeline. It's currently using a frozen version of the pipeline I have saved in another repo, but once we land this...