[ADD] edi_pdf2data

Open etobella opened this issue 3 years ago • 14 comments

This module allows defining templates for importing documents using the invoice2data library. It can be configured to import invoices and other kinds of documents.
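
As a rough illustration of the template idea, here is a minimal sketch in plain Python. It is not the invoice2data API and not this module's code; the template layout, the field names, and the sample text are all assumptions made up for the example. The point is only that a template pairs identifying keywords with one regular expression per field, so the same mechanism can cover non-invoice documents such as a laboratory quality report.

```python
import re

# Hypothetical template (an assumption for illustration only):
# "keywords" identify the document type, and each field maps to a
# regular expression with exactly one capture group.
TEMPLATE = {
    "keywords": ["Quality Report"],
    "fields": {
        "report_number": r"Report No\.\s*(\w+)",
        "sample_date": r"Sample date:\s*(\d{4}-\d{2}-\d{2})",
    },
}


def extract(text, template):
    """Return the captured fields, or None if the template does not apply."""
    # The template applies only if every keyword occurs in the text.
    if not all(keyword in text for keyword in template["keywords"]):
        return None
    result = {}
    for name, pattern in template["fields"].items():
        match = re.search(pattern, text)
        # A field whose pattern is not found is reported as None.
        result[name] = match.group(1) if match else None
    return result


# Made-up text, standing in for what a PDF-to-text step might produce.
SAMPLE_TEXT = "Quality Report\nReport No. QR1234\nSample date: 2021-08-25\n"

if __name__ == "__main__":
    print(extract(SAMPLE_TEXT, TEMPLATE))
```

Adding a new document type would then mean writing another template rather than new parsing code.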

etobella avatar Aug 25 '21 06:08 etobella

Hi @etobella! Thank you very much for this contribution. As the addon you are improving does not have a declared maintainer, I take the opportunity to mention that you can consider adopting it. To do so, please read the maintainer role description, and, if interested, create a pull request to add your GitHub login to the maintainers key of the addon manifest.

OCA-git-bot avatar Aug 25 '21 06:08 OCA-git-bot

Isn't this the same feature-wise as #432?

pedrobaeza avatar Aug 25 '21 06:08 pedrobaeza

#432 intends to remove the invoice2data dependency, and it only covers invoices. This module can be used in a generic way for all kinds of documents; for example, we are trying to import quality reports from a laboratory.

etobella avatar Aug 25 '21 06:08 etobella

OK, but do you need the dependency on invoice2data, or can you do something similar to the other module?

pedrobaeza avatar Aug 25 '21 06:08 pedrobaeza

We could check how to remove the dependency, let me think about that.

However, why do you want to remove the dependency?

etobella avatar Aug 25 '21 07:08 etobella

To avoid depending on external libraries and possible version conflicts. I also remember the library causing problems in the pipeline.

pedrobaeza avatar Aug 25 '21 08:08 pedrobaeza

Ok, thanks, I will review which is the best solution!! Thanks :smile:

etobella avatar Aug 25 '21 09:08 etobella

> To avoid depending on external libraries and possible version conflicts. I also remember the library causing problems in the pipeline.

#432 has three external libraries: https://github.com/OCA/edi/blob/26892422e3dd19e784ae5718f9831d5823d75678/account_invoice_import_simple_pdf/readme/INSTALL.rst

FernandoRomera avatar Aug 25 '21 13:08 FernandoRomera

@pedrobaeza personally I think invoice2data is a less harmful dependency than the dependencies of #432. I have spent quite some time with invoice2data (2+ years), PyMuPDF (again 2+ years) and #432 (in test and eval), and this is much more in line with my ideal than the existing invoice2data_pdf module, the new simple_pdf proposal, and actually my own work that I am porting to v14.

Multiline support, for a start, is a key differentiator. OCR processing too, although that is more fringe. My main issue with the 2 existing ones, which come from invoice import, is that they ignore purchase orders in a stock environment.

The main benefit of the simple_pdf proposal is the fields interface, and I think in future that idea could be generalised for the basic fields. Also, for non-Europeans, the OCA dependencies are annoying for simple_pdf. We have to install all sorts of Euro-specific data files. This is nice and clean.

But honestly, PyMuPDF is great when it works, and admittedly it is being used in a rather simple use case here (EDIT: being #432, not actually here), but it is a dependency you'd live without if you could, IMO. For things like dynamic form filling on printed PDF forms, though, it is pretty much all there is that works.

gdgellatly avatar Sep 20 '21 11:09 gdgellatly

OK, it was just a suggestion, as it seems to work better according to @alexis-via

pedrobaeza avatar Sep 20 '21 11:09 pedrobaeza

@pedrobaeza Yes of course, until it doesn't. In practical use, simple_pdf is most suited to regular bills, coded to 1 line. If you use operating-unit, or you use purchase orders and stock, forget it. If you need to fill the origin field, forget it. For me they are completely different use cases for different kinds of businesses/supplier invoices.

As for pipeline issues, yes, there were some problematic commits and release tags on invoice2data: a typo, and then a py 3.6 f-string. In fact I was just fixing a v12 server for precisely that.

gdgellatly avatar Sep 20 '21 11:09 gdgellatly

@gdgellatly What do you mean by "Also for non-Europeans, the OCA dependencies are annoying for simple_pdf."? Which dependencies are annoying for you? I don't understand that part of your comment.

alexis-via avatar Nov 15 '21 13:11 alexis-via

@alexis-via The UNECE stuff from base business document import:

        "account_tax_unece",
        "uom_unece",

gdgellatly avatar Nov 16 '21 02:11 gdgellatly

There hasn't been any activity on this pull request in the past 4 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days. If you want this PR to never become stale, please ask a PSC member to apply the "no stale" label.

github-actions[bot] avatar Sep 18 '22 12:09 github-actions[bot]