Rafał Miłecki

156 comments by Rafał Miłecki

Your regex doesn't match the first line: 1. `Invoice_Number`: `\w?` allows only one (optional) character. Your example is `HB342344`. Make that `\w*` 2. `Invoice_Amount`: `-? ` requires space after `-`....
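The two quantifier issues above can be checked in isolation. This is only a minimal sketch: `HB342344` is the value quoted in the comment, while the amount pattern and the amount strings are hypothetical, made up to show how `-? ` behaves, and are not taken from the original template.

```python
import re

invoice_number = "HB342344"  # example value quoted in the comment

# `\w?` matches at most one word character, so it cannot cover the whole value.
assert re.fullmatch(r"\w?", invoice_number) is None

# `\w*` matches any number of word characters, as the comment suggests.
assert re.fullmatch(r"\w*", invoice_number) is not None

# `-? ` means "an optional minus sign, then a mandatory space".
amount_pattern = r"-? \d+\.\d+"  # hypothetical pattern, for illustration only
assert re.fullmatch(amount_pattern, "-12.50") is None       # no space after "-"
assert re.fullmatch(amount_pattern, "- 12.50") is not None  # space is present
```

`re.fullmatch` is used here so that each quantifier is tested against the whole string; a real template regex would normally be searched within a longer line of extracted text.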

@bosd: I have zero experience with OCR inputs

We don't have enough manpower to implement such a solution as part of this project. I also think it's out of scope for `invoice2data`. See also https://github.com/invoice-x/invoice2data/issues/361

The problem I see is that poppler's `pdftotext` doesn't support `-simple` or `-table`. Support for those layouts was developed in Xpdf after the poppler project had forked it....

Well, CSV clearly isn't a good choice for storing tree-structured data. Your proposed solution is one way of handling that. Or maybe the first row shouldn't have the first...

Hi guys, if you find time to work further on this feature, please feel free to reopen this pull request. As mentioned by @m3nu, this needs to be documented. Also...

Adding more tests sounds good, of course. I don't have any spare time to work on tesseract though; it's outside my daily usage, sorry.

It's a harmless error, as explained in the `of_platform_populate() for address-less nodes (OF: Bad cell count for ...)` e-mail thread.

This was a draft to show how the next `rules` syntax can be adapted to the old `lines` syntax. For now, https://github.com/invoice-x/invoice2data/pull/407 got closed, so I'm closing this one too. I'm...