Taking a stab at bulk testing importers.

Open alexrkaufman opened this issue 9 months ago • 4 comments

I used the code from testing.py as inspiration but tried to match the approach used in _extract.

This adds test and generate commands to the CLI that can use the same importer configuration you would use for normal operation.

I don't know that this is necessarily a good approach, and I would love feedback on how I could improve it or on alternate paths for making it easier to test custom beangulp importers.

Seems related to https://github.com/beancount/beangulp/pull/118 and https://github.com/beancount/beangulp/issues/134.

alexrkaufman avatar Mar 29 '25 20:03 alexrkaufman

The force push was only to remove changes that my formatter had accidentally introduced, which should make it clearer what I actually changed.

Hope this helps!

alexrkaufman avatar Mar 29 '25 20:03 alexrkaufman

Can you explain which problem this solves? At a quick glance it looks like a copy of the commands implemented in the beangulp.testing module. Why is it needed? If you want to test more than one importer at a time, just iterate over the importers in a script or a makefile.
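For illustration, a script of that kind might look like the minimal sketch below. It assumes each importer module calls beangulp.testing.main(importer) when run as a script, so that it exposes the test and generate commands; the file and directory names are hypothetical placeholders.

```python
import subprocess
import sys

# Hypothetical layout: each script is assumed to call
# beangulp.testing.main(importer) under `if __name__ == "__main__":`,
# and each directory holds that importer's test documents.
SUITES = [
    ("importers/acme.py", "tests/acme"),
    ("importers/utrade.py", "tests/utrade"),
]


def build_commands(suites, action="test"):
    """Return one argv list per (importer script, documents directory) pair."""
    return [[sys.executable, script, action, documents]
            for script, documents in suites]


def run_all(suites, action="test"):
    """Run every suite in turn and return how many exited with an error."""
    failures = 0
    for command in build_commands(suites, action):
        result = subprocess.run(command)
        if result.returncode != 0:
            failures += 1
    return failures
```

Calling run_all(SUITES) runs the regression tests for every listed importer, and run_all(SUITES, action="generate") regenerates the expected outputs. The list of suites has to be maintained separately from the importer configuration in import.py.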

dnicolodi avatar Mar 29 '25 20:03 dnicolodi

Hi! Thank you for getting back to me and taking a look at this.

The problem this is meant to solve is to make testing more straightforward for users and enable testing with minimal (if any) modification to their import.py files.

You are right that this is very similar to testing.py. I modified that approach, which enforces a single importer in the context, to allow multiple importers. I made the test and generate commands part of the same CLI as the identify, extract, and archive commands, so usage becomes:

./import.py generate test_documents/ 

Would it be better to just modify the code that is already implemented in testing.py and then import it into __init__.py so that it exposes the test and generate commands?

It is possible this is overfit to my use case, but below are some benefits I felt this approach had over a script like runtests.sh:

  • It doesn't require writing an additional script.
  • It allows me to use my archive folder as the test documents folder.
  • runtests.sh loops over each importer and checks every file in the given folder. What I wrote uses the same identification method as identify and extract.
  • It does not require duplicating configuration already implemented in import.py.
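To make the third point concrete, here is a minimal sketch of the two strategies. It does not use beangulp itself: SuffixImporter is a hypothetical stand-in that models only an identify() method, and taking the first importer that claims a file is an assumption about the dispatch, not a description of the actual code.

```python
from dataclasses import dataclass


@dataclass
class SuffixImporter:
    """Hypothetical stand-in for an importer; only identify() is modelled."""
    name: str
    suffix: str

    def identify(self, filepath: str) -> bool:
        return filepath.endswith(self.suffix)


def first_match(importers, filepath):
    """Return the first importer that claims the file, or None."""
    for importer in importers:
        if importer.identify(filepath):
            return importer
    return None


def plan_per_importer(importers, files):
    """Loop style: every importer is paired with every file in the folder."""
    return [(imp.name, f) for imp in importers for f in files]


def plan_by_identification(importers, files):
    """Dispatch style: each file goes only to the importer that claims it."""
    plan = []
    for f in files:
        imp = first_match(importers, f)
        if imp is not None:
            plan.append((imp.name, f))
    return plan
```

With two importers and three files, the loop style considers six importer/file pairs, while the dispatch style yields at most one pair per file and skips files that no importer recognizes. That is what makes it possible to point the command at a mixed folder such as an archive.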

I have made previous unsuccessful attempts to get testing set up, and I spent the better part of today going through the mailing list, the examples here, the old beancount.ingest documentation, and reds importers trying to figure this out. It is very likely that there are concepts or core ideas I am not aware of. The other issues I linked to seem like they may solve a similar problem and are probably suggesting something better, but I don't understand them, so I did what I was able to figure out. If there is a better approach to pursue, with pytest or something else, please let me know and I would be happy to pursue that!

alexrkaufman avatar Mar 29 '25 21:03 alexrkaufman

Hello, just following up here. I hope the explanation of the situation I was in makes sense. If this is not a good approach, please close this pull request and let me know if there is a better thread to pull, such as getting #134 to a state where it could be included in the main branch.

I greatly appreciate all the work on the project so far and would love to contribute.

alexrkaufman avatar Apr 19 '25 16:04 alexrkaufman