hledger icon indicating copy to clipboard operation
hledger copied to clipboard

Separate testsuite from code

Open zarybnicky opened this issue 8 years ago • 8 comments

Having started working on hledger a bit more (now trying to optimize both speed and memory usage - after two largish edits, converting Journal to Data.Sequence, I've halved the run time of 'hledger stats', as measured on 1000x1000x10), I find having tests inline to be quite bothersome. I'd like to move them to a separate directory structure - eg. hledger-lib/tests/Hledger/Data/AccountSpec.hs, to use a naming scheme I saw in Stack.

I'm creating an issue instead of directly opening a pull request just to see if this isn't something you are terribly opposed to :)

zarybnicky avatar Sep 06 '17 17:09 zarybnicky

I don't know if I'm for it or against it, but keen to try it.

It has been a long time since I found the unit tests convenient enough to use. How are they bothering you, out of interest ?

On Sep 6, 2017, at 10:46 AM, Jakub Zárybnický [email protected] wrote:

Having started working on hledger a bit more (now trying to optimize both speed and memory usage - after two largish edits, I've halved the run time of 'hledger stats', as measured on 1000x1000x10), I find having tests inline to be quite bothersome. I'd like to move them to a separate directory structure - eg. hledger-lib/tests/Hledger/Data/AccountSpec.hs, to use a naming scheme I saw in Stack.

I'm creating an issue instead of directly opening a pull request just to see if this isn't something you are terribly opposed to :)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

simonmichael avatar Sep 06 '17 18:09 simonmichael

I first noticed it when I was recompiling again and again to find places where I've broken types - having tests in the same files as code means that they were being compiled two times - or four times when I was profiling. Maybe more, I can't recall offhand if doctests were doing the same thing. At least hunittests depend on hledger-lib and have the same files set as source at the same time.

When I was extracting them, I noticed some tests hidden behind a TESTS CPP flag and quite messy, so that would be another thing to get rid of. (I can't point you to the code right now, sorry.)

  1. září 2017 20:59:53 SELČ, Simon Michael [email protected] napsal:

I don't know if I'm for it or against it, but keen to try it.

It has been a long time since I found the unit tests convenient enough to use. How are they bothering you, out of interest ?

On Sep 6, 2017, at 10:46 AM, Jakub Zárybnický [email protected] wrote:

Having started working on hledger a bit more (now trying to optimize both speed and memory usage - after two largish edits, I've halved the run time of 'hledger stats', as measured on 1000x1000x10), I find having tests inline to be quite bothersome. I'd like to move them to a separate directory structure - eg. hledger-lib/tests/Hledger/Data/AccountSpec.hs, to use a naming scheme I saw in Stack.

I'm creating an issue instead of directly opening a pull request just to see if this isn't something you are terribly opposed to :)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

-- You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/simonmichael/hledger/issues/610#issuecomment-327581706

zarybnicky avatar Sep 06 '17 19:09 zarybnicky

Right now, I've just dumped them all into the hunittests.hs file, but splitting them back into modules won't be hard, just time consuming, after sorting out the two or three samplejournal variables that were laying about.

There are several variants - putting them into the top level tests dir (tests/hledger-lib/..., not the best IMO), putting them in the project dir (hledger-lib/test, so so), or splitting project dirs into src and test dirs (hledger-lib/src, hledger-lib/test, I like this the best).

  1. září 2017 20:59:53 SELČ, Simon Michael [email protected] napsal:

I don't know if I'm for it or against it, but keen to try it.

It has been a long time since I found the unit tests convenient enough to use. How are they bothering you, out of interest ?

On Sep 6, 2017, at 10:46 AM, Jakub Zárybnický [email protected] wrote:

Having started working on hledger a bit more (now trying to optimize both speed and memory usage - after two largish edits, I've halved the run time of 'hledger stats', as measured on 1000x1000x10), I find having tests inline to be quite bothersome. I'd like to move them to a separate directory structure - eg. hledger-lib/tests/Hledger/Data/AccountSpec.hs, to use a naming scheme I saw in Stack.

I'm creating an issue instead of directly opening a pull request just to see if this isn't something you are terribly opposed to :)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

-- You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/simonmichael/hledger/issues/610#issuecomment-327581706

zarybnicky avatar Sep 06 '17 19:09 zarybnicky

I first noticed it when I was recompiling again and again to find places where I've broken types - having tests in the same files as code means that they were being compiled two times - or four times when I was profiling. Maybe more, I can't recall offhand if doctests were doing the same thing. At least hunittests depend on hledger-lib and have the same files set as source at the same time.

I'm not following you here.

What's your recompilation workflow ? I can recommend ghcid as the fastest, eg: ghcid -c "make ghci"

When I was extracting them, I noticed some tests hidden behind a TESTS CPP flag and quite messy, so that would be another thing to get rid of. (I can't point you to the code right now, sorry.)

I know what you mean, things like this should certainly be cleaned up.

simonmichael avatar Sep 06 '17 21:09 simonmichael

PS in case it's not obvious, the thinking behind keeping tests right next to the code they test was to minimise friction for writing and updating tests. Tests hidden away in another directory tree (or even another file) are less likely to be written and updated, I feel. I could be wrong.

The test code does visually clutter the file as you're trying to read production code. If I could, I would move all the unit tests to doctests, which are both visually distinct, easier to write, and useful as documentation. However running doctests seems to be much too slow.

simonmichael avatar Sep 06 '17 21:09 simonmichael

The particular command where this was most obvious was when I was profiling hunittests, i.e. stack test --profile hledger-lib:test:hunittests --ta "+RTS -p -RTS". I've tried ghcid right now together with this command - maybe I'm missing some flags in the ghci invocation, but stack test still needs to recompile all changed (and dependent) files twice (once to .o, once to .p_o) to compile hledger-lib, and twice to compile hledger-lib:test:hunittests, because it contains them as other-modules. If I change Hledger.Data.Types, it takes well over 10 minutes to run the above command.

I understand having tests right next to the code, but right now, tests are split into tests/ (functests), tools/*test.[py,hs] (runners and a single regressiontest.py?), hledger-web/tests, hledger-lib next to the code, and maybe some other place I'm missing.

I've looked a bit into why doctest might be slow, but I got stuck quite fast - it's a weird mix of TemplateHaskell, ghci and ghc itself, and I quickly got lost. It's not unreadable code, I just didn't know where to start - while I've written several projects in Haskell, I haven't played around with ghc or TH so far.

On another note, I've just managed to avoid the double recompilation issue by simply moving the source-dirs clause in hledger-lib/package.yaml into the library clause, which means that hpack doesn't generate any other-modules in the cabal file in the testsuite section.

zarybnicky avatar Sep 06 '17 22:09 zarybnicky

stack's rebuilding of everything when testing or profiling is painful, indeed. Glad you're find ways to move forward.

simonmichael avatar Sep 06 '17 22:09 simonmichael

Also, having just looked around some of top Hackage packages and https://github.com/Gabriel439/post-rfc/blob/master/sotu.md - every package that has some tests (eg. accelerate doesn't?) has them either in a top level test(s) or src/test(s) directory. While this isn't much of an argument, consistency between projects might be also something to aim for. It also means that hledger distributions don't bundle tests - it might be worth looking how much they contribute to binary size, though I expect it won't be such a big difference.

zarybnicky avatar Sep 06 '17 22:09 zarybnicky