carp self-contained tests for tasks

self-contained tests for tasks

Open SebastienGllmt opened this issue 2 years ago • 10 comments

Right now we only have integration tests. These tests are fine, but require a fully synced instance of the node so it's hard to use as a ci test or for internal testing for devs who don't have a synchronized instance yet.

We could add a test system to test tasks individually as an inyegration test of the indexer itself. Unfortunately this is non-trivial because creating mock chains for testing is one of the hardest things you could attempt for blockchains.

Probably the best way to do this while avoiding having to build a mock chain is to have ot work the following way:

Test takes in a block cbor constructed by CML and an EP
It runs all tasks in some test mode where reads and writes return mock values. I am not sure if sea-orm provides an easy way to build mock databases though
Tests at the end that the execution context has the expexted value

This will be really complicated to get correct, really complicated to write correctly and only provide marginal improvements over the existing system which is why I never spent time on it.

You could also buils unit tests for simgle tasks in the same way instead of integration teats for the whole execution graph, but the test infra you would need for this ends up being the same so you may as well try and catch more bugs with integration tests

May 29 '22 22:05 SebastienGllmt

Related to https://github.com/dcSpark/carp/issues/33

May 29 '22 22:05 SebastienGllmt

It runs all tasks in some test mode where reads and writes return mock values. I am not sure if sea-orm provides an easy way to build mock databases though

Typically, I would recommend abstracting the DB and Chain away, so you don't need to even think about what sea-orm provides. You would then mock the chain query responses and and have a minimal in-memory impl for the DB to use in tests.

Since the code already takes a hard dependency on sea-orm, I can only imagine that this would be a pain. Might be worth in in the long-run though :) At least, that's my naive take.

May 29 '22 23:05 MitchTurner

I've been looking into this a bit more. I have some ideas.

I'm thinking it might be possible to decouple Config from the concrete impls for the input (oura) and the DB conn.

On a quick pass, input.recv() is just called at that high level, so that should be easy enough to hide behind some trait. And all the places I can see self.conn called, it is calling Trait methods: TransactionTrait and ConnectionTrait. I'd prefer to take a dep on domain traits, rather than sea-orm traits, but to keep this experiment from bloating I'll probably just have Config depend on those traits for now.

Then I'll try writing some mocks for the new Input trait and the DB traits and making creating a super stupid test exec_plan with dummy tasks maybe and see what I can learn from there.

Might be a dead end, but then I'll understand better. Break some eggs.

May 30 '22 06:05 MitchTurner

I don't think we should need to abstract away the input.recv()for task-based tests since that call happens before the task framework runs (in a separate crate).

If you want to abstract away the input.recv(), you can find a separate issue for this here: https://github.com/dcSpark/carp/issues/45

May 31 '22 04:05 SebastienGllmt

Yeah. It would definitely be less context to manage to just test the tasks against process_multiera_block, etc. Thanks. I think I'll go that route.

As a side note, since Config.start consumes the exec_plan, there is a part of me that thinks we're missing some test coverage by only testing the tasks at the process_<era>_block level. A simple refactor that would address that is add a "tasks handler" object that gets dep-injected on startup. It would hold it's own exec_plan that doesn't need to be fed through start -> insert_block.

Then I'd be much more confident that testing the tasks in their crate would be plenty.

Jun 01 '22 07:06 MitchTurner

I've made some good progress this weekend. I'm just working on the Genesis tasks for now, while I'm getting to know shred and sea-orm.

At first, I decided to do the in-memory sqlite db to minimize overhead. BUT, Some of the tasks require RETURNING which isn't supported by sqlite. Idk if we need returning, but considering you've created a separate fork of sea-orm to give us more returning support, I assume it's preferred.

I've run through sea-orm's integ tests, which rely on the db being run in a docker container. I've tried running my tests against their container, but my tests seem to hang indefinitely. Gonna keep debugging.

Jun 05 '22 19:06 MitchTurner

The multiple returning is just to make sure the order is the proper order. It's not that important -- you could replace it with a function that zips the input with the index in the list. This felt a little hacky so I preferred using returning since I assumed the perf hit of the added return data would be limited. If it's a blocker for another reason, we could get rid of it.

Jun 05 '22 20:06 SebastienGllmt

Sounds good. I'll give that a shot.

Jun 05 '22 20:06 MitchTurner

K. I have a draft PR up here: https://github.com/MitchTurner/carp/pull/2

I was able to remove the exec_many_with_returning from the genesis txs task. It's at least compiling and I can query the added values, so that's a good sign. I'm not sure if I've broken anything yet. Luckily I can write some tests around it now to makes sure it's right :partying_face:

I'm not sure yet what you meant by

make sure the order is the proper order

The order of which values? I didn't end up needing to zip anything, so I'm worried I missed something. Regardless, this is a perfect candidate for a test.

Jun 05 '22 22:06 MitchTurner

K. I think this PR for the Genesis Tasks is ready for review: https://github.com/dcSpark/carp/pull/91

Sorry, it's a little beefy.

Jun 11 '22 06:06 MitchTurner

carp carp copied to clipboard

self-contained tests for tasks

carp
carp copied to clipboard