meltano icon indicating copy to clipboard operation
meltano copied to clipboard

chore: Support "state backends" beginning with systemdb

Open cjohnhanson opened this issue 2 years ago • 2 comments

This is a significant change to how state is handled.

This PR refactors our state management to use a new StateStoreManager ABC similar to how we manage settings. The first StateStoreManager is DBStateStoreManager which just uses the system db to persist state. The primary difference in behavior is that state is now a dedicated table and we no longer have to query and parse through job run history in order to assemble current state for a job.

The refactor also gets us 90% of the way towards rolling out a new state backend that can use blob storage such as AWS S3 (see #5981). It also lays the groundwork for us to roll out any new state backend without a ton of effort. With some small amount of additional refactoring, it also lays the groundwork for us to support arbitrary backends in the form of plugins once we're able to support that in the SDK (or maybe EDK).

Currently labeled "chore" rather than "feat" because this is an invisible backend refactor without user impact. The follow-up PR to add remote blob storage backends will be the "release" PR that should be included in release changelog and demos.

To test the migration (since it's a bit more complex than most migrations and includes job run history to populate the new table) I had two venvs: one with the latest meltano version installed and one with my local meltano branch installed. I started a new project using the latest version, ran a tap-github to target-jsonl pipeline to populate minimal job history and state then checked the output of meltano state list and meltano state get. Then I switched to the local branch, ran meltano upgrade database and confirmed that the outputs for state commands matched. This was a pretty limited manual test and I would love suggestions for how to include some automated integrations tests or otherwise test the migration more thoroughly.

Closes #3340

cjohnhanson avatar Sep 13 '22 01:09 cjohnhanson

Deploy Preview for meltano ready!

Name Link
Latest commit b1f615510b7743f8284fe00de090b3e0b3ace7fe
Latest deploy log https://app.netlify.com/sites/meltano/deploys/632def0e38777700097b9b18
Deploy Preview https://deploy-preview-6742--meltano.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

netlify[bot] avatar Sep 13 '22 01:09 netlify[bot]

This was a pretty limited manual test and I would love suggestions for how to include some automated integrations tests or otherwise test the migration more thoroughly.

@cjohnhanson I think it would be pretty easy to set up an integration test to exercise that the new backend works in a more e2e fashion, since the validation step in the integration test can run an arbitrary bash script (https://github.com/meltano/meltano/blob/main/integration/meltano-run/validate.sh), so we could sqlite/psql all the things! Or even just do a couple of runs and then exercise it with the meltano state command and make sure meltano state works, returns things as expected.

If you really wanted to test the migration, we could probably also rig it up to do something like import an included sqlite dump as the first step of the integration test.

pandemicsyn avatar Sep 16 '22 15:09 pandemicsyn

Codecov Report

Merging #6742 (b1f6155) into main (91e126b) will increase coverage by 0.02%. The diff coverage is 92.85%.

@@            Coverage Diff             @@
##             main    #6742      +/-   ##
==========================================
+ Coverage   88.65%   88.68%   +0.02%     
==========================================
  Files         280      283       +3     
  Lines       20343    20471     +128     
  Branches     2009     2024      +15     
==========================================
+ Hits        18035    18154     +119     
- Misses       1972     1978       +6     
- Partials      336      339       +3     
Impacted Files Coverage Δ
src/meltano/core/plugin/singer/tap.py 83.03% <80.00%> (-0.38%) :arrow_down:
src/meltano/core/state_store.py 85.48% <85.48%> (ø)
src/meltano/core/job_state.py 93.93% <93.93%> (ø)
src/meltano/core/plugin/singer/target.py 81.96% <100.00%> (+0.93%) :arrow_up:
src/meltano/core/state_service.py 96.66% <100.00%> (+3.33%) :arrow_up:
tests/fixtures/core.py 90.68% <100.00%> (+0.20%) :arrow_up:
tests/meltano/core/plugin/singer/test_tap.py 100.00% <100.00%> (ø)
tests/meltano/core/runner/test_runner.py 99.10% <100.00%> (ø)
tests/meltano/core/test_state_store.py 100.00% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Sep 22 '22 15:09 codecov[bot]

Currently labeled "chore" rather than "feat" because this is an invisible backend refactor without user impact. The follow-up PR to add remote blob storage backends will be the "release" PR that should be included in release changelog and demos.

@cjohnhanson I'm chiming in a bit too late, but there's a refactor semantic type that can be used for backend changes without user impact. Things labeled "chore" are not included in the release notes, but refactors are.

WillDaSilva avatar Sep 23 '22 18:09 WillDaSilva

@WillDaSilva thanks for the info, that's definitely what I should have used here 😅

cjohnhanson avatar Sep 23 '22 18:09 cjohnhanson