WIP: suss out problems arising from storaged parallelism
For running CI.
@philip-stoev The failures where we have too many true results come from the fact that mz_materializations (which these queries use) has an entry per worker. For example, on a 4-worker cluster you will get:
materialize=> select * from mz_materializations;
global_id | worker
-----------+--------
u2 | 0
u2 | 1
u2 | 2
u2 | 3
(4 rows)
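To make the effect concrete, here is a rough sketch that reproduces the shape of that output in an in-memory SQLite table and shows how selecting distinct global_id values collapses the per-worker rows. The SQLite table is only a stand-in built from the four rows above; it is not the real mz_materializations relation, and DISTINCT is just one possible way a test query could avoid counting each worker separately.

```python
import sqlite3

# Stand-in table (an assumption for illustration): mirrors the four rows
# shown above, one row per worker for the same global_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mz_materializations (global_id TEXT, worker INTEGER)")
conn.executemany(
    "INSERT INTO mz_materializations VALUES (?, ?)",
    [("u2", worker) for worker in range(4)],
)

# A query that reads the table directly sees one row per worker.
all_rows = conn.execute(
    "SELECT global_id, worker FROM mz_materializations ORDER BY worker"
).fetchall()

# Selecting distinct ids collapses the per-worker duplicates.
distinct_ids = conn.execute(
    "SELECT DISTINCT global_id FROM mz_materializations"
).fetchall()

print(len(all_rows))   # 4
print(distinct_ids)    # [('u2',)]
```

Any test query that joins against the table without collapsing the worker column would see each result repeated once per worker, which matches the "too many true results" symptom.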
There is an actual bug in upsert that I have a fix for. There is another bug in Debezium that I didn't yet fix.
And I think the "Cluster smoke test" might be failing because of this known bug/flake: https://github.com/MaterializeInc/materialize/issues/14533. But I'm not yet 100% sure.
Yes, I have a fix for the mz_materializations failures, so please disregard those for the time being.
When ready, please push only our fix and none of the changes needed to get the --workers 4 test running. I have a separate branch that achieves that but in a different way.
I pushed one fix in https://github.com/MaterializeInc/materialize/pull/14917. I'm afraid the tests that use envelope debezium (without upsert) are harder to fix. I think our current approach for ENVELOPE DEBEZIUM doesn't work with multiple workers because we don't maintain the order of messages that we read from Kafka. There are some decoding/exchange steps in between, but our logic somewhat relies on the order being preserved. I don't want to spend much more time on this because we don't offer DEBEZIUM (without UPSERT) to users yet.
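As a toy illustration of the ordering problem (not the actual storage code), the sketch below spreads a handful of made-up change events across four workers and then merges them in whatever order the workers happen to finish. The round-robin distribution, the event tuples, and the completion order are all assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical change events for a single key, in the order read from Kafka:
# (offset, operation).
events = [(0, "insert"), (1, "update"), (2, "update"), (3, "delete")]
NUM_WORKERS = 4


def exchange_round_robin(events, num_workers):
    """Distribute events across workers (an assumed distribution scheme)."""
    per_worker = defaultdict(list)
    for i, event in enumerate(events):
        per_worker[i % num_workers].append(event)
    return per_worker


def merge_by_completion(per_worker, completion_order):
    """Concatenate worker outputs in the order the workers finish."""
    merged = []
    for worker in completion_order:
        merged.extend(per_worker[worker])
    return merged


per_worker = exchange_round_robin(events, NUM_WORKERS)
# Workers are free to finish decoding in any order; pick one arbitrarily.
arrival = merge_by_completion(per_worker, completion_order=[3, 0, 2, 1])

print(arrival)
# [(3, 'delete'), (0, 'insert'), (2, 'update'), (1, 'update')]
```

Here the delete arrives before the insert it refers to, so any logic that assumes the original read order would misinterpret the stream.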