dolt
dolt copied to clipboard
Three Way Merging can result in an incorrect column ordering
Steps to reproduce:
git checkout nicktobey/schemamergetests
go test libraries/doltcore/merge/schema_merge_test.go -run "^\QTestSchemaMerge\E$/^\Qcolumn_add\E$/^\Qdrop_tests\E$/^\Qmerge_right_to_left\E$/^\Qleft_side_column_add_with_additional_column_after\E$/^\Qleft_side_adds_column_and_assigns_non-null_value,_extra_column_has_data_change_on_right\E$"
Output:
--- FAIL: TestSchemaMerge (0.00s)
--- FAIL: TestSchemaMerge/column_add/drop_tests (0.00s)
--- FAIL: TestSchemaMerge/column_add/drop_tests/merge_right_to_left (0.00s)
--- FAIL: TestSchemaMerge/column_add/drop_tests/merge_right_to_left/left_side_column_add_with_additional_column_after (0.00s)
--- FAIL: TestSchemaMerge/column_add/drop_tests/merge_right_to_left/left_side_column_add_with_additional_column_after/left_side_adds_column_and_assigns_non-null_value,_extra_column_has_data_change_on_right (0.00s)
schema_merge_test.go:934:
Error Trace: /Users/nick/Documents/dolt/go/libraries/doltcore/merge/schema_merge_test.go:934
/Users/nick/Documents/dolt/go/libraries/doltcore/merge/schema_merge_test.go:956
Error: Not equal:
expected: hash.Hash{0x49, 0xba, 0x43, 0xbc, 0x38, 0x51, 0x96, 0x39, 0x9a, 0x1b, 0xa6, 0x80, 0x5b, 0x95, 0xc0, 0xcb, 0x77, 0x44, 0x82, 0x1f}
actual : hash.Hash{0x20, 0x74, 0xe, 0xd4, 0x6f, 0xfe, 0xb6, 0x77, 0x6a, 0x17, 0xa7, 0x3f, 0xbf, 0x15, 0x28, 0xf3, 0x6e, 0xe, 0xb, 0xd6}
Diff:
--- Expected
+++ Actual
@@ -1,4 +1,4 @@
(hash.Hash) (len=20) {
- 00000000 49 ba 43 bc 38 51 96 39 9a 1b a6 80 5b 95 c0 cb |I.C.8Q.9....[...|
- 00000010 77 44 82 1f |wD..|
+ 00000000 20 74 0e d4 6f fe b6 77 6a 17 a7 3f bf 15 28 f3 | t..o..wj..?..(.|
+ 00000010 6e 0e 0b d6 |n...|
}
Test: TestSchemaMerge/column_add/drop_tests/merge_right_to_left/left_side_column_add_with_additional_column_after/left_side_adds_column_and_assigns_non-null_value,_extra_column_has_data_change_on_right
schema_merge_test.go:937: expected rows:
{ key: 01000000 value: 02000000, 04000000 }
schema_merge_test.go:938: actual rows:
{ key: 01000000 value: 04000000, 02000000 }
What I suspect is going on:
IIUC, when we perform a three-way merge while there's a schema change, we generate a new tuple description for the merged table by taking the columns from the left side, removing any that were dropped by the schema change, and appending to the end any that were adding by the right side. This means that the resulting tuple description does not necessarily match the merged schema in regards to the ordering of the columns.
However, I suspect that some part of the logic assumes that the order of columns in the merged tuple matches the merged schema, which can cause the values to get swapped around in the output.
I don't know whether this order is what's ultimately written to disk or whether it gets fixed up later in the run.
I can confirm that our schema merge code hasn't ever respected ordering of columns – new columns are added at the end of the merged schema. There are some TODOs in merge_schema.go
around the column merging logic. Nothing goes back and fixes the ordering up. The same order is used to write the table to disk.