Separate concerns by adding a `dialect.type` field

Open khusmann opened this issue 1 year ago • 0 comments

Our current approach in Table Dialect is to mix in new properties for each new format. This means properties like delimiter can be set alongside sheetName, which doesn't make sense. It's going to get more unwieldy & potentially contradictory the more features of different formats we support.

To separate concerns & make validation easier / better defined, I would suggest we add a dialect.type field that lets us distinguish delimited, sql, workbook, and structured dialects. Then, like field types, this would form a discriminated union and dictate which other properties may be present in the dialect:

When "type" = "delimited", we could set delimiter, lineTerminator, etc.

When "type" = "sql", we could set table, etc.

When "type" = "workbook", we could set sheetNumber, sheetName, etc.

New formats could be added via new dialect types, in the same way new fields are added via new field types.

This would really help with declarative parsing systems (like pydantic, zod, etc.) by making illegal states unrepresentable.

It would also be 100% backwards compatible if we made "type" = "delimited" the default when type is unset.
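
To make the idea concrete, here is a rough sketch of how a parser could treat the dialect as a discriminated union, using only the Python standard library. The class names, the `parse_dialect` helper, and the default values are hypothetical and only for illustration; the property names come from the list above, and the "structured" type is left out because no properties are listed for it here.

```python
from dataclasses import dataclass, fields
from typing import Optional, Union


# One class per dialect type. Each class only carries the properties
# that make sense for that type. Default values are placeholders.
@dataclass(frozen=True)
class DelimitedDialect:
    delimiter: str = ","
    lineTerminator: str = "\r\n"


@dataclass(frozen=True)
class SqlDialect:
    table: Optional[str] = None


@dataclass(frozen=True)
class WorkbookDialect:
    sheetNumber: Optional[int] = None
    sheetName: Optional[str] = None


Dialect = Union[DelimitedDialect, SqlDialect, WorkbookDialect]

# Maps the value of the "type" discriminator to its class.
DIALECT_TYPES = {
    "delimited": DelimitedDialect,
    "sql": SqlDialect,
    "workbook": WorkbookDialect,
}


def parse_dialect(raw: dict) -> Dialect:
    """Hypothetical helper: build a typed dialect from a plain dict."""
    props = dict(raw)
    # A missing "type" falls back to "delimited", so existing
    # descriptors without a type keep working.
    type_name = props.pop("type", "delimited")
    if type_name not in DIALECT_TYPES:
        raise ValueError(f"unknown dialect type: {type_name!r}")
    cls = DIALECT_TYPES[type_name]
    allowed = {f.name for f in fields(cls)}
    unexpected = sorted(set(props) - allowed)
    if unexpected:
        # e.g. sheetName on a delimited dialect is rejected here
        raise ValueError(
            f"properties {unexpected} are not valid for dialect type {type_name!r}"
        )
    return cls(**props)
```

With this shape, `parse_dialect({"delimiter": ";"})` yields a `DelimitedDialect`, while `parse_dialect({"type": "delimited", "sheetName": "Sheet1"})` raises a `ValueError`. Libraries like pydantic or zod would express the same thing with their own tagged-union support.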

khusmann avatar Apr 17 '24 17:04 khusmann