clickhouse-docs
Improve documentation of formats
The following areas of improvement for formats are identified:
- [x] All formats are on a single page - lots of clutter. They should be separated out similar to how we have functions in the SQL reference. → see #73717
- [x] Each format should contain the same basic information such as Description, Data Types Matching (if relevant), Example Usage, Format Settings. Format Settings should ideally be autogenerated.
- [ ] Many formats are missing full documentation, e.g. CustomSeparatedIgnoreSpacesWithNames, CustomSeparatedIgnoreSpacesWithNamesAndTypes
- [ ] Not all formats have example usage, quality and depth of examples varies considerably. It would be amazing to have each format have a few good examples using the example datasets.
- [ ] Examples should ideally cover the same information such as how to insert data with the format, select data and how any relevant important settings work.
- [ ] These are reference docs, we should maybe consider moving them to the SQL reference section and replace the page above with a simplified overview page that links to more detailed reference details.
Formats identified as incomplete (either missing a description or an example, or both) that need attention first:
- [ ] ArrowStream
- [ ] CSVWithNames
- [ ] CSVWithNamesAndTypes
- [x] CustomSeparatedIgnoreSpaces
- [x] CustomSeparatedIgnoreSpacesWithNames
- [x] CustomSeparatedIgnoreSpacesWithNamesAndTypes
- [x] CustomSeparatedWithNames
- [x] CustomSeparatedWithNamesAndTypes
- [ ] HiveText
- [x] JSONCompactEachRowWithNames
- [x] JSONCompactEachRowWithNamesAndTypes
- [x] JSONCompactStrings
- [x] JSONCompactStringsEachRowWithNamesAndTypes
- [x] JSONLines
- [ ] LineAsStringWithNames
- [ ] LineAsStringWithNamesAndTypes
- [ ] MySQLWire
- [ ] NDJSON
- [ ] Native
- [x] Null
- [ ] ODBCDriver2
- [ ] PostgreSQLWire
- [ ] PrettyCompact
- [ ] PrettyCompactMonoBlock
- [ ] PrettyCompactNoEscapes
- [ ] PrettyCompactNoEscapesMonoBlock
- [ ] PrettyJSONLines
- [ ] PrettyMonoBlock
- [ ] PrettyNDJSON
- [ ] PrettyNoEscapesMonoBlock
- [ ] PrettySpace
- [ ] PrettySpaceMonoBlock
- [ ] PrettySpaceNoEscapes
- [ ] PrettySpaceNoEscapesMonoBlock
- [x] ProtobufSingle
- [ ] Raw
- [ ] RawWithNames
- [ ] RawWithNamesAndTypes
- [ ] RowBinary
- [ ] RowBinaryWithNames
- [ ] RowBinaryWithNamesAndTypes
- [ ] TSVRaw
- [ ] TSVRawWithNames
- [ ] TSVRawWithNamesAndTypes
- [ ] TSVWithNames
- [ ] TSVWithNamesAndTypes
- [x] TabSeparatedRaw
- [x] TabSeparatedRawWithNames
- [x] TabSeparatedRawWithNamesAndTypes
- [x] TabSeparatedWithNames
- [x] TabSeparatedWithNamesAndTypes
- [ ] Values
- [ ] XML
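
The "incomplete" criterion used for the list above (missing a description, an example, or both) could in principle be checked mechanically. The following is only a minimal sketch: it assumes each format page is a Markdown file whose sections are introduced by `#`-style headings, and the section names in `REQUIRED_SECTIONS` as well as the function name `missing_sections` are hypothetical, taken from the checklist wording rather than from the actual docs layout.

```python
import re

# Hypothetical section names, taken from the checklist in this issue;
# the real heading text used in the docs may differ.
REQUIRED_SECTIONS = ("Description", "Example usage", "Format settings")


def missing_sections(markdown_text, required=REQUIRED_SECTIONS):
    """Return the required section names that have no matching heading."""
    # Collect the titles of all ATX headings (# to ######), lowercased
    # so the comparison is case-insensitive.
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown_text, re.MULTILINE
        )
    }
    return [name for name in required if name.lower() not in headings]


# A made-up page body, used only to illustrate the check.
page = """# CSVWithNames

## Description

Some text.

## Format settings
"""

print(missing_sections(page))
```

Running such a check over every format page would give a starting point for keeping the list current, although it says nothing about the quality or depth of the examples that are present.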
@Blargian I'm not sure a left menu item per format is sustainable, and it's going to produce a very long scroll. Does each format really need its own page? Could we have a page per subtype, e.g. JSON or TSV?
CC @gingerwizard
Working through usage examples
CSV https://github.com/ClickHouse/ClickHouse/pull/81530
Grouping others into a single PR https://github.com/ClickHouse/ClickHouse/pull/81539