clickhouse-docs

Improve documentation of formats

Open Blargian opened this issue 11 months ago • 2 comments

The following areas of improvement for formats are identified:

  • [X] All formats are on a single page, which creates a lot of clutter. They should be separated out, similar to how functions are organized in the SQL reference. → see #73717

  • [X] Each format page should contain the same basic sections, such as Description, Data Types Matching (if relevant), Example Usage, and Format Settings. Format Settings should ideally be autogenerated.

  • [ ] Many formats are missing full documentation, e.g. CustomSeparatedIgnoreSpacesWithNames and CustomSeparatedIgnoreSpacesWithNamesAndTypes.

  • [ ] Not all formats have example usage, and the quality and depth of the examples vary considerably. Ideally, each format would have a few good examples that use the example datasets.

  • [ ] Examples should ideally cover the same ground for every format: how to insert data with the format, how to select data with it, and how any important related settings work.

  • [ ] These are reference docs, so we should consider moving them to the SQL reference section and replacing the page above with a simplified overview page that links to the detailed reference pages.
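
As a rough illustration of the "same basic information" item above, a consistency check could be sketched along these lines. This is only a sketch under assumptions: it supposes each format page is Markdown with one heading per section, and the function name `missing_sections` and the sample page are hypothetical, not part of the docs tooling. Data Types Matching is left out of the required list because the checklist marks it "if relevant".

```python
import re

# Section names come from the checklist item above. The Markdown heading
# layout ("## Description", ...) is an assumption made for illustration.
REQUIRED_SECTIONS = ["Description", "Example Usage", "Format Settings"]


def missing_sections(page_text: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required section titles that have no matching heading."""
    # Collect every Markdown heading title, ignoring heading level and case.
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+?)\s*$", page_text, re.MULTILINE)
    }
    return [name for name in required if name.lower() not in headings]


# Hypothetical page content, not copied from the real docs.
sample_page = """\
# CSVWithNames

## Description

Placeholder description.

## Format Settings
"""

print(missing_sections(sample_page))
```

Run over every format page, a check like this would list the pages that still lack an Example Usage section.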

Blargian Dec 22 '24 23:12

Formats identified as incomplete (either missing a description or an example, or both) that need attention first:

  • [ ] ArrowStream
  • [ ] CSVWithNames
  • [ ] CSVWithNamesAndTypes
  • [x] CustomSeparatedIgnoreSpaces
  • [x] CustomSeparatedIgnoreSpacesWithNames
  • [x] CustomSeparatedIgnoreSpacesWithNamesAndTypes
  • [x] CustomSeparatedWithNames
  • [x] CustomSeparatedWithNamesAndTypes
  • [ ] HiveText
  • [x] JSONCompactEachRowWithNames
  • [x] JSONCompactEachRowWithNamesAndTypes
  • [x] JSONCompactStrings
  • [x] JSONCompactStringsEachRowWithNamesAndTypes
  • [x] JSONLines
  • [ ] LineAsStringWithNames
  • [ ] LineAsStringWithNamesAndTypes
  • [ ] MySQLWire
  • [ ] NDJSON
  • [ ] Native
  • [x] Null
  • [ ] ODBCDriver2
  • [ ] PostgreSQLWire
  • [ ] PrettyCompact
  • [ ] PrettyCompactMonoBlock
  • [ ] PrettyCompactNoEscapes
  • [ ] PrettyCompactNoEscapesMonoBlock
  • [ ] PrettyJSONLines
  • [ ] PrettyMonoBlock
  • [ ] PrettyNDJSON
  • [ ] PrettyNoEscapesMonoBlock
  • [ ] PrettySpace
  • [ ] PrettySpaceMonoBlock
  • [ ] PrettySpaceNoEscapes
  • [ ] PrettySpaceNoEscapesMonoBlock
  • [x] ProtobufSingle
  • [ ] Raw
  • [ ] RawWithNames
  • [ ] RawWithNamesAndTypes
  • [ ] RowBinary
  • [ ] RowBinaryWithNames
  • [ ] RowBinaryWithNamesAndTypes
  • [ ] TSVRaw
  • [ ] TSVRawWithNames
  • [ ] TSVRawWithNamesAndTypes
  • [ ] TSVWithNames
  • [ ] TSVWithNamesAndTypes
  • [x] TabSeparatedRaw
  • [x] TabSeparatedRawWithNames
  • [x] TabSeparatedRawWithNamesAndTypes
  • [x] TabSeparatedWithNames
  • [x] TabSeparatedWithNamesAndTypes
  • [ ] Values
  • [ ] XML

Blargian Dec 23 '24 12:12

@Blargian I'm not sure a left menu item per format is sustainable, and it's going to produce a very long scroll. Does each format really need its own page? Could we have a page per subtype, e.g. JSON or TSV?
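
To get a feel for how much a page-per-subtype layout would shorten the menu, the grouping could be sketched as below. The prefix list is a hypothetical choice for illustration only, and the grouping is purely by name prefix. It does not reflect how the docs are actually organized or which formats are aliases of one another.

```python
from collections import defaultdict

# Hypothetical family prefixes, chosen for illustration only.
FAMILY_PREFIXES = [
    "CustomSeparated", "TabSeparated", "RowBinary", "Pretty", "JSON", "CSV", "TSV",
]


def group_by_family(format_names):
    """Map each family prefix to the format names that start with it."""
    groups = defaultdict(list)
    for name in format_names:
        # The first matching prefix wins; unmatched names go under "Other".
        family = next((p for p in FAMILY_PREFIXES if name.startswith(p)), "Other")
        groups[family].append(name)
    return dict(groups)


# A few names taken from the checklist earlier in this thread.
names = [
    "CSVWithNames", "CSVWithNamesAndTypes", "JSONLines", "JSONCompactStrings",
    "TSVRaw", "TSVWithNames", "PrettySpace", "Native",
]

for family, members in group_by_family(names).items():
    print(f"{family}: {len(members)} format(s)")
```

With these sample names, eight per-format menu entries would collapse into five family pages.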

gingerwizard Dec 24 '24 14:12

CC @gingerwizard

Blargian Apr 11 '25 14:04

Working through usage examples

CSV https://github.com/ClickHouse/ClickHouse/pull/81530

sdairs Jun 09 '25 11:06

Grouping others into a single PR https://github.com/ClickHouse/ClickHouse/pull/81539

sdairs Jun 09 '25 13:06