clickhouse-docs icon indicating copy to clipboard operation
clickhouse-docs copied to clipboard

Document `toXYZ` functions.

Open johnnymatthews opened this issue 1 year ago • 5 comments

Create documentation for the toXYZ functions. See #1833 for details on how to complete this issue.

  • [ ] toBool
  • [ ] toDateTime64OrDefault
  • [ ] toDateTime64OrNull
  • [ ] toDateTime64OrZero
  • [ ] toDecimal128
  • [ ] toDecimal128OrDefault
  • [ ] toDecimal128OrNull
  • [ ] toDecimal128OrZero
  • [ ] toDecimal256
  • [ ] toDecimal256OrDefault
  • [ ] toDecimal256OrNull
  • [ ] toDecimal256OrZero
  • [ ] toDecimal32
  • [ ] toDecimal32OrDefault
  • [ ] toDecimal32OrNull
  • [ ] toDecimal32OrZero
  • [ ] toDecimal64
  • [ ] toDecimal64OrDefault
  • [ ] toDecimal64OrNull
  • [ ] toDecimal64OrZero
  • [ ] toDecimalString
  • [ ] toFixedString
  • [ ] toFloat32
  • [ ] toFloat32OrDefault
  • [ ] toFloat32OrNull
  • [ ] toFloat32OrZero
  • [ ] toFloat64
  • [ ] toFloat64OrDefault
  • [ ] toFloat64OrNull
  • [ ] toFloat64OrZero
  • [ ] toIPv4OrDefault
  • [ ] toIPv4OrNull
  • [ ] toIPv4OrZero
  • [ ] toIPv6OrDefault
  • [ ] toIPv6OrNull
  • [ ] toIPv6OrZero
  • [ ] toISOWeek
  • [ ] toISOYear
  • [ ] toInt128
  • [ ] toInt128OrDefault
  • [ ] toInt128OrNull
  • [ ] toInt128OrZero
  • [ ] toInt16
  • [ ] toInt16OrDefault
  • [ ] toInt16OrNull
  • [ ] toInt16OrZero
  • [ ] toInt256
  • [ ] toInt256OrDefault
  • [ ] toInt256OrNull
  • [ ] toInt256OrZero
  • [ ] toInt32
  • [ ] toInt32OrDefault
  • [ ] toInt32OrNull
  • [ ] toInt32OrZero
  • [ ] toInt64
  • [ ] toInt64OrDefault
  • [ ] toInt64OrNull
  • [ ] toInt64OrZero
  • [ ] toInt8
  • [ ] toInt8OrDefault
  • [ ] toInt8OrNull
  • [ ] toInt8OrZero
  • [ ] toIntervalDay
  • [ ] toIntervalHour
  • [ ] toIntervalMicrosecond
  • [ ] toIntervalMillisecond
  • [ ] toIntervalMinute
  • [ ] toIntervalMonth
  • [ ] toIntervalNanosecond
  • [ ] toIntervalQuarter
  • [ ] toIntervalSecond
  • [ ] toIntervalWeek
  • [ ] toIntervalYear
  • [ ] toRelativeDayNum
  • [ ] toRelativeHourNum
  • [ ] toRelativeMinuteNum
  • [ ] toRelativeMonthNum
  • [ ] toRelativeQuarterNum
  • [ ] toRelativeSecondNum
  • [ ] toRelativeWeekNum
  • [ ] toRelativeYearNum
  • [ ] toStartOfMicrosecond
  • [ ] toStartOfMillisecond
  • [ ] toStartOfNanosecond
  • [ ] toTime
  • [ ] toTimezone
  • [ ] toTypeName
  • [ ] toUInt128
  • [ ] toUInt128OrNull
  • [ ] toUInt128OrZero
  • [ ] toUInt16
  • [ ] toUInt16OrDefault
  • [ ] toUInt16OrNull
  • [ ] toUInt16OrZero
  • [ ] toUInt256
  • [ ] toUInt256OrDefault
  • [ ] toUInt256OrNull
  • [ ] toUInt256OrZero
  • [ ] toUInt32
  • [ ] toUInt32OrDefault
  • [ ] toUInt32OrNull
  • [ ] toUInt32OrZero
  • [ ] toUInt64
  • [ ] toUInt64OrDefault
  • [ ] toUInt64OrNull
  • [ ] toUInt64OrZero
  • [ ] toUInt8
  • [ ] toUInt8OrDefault
  • [ ] toUInt8OrNull
  • [ ] toUInt8OrZero
  • [ ] toUnixTimestamp64Micro
  • [ ] toUnixTimestamp64Milli

Child issue of #1833.

johnnymatthews avatar Feb 11 '24 20:02 johnnymatthews

  • [x] toBool
  • [x] toDateTime64OrDefault
  • [x] toDateTime64OrNull
  • [x] toDateTime64OrZero
  • [x] toDecimal128
  • [x] toDecimal128OrDefault
  • [x] toDecimal128OrNull
  • [x] toDecimal128OrZero
  • [x] toDecimal256
  • [x] toDecimal256OrDefault
  • [x] toDecimal256OrNull
  • [x] toDecimal256OrZero
  • [x] toDecimal32
  • [x] toDecimal32OrDefault
  • [x] toDecimal32OrNull
  • [x] toDecimal32OrZero
  • [x] toDecimal64
  • [x] toDecimal64OrDefault
  • [x] toDecimal64OrNull
  • [x] toDecimal64OrZero
  • [ ] toDecimalString
  • [ ] toFixedString
  • [x] toFloat32
  • [x] toFloat32OrDefault
  • [x] toFloat32OrNull
  • [x] toFloat32OrZero
  • [x] toFloat64
  • [x] toFloat64OrDefault
  • [x] toFloat64OrNull
  • [x] toFloat64OrZero
  • [ ] toIPv4OrDefault
  • [ ] toIPv4OrNull
  • [ ] toIPv4OrZero
  • [ ] toIPv6OrDefault
  • [ ] toIPv6OrNull
  • [ ] toIPv6OrZero
  • [ ] toISOWeek
  • [ ] toISOYear
  • [x] toInt128
  • [x] toInt128OrDefault
  • [x] toInt128OrNull
  • [x] toInt128OrZero
  • [X] toInt16
  • [X] toInt16OrDefault
  • [X] toInt16OrNull
  • [X] toInt16OrZero
  • [x] toInt256
  • [x] toInt256OrDefault
  • [x] toInt256OrNull
  • [x] toInt256OrZero
  • [X] toInt32
  • [X] toInt32OrDefault
  • [X] toInt32OrNull
  • [X] toInt32OrZero
  • [x] toInt64
  • [x] toInt64OrDefault
  • [x] toInt64OrNull
  • [x] toInt64OrZero
  • [X] toInt8
  • [X] toInt8OrDefault
  • [X] toInt8OrNull
  • [X] toInt8OrZero
  • [x] toIntervalDay
  • [x] toIntervalHour
  • [x] toIntervalMicrosecond
  • [x] toIntervalMillisecond
  • [x] toIntervalMinute
  • [x] toIntervalMonth
  • [x] toIntervalNanosecond
  • [x] toIntervalQuarter
  • [x] toIntervalSecond
  • [x] toIntervalWeek
  • [x] toIntervalYear
  • [x] toRelativeDayNum
  • [x] toRelativeHourNum
  • [x] toRelativeMinuteNum
  • [x] toRelativeMonthNum
  • [x] toRelativeQuarterNum
  • [x] toRelativeSecondNum
  • [x] toRelativeWeekNum
  • [x] toRelativeYearNum
  • [X] toStartOfMicrosecond
  • [X] toStartOfMillisecond
  • [X] toStartOfNanosecond
  • [x] toTime
  • [ ] toTimezone
  • [ ] toTypeName
  • [x] toUInt128
  • [x] toUInt128OrNull
  • [x] toUInt128OrZero
  • [x] toUInt16
  • [x] toUInt16OrDefault
  • [x] toUInt16OrNull
  • [x] toUInt16OrZero
  • [x] toUInt256
  • [x] toUInt256OrDefault
  • [x] toUInt256OrNull
  • [x] toUInt256OrZero
  • [x] toUInt32
  • [x] toUInt32OrDefault
  • [x] toUInt32OrNull
  • [x] toUInt32OrZero
  • [x] toUInt64
  • [x] toUInt64OrDefault
  • [x] toUInt64OrNull
  • [x] toUInt64OrZero
  • [x] toUInt8
  • [x] toUInt8OrDefault
  • [x] toUInt8OrNull
  • [x] toUInt8OrZero
  • [X] toUnixTimestamp64Micro
  • [X] toUnixTimestamp64Milli

Blargian avatar May 27 '24 15:05 Blargian

@rschu1ze In the docs for type conversion functions we have toUInt(8|16|32|64|256), toUInt(8|16|32|64|256) and other variants where all the functions are grouped together. We also have reinterpretAs(U)Int* functions which are not grouped but rather listed individually : reInterpretUInt8, reInterpretUInt16, reInterpretUInt32... etc.

Should I split up the grouped functions to individual listings?

On the one hand it makes the Type Conversion Functions more verbose as there is a lot of copy-pasted information, but on the other hand I think this is maybe not the worst thing for documentation. In my opinion it looks neater, makes it easier to find what you are looking for and makes the docs more complete if there are examples for each function rather than just one generic example for each group.

I gather that there are differing opinions on this between the docs team and the core team based on the discussion here.

Blargian avatar Jul 17 '24 06:07 Blargian

@Blargian That is one of the questions where one can ask 10 people and get 10 different answers :-) My personal opinion is that we should not group 8|16|32|64|128|256 functions but list them individually BUT still try to create a strong cohesion by linking related functions in the "See also" sub-section. For example, toUInt16 could be linked

  • to toUInt16OrNull, toUInt16OrDefault, toUInt16OrZero (which covers all Or* function variants)
  • to toUInt8, toUInt32, toUInt64, toUInt128, toUInt256` (which covers the other data types)

rschu1ze avatar Jul 17 '24 08:07 rschu1ze

@rschu1ze I'm working now on documenting toRelative*Num functions and finding these a little strange. According to the current documentation these functions convert a date or date with time to the number of years, months, days etc from a fixed reference point in time. This is, at least from what I can see for types:

  • Date, DateTime : 1970-01-01
  • DateTime64 : 1900-01-01

Based on this:

SELECT toRelativeYearNum(toDate('1969-01-01'));
SELECT toRelativeYearNum(toDateTime('1969-01-01'));
SELECT toRelativeYearNum(toDateTime64('1899-01-01',6));

1970
1970
1900

However the behaviour is not as expected. See for instance this

SELECT toRelativeYearNum(toDate('1969-01-01'));
SELECT toRelativeYearNum(toDate('1970-02-01'));
SELECT toRelativeMonthNum(toDate('1970-02-01'));
SELECT toRelativeDayNum(toDate('1970-02-01'));

1970
1970
23642
31

I would expect rather

1970
1970
1
31

SELECT toRelativeYearNum(toDate('2024-01-01')); also unexpected - returns 2024 and not 54?

Blargian avatar Aug 22 '24 11:08 Blargian

Current docs for toRelative* functions say [...], starting from a certain fixed point in the past., they don't specify the exact starting point. And this is intentionally so, the purpose of the toRelative* functions is not to calculate the distance (in years, quarters, ..., seconds) of a single date or date with time to the starting point (for that purpose, users can simply subtract two dates), but to "normalize" two dates or dates with time relative to the same starting point such that their normalized differences can be compared.

In the reworked docs, we should keep the exact starting point for each function unspecified to give the database some leeway to change it in future.

Oh, in case you wonder, SELECT toRelativeMonthNum(toDate('1970-02-01')); is 23642 because "month" is calculated relative to 0000-00-00 00:00:00. Don't ask me why this is, it was implemented that way before I started here.

rschu1ze avatar Aug 22 '24 13:08 rschu1ze