clickhouse-docs
clickhouse-docs copied to clipboard
Document all ClickHouse functions.
Problem
There are many functions in ClickHouse, but not all are documented. See https://github.com/ClickHouse/clickhouse-docs/issues/1682 https://github.com/ClickHouse/clickhouse-docs/issues/1512 as examples. This issue serves as the main holding-issue for many child/sub-issues. Essentially, if you want to know which functions have recently had documentation added, you should treat this issue as the source-of-truth.
Solution
Document 100% of the functions in ClickHouse.
Process
- List all the functions that exist in ClickHouse.
- List all the functions that have docs.
- Create a list of functions that don't have docs.
- Loop through this new list to write a description and syntax example for each function.
- To get this info, check the tests that each function was written with.
- Check in with the author of each function to get a background on why it was created.
- Repeat step 4 until all the functions have docs.
Scope
List each function, what it does, and a syntax example.
Example
Here is a markdown example of what a function should look like the following (a copy of this is available in the clickhouse/clickhouse repository at /docs/en/sql-reference/functions/_function-template.md):
## Example
This function serves as an example of what function documentation should look like.
**Syntax**
```sql
Example(input, secondary_input)
```
**Parameters**
- `input`: This is an example input. [String literal](../syntax#syntax-string-literal)
- `secondary_input`: This is another example input. [Expression](../syntax#syntax-expressions)
**Returned value**
This function doesn't exists. But if it did, they it would return a bool.
**Implementation details**
Any additional information pertaining to this function that users might need to know about. Things like known-bugs or edge cases would fit into this section. For example, if a function used _bankers rounding_ instead of standard rounding.
**Example**
Query:
```sql
CREATE TABLE example_table
(
id UUID,
name String,
) ENGINE = MergeTree
ORDER BY id;
INSERT INTO example_table VALUES (generateUUIDv4(), Example());
SELECT * FROM example_table;
```
```response
9e22db8c-d343-4bc5-8a95-501f635c0fc4 Example
```
Things to note:
- The arguments must end in a valid type and link to that type within the docs.
- The example query must be valid and complete; a user should be able to copy and paste the query directly into fiddle.clickhouse.com and have the query run without any modifications.
- The example result must be similar to what the user can expect to see when running it in fiddle.clickhouse.com. Some functions, such as
generateUUIDv4(), will return a random string; as such, you shouldn't try to have your example response match what the user might see.
Out of scope
We will not:
- Create a fancy new way of searching through functions.
- Expand on or update existing function docs.
- Create in-depth guides or tutorials for each function.
- Add docs directly into clickhouse/clickhouse code.
- Showcase specific functions.
These items are valid points to improve the docs, but they're entirely separate projects that require planning and setup of their own.
Timeline
This project should be completed in Q1 2024. So, by the end of March 2024.
After going through all the available ClickHouse functions, and searching through the docs for appropriate documentation, I've concluded that we have 449 functions without adequate documentation. See this Google Sheet. Check the Info tab for details on how the existing docs were judged.
449 function is a lot, especially given the size of the docs team. Without significant crowd-sourcing methods, it's unlikely that we'll get all of these missing functions documented by the end of March 2024.
Here's a complete list of the missing functions:
- [x] FROM_UNIXTIME
See #1919.
- [x] JSON_QUERY
- [x] JSON_VALUE
See #1920.
- [x] L2SquaredNorm
See #1921.
- [x] TimeDiff
See #1922.
- [x] UTC_timestamp
See #1923.
- [ ] VAR_POP
- [ ] VAR_SAMP
See #1924.
- [ ] YEAR
See #1925.
- [x] __bitBoolMaskAnd
- [x] __bitBoolMaskOr
- [x] __bitSwapLastTwo
- [x] __bitWrapperFunc
See #1926. Won't be documented, these functions are only for internal usage.
- [ ] __getScalar
See #1927.
- [ ] addTupleOfIntervals
See #1928.
- [x] aggThrow
See #1929.
- [x] analysisOfVariance
See #1930.
- [ ] and
See #1931.
- [x] anova
See #1932.
- [x] anyLast_respect_nulls
- [x] any_respect_nulls
- [x] any_value
- [x] any_value_respect_nulls
See #1933.
- [x] array
- [x] arrayDotProduct
- [x] arrayEnumerateDenseRanked
- [x] arrayEnumerateUniqRanked
- [x] arrayFirstOrNull
- [x] arrayFlatten
- [x] arrayLastOrNull
- [x] arrayPartialShuffle
- [x] arrayShuffle
See #1934.
- [ ] caseWithExpr
- [ ] caseWithExpression
- [ ] caseWithoutExpr
- [ ] caseWithoutExpression
See #1935.
- [x] connectionId
- [x] connection_id
See #1936.
- [x] corrMatrix
- [x] corrStable
See #1937.
- [x] countMatchesCaseInsensitive
- [x] countSubstringsCaseInsensitive
- [x] countSubstringsCaseInsensitiveUTF8
See #1938.
- [ ] covarPopMatrix
- [ ] covarPopStable
- [ ] covarSampMatrix
- [ ] covarSampStable
- [x] cramersVBiasCorrected
See #1940.
- [ ] currentSchemas
- [ ] current_schemas
See #1941.
- [x] cutToFirstSignificantSubdomainCustomRFC
- [x] cutToFirstSignificantSubdomainCustomWithWWWRFC
- [x] cutToFirstSignificantSubdomainRFC
- [x] cutToFirstSignificantSubdomainWithWWWRFC
See #1942.
- [ ] dense_rank
See #1943.
- [x] detectProgrammingLanguage
- [x] detectTonality
See #1944.
- [x] dictGet
- [x] dictGetDate
- [x] dictGetDateOrDefault
- [x] dictGetDateTime
- [x] dictGetDateTimeOrDefault
- [x] dictGetFloat32
- [x] dictGetFloat32OrDefault
- [x] dictGetFloat64
- [x] dictGetFloat64OrDefault
- [x] dictGetIPv4
- [x] dictGetIPv4OrDefault
- [x] dictGetIPv6
- [x] dictGetIPv6OrDefault
- [x] dictGetInt16
- [x] dictGetInt16OrDefault
- [x] dictGetInt32
- [x] dictGetInt32OrDefault
- [x] dictGetInt64
- [x] dictGetInt64OrDefault
- [x] dictGetInt8
- [x] dictGetInt8OrDefault
- [x] dictGetString
- [x] dictGetStringOrDefault
- [x] dictGetUInt16
- [x] dictGetUInt16OrDefault
- [x] dictGetUInt32
- [x] dictGetUInt32OrDefault
- [x] dictGetUInt64
- [x] dictGetUInt64OrDefault
- [x] dictGetUInt8
- [x] dictGetUInt8OrDefault
- [x] dictGetUUID
- [x] dictGetUUIDOrDefault
See #1918.
- [x] displayName
See #1945.
- [x] domainRFC
- [x] domainWithoutWWWRFC
See #1946.
- [x] e
See #1947.
- [x] emptyArrayFloat32
- [x] emptyArrayFloat64
- [x] emptyArrayInt16
- [x] emptyArrayInt32
- [x] emptyArrayInt64
- [x] emptyArrayInt8
- [x] emptyArrayString
- [x] emptyArrayToSingle
- [x] emptyArrayUInt16
- [x] emptyArrayUInt32
- [x] emptyArrayUInt64
- [x] emptyArrayUInt8
See #1948.
- [x] exp
- [x] exponentialTimeDecayedAvg
- [x] exponentialTimeDecayedCount
- [x] exponentialTimeDecayedMax
- [x] exponentialTimeDecayedSum
See #1949.
- [ ] extract
- [ ] extractGroups
See #1950.
- [ ] filesystemUnreserved
See #1951.
- [x] firstSignificantSubdomainCustomRFC
- [x] firstSignificantSubdomainRFC
- [x] first_value_respect_nulls
See #1953.
- [x] flameGraph
See #1954.
- [x] flattenTuple
See #1955.
- [x] formatQueryOrNull
- [x] formatQuerySingleLineOrNull
See #1956.
- [x] fromUnixTimestamp64Micro
- [x] fromUnixTimestamp64Milli
See #1957.
- [x] fullHostName
See #1958.
- [ ] getSubcolumn
- [ ] getTypeSerializationStreams
See #1959.
- [ ] globalIn
- [ ] globalInIgnoreSet
- [ ] globalNotIn
- [ ] globalNotInIgnoreSet
- [ ] globalNotNullIn
- [ ] globalNotNullInIgnoreSet
- [ ] globalNullIn
- [ ] globalNullInIgnoreSet
- [ ] globalVariable
See #1960.
- [ ] greater
See #1961.
- [x] hasColumnInTable
- [x] hasSubsequenceCaseInsensitive
- [x] hasSubsequenceCaseInsensitiveUTF8
- [x] hasSubsequenceUTF8
- [x] hasThreadFuzzer
- [x] hasToken
- [x] hasTokenCaseInsensitive
- [x] hasTokenCaseInsensitiveOrNull
- [x] hasTokenOrNull
See #1962.
- [ ] ignore
See #1963.
- [ ] in
- [ ] inIgnoreSet
See #1964.
- [ ] initcap
See #1965.
- [ ] intHash32
- [ ] intHash64
See #1966.
- [x] isNotDistinctFrom
- [x] isNullable
See #1967.
- [ ] joinGetOrNull
See #1968.
- [x] kostikConsistentHash
See #1969.
- [ ] kql_array_sort_asc
- [ ] kql_array_sort_desc
See #1970.
- [ ] lagInFrame
See #1971.
- [ ] last_value_respect_nulls
See #1972.
- [x] lcase
See #1973.
- [ ] leadInFrame
See #1974.
- [x] left
- [x] leftUTF8
See #1975.
- [x] lengthUTF8
See #1976.
- [ ] less
- [ ] lessOrEquals
See #1977.
- [x] lowCardinalityIndices
- [x] lowCardinalityKeys
See #1978.
- [x] makeDate32
- [x] makeDateTime64
See #1979.
- [ ] mapPartialReverseSort
- [ ] mapPartialSort
See #1980.
- [ ] materialize
See #1981.
- [ ] maxMappedArrays
- [ ] minMappedArrays
See #1982.
- [ ] moduloLegacy
See #1983.
- [x] mortonDecode
- [x] mortonEncode
See #1984.
- [x] multiSearchAllPositionsCaseInsensitive
- [x] multiSearchAllPositionsCaseInsensitiveUTF8
- [x] multiSearchAllPositionsUTF8
- [x] multiSearchAnyCaseInsensitive
- [x] multiSearchAnyCaseInsensitiveUTF8
- [x] multiSearchAnyUTF8
- [x] multiSearchFirstIndexCaseInsensitive
- [x] multiSearchFirstIndexCaseInsensitiveUTF8
- [x] multiSearchFirstIndexUTF8
- [x] multiSearchFirstPositionCaseInsensitive
- [x] multiSearchFirstPositionCaseInsensitiveUTF8
- [x] multiSearchFirstPositionUTF8
See #1985.
- [x] ngramDistanceCaseInsensitive
- [x] ngramDistanceCaseInsensitiveUTF8
- [x] ngramDistanceUTF8
- [x] ngramSearchCaseInsensitive
- [x] ngramSearchCaseInsensitiveUTF8
- [x] ngramSearchUTF8
See #1986.
- [ ] nonNegativeDerivative
See #1987.
- [x] normL2Squared
See #1988.
- [ ] normalizeQueryKeepNames
- [ ] normalizedQueryHashKeepNames
See #1989.
- [ ] notInIgnoreSet
- [ ] notNullIn
- [ ] notNullInIgnoreSet
See #1990.
- [ ] nth_value
See #1991.
- [ ] nullIn
- [ ] nullInIgnoreSet
See #1992.
- [ ] or
See #1993.
- [ ] partitionId
See #1994.
- [x] portRFC
See #1995.
- [x] proportionsZTest
See #1996.
- [x] protocol
See #1997.
- [x] rand
- [x] rand32
- [x] rand64
- [x] randCanonical
See #1998.
- [ ] rank
See #1999.
- [x] readWKTPoint
- [x] readWKTRing
See #2000.
- [x] regionHierarchy
- [x] regionIn
- [x] regionToArea
- [x] regionToCity
- [x] regionToContinent
- [x] regionToCountry
- [x] regionToDistrict
- [x] regionToName
- [x] regionToPopulation
- [x] regionToTopContinent
See #2001.
- [x] reinterpretAsDate
- [x] reinterpretAsDateTime
- [x] reinterpretAsFixedString
- [x] reinterpretAsFloat32
- [x] reinterpretAsFloat64
- [x] reinterpretAsInt128
- [x] reinterpretAsInt16
- [x] reinterpretAsInt256
- [x] reinterpretAsInt32
- [x] reinterpretAsInt64
- [x] reinterpretAsInt8
- [x] reinterpretAsString
- [x] reinterpretAsUInt128
- [x] reinterpretAsUInt16
- [x] reinterpretAsUInt256
- [x] reinterpretAsUInt32
- [x] reinterpretAsUInt64
- [x] reinterpretAsUInt8
- [x] reinterpretAsUUID
See #2002.
- [x] reverseDNSQuery
See #2003.
- [x] revision
See #2004.
- [x] right
- [x] rightUTF8
See #2005.
- [x] roundAge
- [x] roundDown
- [x] roundDuration
- [x] roundToExp2
See #2006.
- [x] rowNumberInAllBlocks
- [x] rowNumberInBlock
See #2007.
- [ ] row_number
See #2008.
- [x] showCertificate
See #2009.
- [x] sigmoid
See #2010.
- [x] simpleJSONExtractBool
- [x] simpleJSONExtractFloat
- [x] simpleJSONExtractInt
- [x] simpleJSONExtractRaw
- [x] simpleJSONExtractString
- [x] simpleJSONExtractUInt
- [x] simpleJSONHas
See #2011.
- [x] sin
See #2012.
- [x] singleValueOrNull
See #2013.
- [x] sleep
- [x] sleepEachRow
See #2014.
- [x] stddevPop
- [x] stddevPopStable
- [x] stddevSamp
- [x] stddevSampStable
See #2015.
- [x] substringIndexUTF8
- [x] substringUTF8
See #2016.
- [x] subtractDays
- [x] subtractHours
- [x] subtractInterval
- [x] subtractMicroseconds
- [x] subtractMilliseconds
- [x] subtractMinutes
- [x] subtractMonths
- [x] subtractNanoseconds
- [x] subtractQuarters
- [x] subtractSeconds
- [x] subtractTupleOfIntervals
- [x] subtractWeeks
- [x] subtractYears
See #2017.
- [x] sum
- [x] sumMapFiltered
- [x] sumMapFilteredWithOverflow
- [x] sumMapWithOverflow
- [x] sumWithOverflow
See #2018.
- [x] svg
See #2019.
- [x] tanh
See #2020.
- [x] timeSlot
- [x] timeSlots
See #2021.
- [ ] toBool
- [ ] toDateTime64OrDefault
- [ ] toDateTime64OrNull
- [ ] toDateTime64OrZero
- [ ] toDecimal128
- [ ] toDecimal128OrDefault
- [ ] toDecimal128OrNull
- [ ] toDecimal128OrZero
- [ ] toDecimal256
- [ ] toDecimal256OrDefault
- [ ] toDecimal256OrNull
- [ ] toDecimal256OrZero
- [ ] toDecimal32
- [ ] toDecimal32OrDefault
- [ ] toDecimal32OrNull
- [ ] toDecimal32OrZero
- [ ] toDecimal64
- [ ] toDecimal64OrDefault
- [ ] toDecimal64OrNull
- [ ] toDecimal64OrZero
- [ ] toDecimalString
- [ ] toFixedString
- [ ] toFloat32
- [ ] toFloat32OrDefault
- [ ] toFloat32OrNull
- [ ] toFloat32OrZero
- [ ] toFloat64
- [ ] toFloat64OrDefault
- [ ] toFloat64OrNull
- [ ] toFloat64OrZero
- [ ] toIPv4OrDefault
- [ ] toIPv4OrNull
- [ ] toIPv4OrZero
- [ ] toIPv6OrDefault
- [ ] toIPv6OrNull
- [ ] toIPv6OrZero
- [ ] toISOWeek
- [ ] toISOYear
- [ ] toInt128
- [ ] toInt128OrDefault
- [ ] toInt128OrNull
- [ ] toInt128OrZero
- [ ] toInt16
- [ ] toInt16OrDefault
- [ ] toInt16OrNull
- [ ] toInt16OrZero
- [ ] toInt256
- [ ] toInt256OrDefault
- [ ] toInt256OrNull
- [ ] toInt256OrZero
- [ ] toInt32
- [ ] toInt32OrDefault
- [ ] toInt32OrNull
- [ ] toInt32OrZero
- [ ] toInt64
- [ ] toInt64OrDefault
- [ ] toInt64OrNull
- [ ] toInt64OrZero
- [ ] toInt8
- [ ] toInt8OrDefault
- [ ] toInt8OrNull
- [ ] toInt8OrZero
- [ ] toIntervalDay
- [ ] toIntervalHour
- [ ] toIntervalMicrosecond
- [ ] toIntervalMillisecond
- [ ] toIntervalMinute
- [ ] toIntervalMonth
- [ ] toIntervalNanosecond
- [ ] toIntervalQuarter
- [ ] toIntervalSecond
- [ ] toIntervalWeek
- [ ] toIntervalYear
- [ ] toRelativeDayNum
- [ ] toRelativeHourNum
- [ ] toRelativeMinuteNum
- [ ] toRelativeMonthNum
- [ ] toRelativeQuarterNum
- [ ] toRelativeSecondNum
- [ ] toRelativeWeekNum
- [ ] toRelativeYearNum
- [ ] toStartOfMicrosecond
- [ ] toStartOfMillisecond
- [ ] toStartOfNanosecond
- [ ] toTime
- [ ] toTimezone
- [ ] toTypeName
- [ ] toUInt128
- [ ] toUInt128OrNull
- [ ] toUInt128OrZero
- [ ] toUInt16
- [ ] toUInt16OrDefault
- [ ] toUInt16OrNull
- [ ] toUInt16OrZero
- [ ] toUInt256
- [ ] toUInt256OrDefault
- [ ] toUInt256OrNull
- [ ] toUInt256OrZero
- [ ] toUInt32
- [ ] toUInt32OrDefault
- [ ] toUInt32OrNull
- [ ] toUInt32OrZero
- [ ] toUInt64
- [ ] toUInt64OrDefault
- [ ] toUInt64OrNull
- [ ] toUInt64OrZero
- [ ] toUInt8
- [ ] toUInt8OrDefault
- [ ] toUInt8OrNull
- [ ] toUInt8OrZero
- [ ] toUnixTimestamp64Micro
- [ ] toUnixTimestamp64Milli
See #2022.
- [x] today
See #2023.
- [x] topLevelDomainRFC
See #2024.
- [x] transactionID
- [x] transactionLatestSnapshot
- [x] transactionOldestSnapshot
See #2025.
- [x] translateUTF8
See #2026.
- [x] trunc
- [x] truncate
See #2027.
- [x] tryBase58Decode
- [x] tryBase64Decode
See #2028.
- [ ] tumbleEnd
- [ ] tumbleStart
See #2029.
- [x] tupleIntDiv
- [x] tupleIntDivByNumber
- [x] tupleIntDivOrZero
- [x] tupleIntDivOrZeroByNumber
- [x] tupleModulo
- [x] tupleModuloByNumber
See #2030.
- [x] ucase
See #2031.
- [x] uniqCombined64
See #2032.
- [x] upper
- [x] upperUTF8
See #2033.
- [x] uptime
See #2034.
- [x] validateNestedArraySizes
See #2035.
- [x] varPopStable
- [x] varSampStable
See #2036.
- [x] version
See #2037.
- [x] visitParamExtractBool
- [x] visitParamExtractFloat
- [x] visitParamExtractInt
- [x] visitParamExtractRaw
- [x] visitParamExtractString
- [x] visitParamExtractUInt
- [x] visitParamHas
See #2038.
- [ ] windowID
See #2039.
- [x] wkt
See #2040.
- [x] wyHash64
See #2041.
- [x] yandexConsistentHash
See #2042.
@johnnymatthews I'd like to help out with this in between working on some code related contributions. Feel free to assign tasks to me, or alternatively I can just start from the top of the list.
Hey @Blargian, sorry for the delay in getting back to you. I've just seen a few comment and PRs from yourself, so thanks for that! If you're still interested in helping out, just keep plugging away at the issues. If you'd like specific issues to be assigned to yourself, just leave a comment in the issue saying something like "I'll work on this one" -- I'll assign it to you then :)