duckdb-extension-clickhouse-sql
duckdb-extension-clickhouse-sql copied to clipboard
DuckDB Community Extension implementing ClickHouse SQL Dialect macros for DuckDB
DuckDB ClickHouse SQL extension
The DuckDB chsql community extension implements popular ClickHouse SQL syntax macros and functions,
making it easier for users to transition between the two database systems ⭐ designed for Quackpipe
Installation
chsql is distributed as a DuckDB Community Extension and can be installed using SQL:
INSTALL chsql FROM community;
LOAD chsql;
If you previously installed the chsql
extension, upgrade using the FORCE command
FORCE INSTALL chsql FROM community;
LOAD chsql;
Usage Examples
Once installed, the macro functions provided by the extension can be used just like built-in functions.
Here's a random example out of 100s using the IPv4StringToNum
and IPv4NumToString
functions:
D INSTALL chsql FROM community;
D LOAD chsql;
D SELECT IPv4StringToNum('127.0.0.1'), IPv4NumToString(2130706433);
┌──────────────────────────────┬─────────────────────────────┐
│ ipv4stringtonum('127.0.0.1') │ ipv4numtostring(2130706433) │
│ int32 │ varchar │
├──────────────────────────────┼─────────────────────────────┤
│ 2130706433 │ 127.0.0.1 │
└──────────────────────────────┴─────────────────────────────┘
Remote Queries
The built-in ch_scan
function can be used to query remote ClickHouse servers using the HTTP/s API
D SELECT * FROM ch_scan("SELECT number * 2 FROM numbers(10)", "https://play.clickhouse.com");
Supported Functions
👉 The list of supported aliases is available on the dedicated extension page
👉 The combined list of supported functions can be obtained using an SQL Join
Motivation
Why is the DuckDB + chsql combo fun and useful
✔ DuckDB SQL is awesome and full of great functions.
✔ ClickHouse SQL is awesome and full of great functions.
✔ The DuckDB library is ~51M and modular. Can LOAD extensions.
❌ The ClickHouse monolith is ~551M and growing. No extensions.
✔ DuckDB is open source and protected by a no-profit foundation.
❌ ClickHouse is open core and controlled by for-profit corporation.
✔ DuckDB embedded is fast, mature and elegantly integrated in many languages.
❌ chdb is still experimental, unstable and currently only supports Python.
Functions
function | fun_type | description | comment | example |
---|---|---|---|---|
IPv4NumToString | macro | Cast IPv4 address from numeric to string format | SELECT IPv4NumToString(2130706433); | |
IPv4StringToNum | macro | Cast IPv4 address from string to numeric format | SELECT IPv4StringToNum('127.0.0.1'); | |
arrayExists | macro | Check if any element of the array satisfies the condition | SELECT arrayExists(x -> x = 1, [1, 2, 3]); | |
arrayJoin | macro | Unroll an array into multiple rows | SELECT arrayJoin([1, 2, 3]); | |
arrayMap | macro | Applies a function to each element of an array | SELECT arrayMap(x -> x + 1, [1, 2, 3]); | |
bitCount | macro | Counts the number of set bits in an integer | SELECT bitCount(15); | |
ch_scan | table_macro | Query a remote ClickHouse server using HTTP/s API | Returns the query results | SELECT * FROM ch_scan('SELECT version()','https://play.clickhouse.com', format := 'parquet'); |
domain | macro | Extracts the domain from a URL | SELECT domain('https://clickhouse.com/docs'); | |
empty | macro | Check if a string is empty | SELECT empty(''); | |
extractAllGroups | macro | Extracts all matching groups from a string using a regular expression | SELECT extractAllGroups('(\d+)', 'abc123'); | |
formatDateTime | macro | Formats a DateTime value into a string | SELECT formatDateTime(now(), '%Y-%m-%d'); | |
generateUUIDv4 | macro | Generate a UUID v4 value | SELECT generateUUIDv4(); | |
ifNull | macro | Returns the first argument if not NULL, otherwise the second | SELECT ifNull(NULL, 'default'); | |
intDiv | macro | Performs integer division | SELECT intDiv(10, 3); | |
intDivOZero | macro | Performs integer division but returns zero instead of throwing an error for division by zero | SELECT intDivOZero(10, 0); | |
intDivOrNull | macro | Performs integer division but returns NULL instead of throwing an error for division by zero | SELECT intDivOrNull(10, 0); | |
leftPad | macro | Pads a string on the left to a specified length | SELECT leftPad('abc', 5, '*'); | |
lengthUTF8 | macro | Returns the length of a string in UTF-8 characters | SELECT lengthUTF8('Привет'); | |
match | macro | Performs a regular expression match on a string | SELECT match('abc123', '\d+'); | |
minus | macro | Performs subtraction of two numbers | SELECT minus(5, 3); | |
modulo | macro | Calculates the remainder of division (modulus) | SELECT modulo(10, 3); | |
moduloOrZero | macro | Calculates modulus but returns zero instead of error on division by zero | SELECT moduloOrZero(10, 0); | |
notEmpty | macro | Check if a string is not empty | SELECT notEmpty('abc'); | |
numbers | table_macro | Generates a sequence of numbers starting from 0 | Returns a table with a single column (UInt64) | SELECT * FROM numbers(10); |
parseURL | macro | Extracts parts of a URL | SELECT parseURL('https://clickhouse.com', 'host'); | |
path | macro | Extracts the path from a URL | SELECT path('https://clickhouse.com/docs'); | |
plus | macro | Performs addition of two numbers | SELECT plus(5, 3); | |
protocol | macro | Extracts the protocol from a URL | SELECT protocol('https://clickhouse.com'); | |
read_parquet_mergetree | function | Merge parquet files using a primary sorting key for fast range queries | experimental | COPY (SELECT * FROM read_parquet_mergetree(['/folder/*.parquet'], 'sortkey') TO 'sorted.parquet'; |
rightPad | macro | Pads a string on the right to a specified length | SELECT rightPad('abc', 5, '*'); | |
splitByChar | macro | Splits a string by a given character | SELECT splitByChar(',', 'a,b,c'); | |
toDayOfMonth | macro | Extracts the day of the month from a date | SELECT toDayOfMonth('2023-09-10'); | |
toFixedString | macro | Converts a value to a fixed-length string | SELECT toFixedString('abc', 5); | |
toFloat | macro | Converts a value to a float | SELECT toFloat('123.45'); | |
toFloatOrNull | macro | Converts a value to float or returns NULL if the conversion fails | SELECT toFloatOrNull('abc'); | |
toFloatOrZero | macro | Converts a value to float or returns zero if the conversion fails | SELECT toFloatOrZero('abc'); | |
toHour | macro | Extracts the hour from a DateTime value | SELECT toHour(now()); | |
toInt128 | macro | Converts a value to a 128-bit integer | SELECT toInt128('123456789012345678901234567890'); | |
toInt128OrNull | macro | Converts to a 128-bit integer or returns NULL on failure | SELECT toInt128OrNull('abc'); | |
toInt128OrZero | macro | Converts to a 128-bit integer or returns zero on failure | SELECT toInt128OrZero('abc'); | |
toInt16 | macro | Converts a value to a 16-bit integer | SELECT toInt16('123'); | |
toInt16OrNull | macro | Converts to a 16-bit integer or returns NULL on failure | SELECT toInt16OrNull('abc'); | |
toInt16OrZero | macro | Converts to a 16-bit integer or returns zero on failure | SELECT toInt16OrZero('abc'); | |
toInt256 | macro | Converts a value to a 256-bit integer | SELECT toInt256('12345678901234567890123456789012345678901234567890123456789012345678901234567890'); | |
toInt256OrNull | macro | Converts to a 256-bit integer or returns NULL on failure | SELECT toInt256OrNull('abc'); | |
toInt256OrZero | macro | Converts to a 256-bit integer or returns zero on failure | SELECT toInt256OrZero('abc'); | |
toInt32 | macro | Converts a value to a 32-bit integer | SELECT toInt32('123'); | |
toInt32OrNull | macro | Converts to a 32-bit integer or returns NULL on failure | SELECT toInt32OrNull('abc'); | |
toInt32OrZero | macro | Converts to a 32-bit integer or returns zero on failure | SELECT toInt32OrZero('abc'); | |
toInt64 | macro | Converts a value to a 64-bit integer | SELECT toInt64('123'); | |
toInt64OrNull | macro | Converts to a 64-bit integer or returns NULL on failure | SELECT toInt64OrNull('abc'); | |
toInt64OrZero | macro | Converts to a 64-bit integer or returns zero on failure | SELECT toInt64OrZero('abc'); | |
toInt8 | macro | Converts a value to an 8-bit integer | SELECT toInt8('123'); | |
toInt8OrNull | macro | Converts to an 8-bit integer or returns NULL on failure | SELECT toInt8OrNull('abc'); | |
toInt8OrZero | macro | Converts to an 8-bit integer or returns zero on failure | SELECT toInt8OrZero('abc'); | |
toMinute | macro | Extracts the minute from a DateTime value | SELECT toMinute(now()); | |
toMonth | macro | Extracts the month from a Date value | SELECT toMonth('2023-09-10'); | |
toSecond | macro | Extracts the second from a DateTime value | SELECT toSecond(now()); | |
toString | macro | Converts a value to a string | SELECT toString(123); | |
toUInt16 | macro | Converts a value to an unsigned 16-bit integer | SELECT toUInt16('123'); | |
toUInt16OrNull | macro | Converts to an unsigned 16-bit integer or returns NULL on failure | SELECT toUInt16OrNull('abc'); | |
toUInt16OrZero | macro | Converts to an unsigned 16-bit integer or returns zero on failure | SELECT toUInt16OrZero('abc'); | |
toUInt32 | macro | Converts a value to an unsigned 32-bit integer | SELECT toUInt32('123'); | |
toUInt32OrNull | macro | Converts to an unsigned 32-bit integer or returns NULL on failure | SELECT toUInt32OrNull('abc'); | |
toUInt32OrZero | macro | Converts to an unsigned 32-bit integer or returns zero on failure | SELECT toUInt32OrZero('abc'); | |
toUInt64 | macro | Converts a value to an unsigned 64-bit integer | SELECT toUInt64('123'); | |
toUInt64OrNull | macro | Converts to an unsigned 64-bit integer or returns NULL on failure | SELECT toUInt64OrNull('abc'); | |
toUInt64OrZero | macro | Converts to an unsigned 64-bit integer or returns zero on failure | SELECT toUInt64OrZero('abc'); | |
toUInt8 | macro | Converts a value to an unsigned 8-bit integer | SELECT toUInt8('123'); | |
toUInt8OrNull | macro | Converts to an unsigned 8-bit integer or returns NULL on failure | SELECT toUInt8OrNull('abc'); | |
toUInt8OrZero | macro | Converts to an unsigned 8-bit integer or returns zero on failure | SELECT toUInt8OrZero('abc'); | |
toYYYYMM | macro | Formats a Date to 'YYYYMM' string format | SELECT toYYYYMM('2023-09-10'); | |
toYYYYMMDD | macro | Formats a Date to 'YYYYMMDD' string format | SELECT toYYYYMMDD('2023-09-10'); | |
toYYYYMMDDhhmmss | macro | Formats a DateTime to 'YYYYMMDDhhmmss' string format | SELECT toYYYYMMDDhhmmss(now()); | |
toYear | macro | Extracts the year from a Date or DateTime value | SELECT toYear('2023-09-10'); | |
topLevelDomain | macro | Extracts the top-level domain (TLD) from a URL | SELECT topLevelDomain('https://example.com'); | |
tupleConcat | macro | Concatenates two tuples into one tuple | SELECT tupleConcat((1, 'a'), (2, 'b')); | |
tupleDivide | macro | Performs element-wise division between two tuples | SELECT tupleDivide((10, 20), (2, 5)); | |
tupleDivideByNumber | macro | Divides each element of a tuple by a number | SELECT tupleDivideByNumber((10, 20), 2); | |
tupleIntDiv | macro | Performs element-wise integer division between two tuples | SELECT tupleIntDiv((10, 20), (3, 4)); | |
tupleIntDivByNumber | macro | Performs integer division of each element of a tuple by a number | SELECT tupleIntDivByNumber((10, 20), 3); | |
tupleMinus | macro | Performs element-wise subtraction between two tuples | SELECT tupleMinus((10, 20), (5, 3)); | |
tupleModulo | macro | Performs element-wise modulus between two tuples | SELECT tupleModulo((10, 20), (3, 6)); | |
tupleModuloByNumber | macro | Calculates the modulus of each element of a tuple by a number | SELECT tupleModuloByNumber((10, 20), 3); | |
tupleMultiply | macro | Performs element-wise multiplication between two tuples | SELECT tupleMultiply((10, 20), (2, 5)); | |
tupleMultiplyByNumber | macro | Multiplies each element of a tuple by a number | SELECT tupleMultiplyByNumber((10, 20), 3); | |
tuplePlus | macro | Performs element-wise addition between two tuples | SELECT tuplePlus((1, 2), (3, 4)); | |
url | table_macro | Performs queries against remote URLs using the specified format | Supports JSON, CSV, PARQUET, TEXT, BLOB | SELECT * FROM url('https://urleng.com/test','JSON'); |
JSONExtract | macro | Extracts JSON data based on key from a JSON object | SELECT JSONExtract(json_column, 'user.name'); | |
JSONExtractString | macro | Extracts JSON data as a VARCHAR from a JSON object | SELECT JSONExtractString(json_column, 'user.email'); | |
JSONExtractUInt | macro | Extracts JSON data as an unsigned integer from a JSON object | SELECT JSONExtractUInt(json_column, 'user.age'); | |
JSONExtractInt | macro | Extracts JSON data as a 32-bit integer from a JSON object | SELECT JSONExtractInt(json_column, 'user.balance'); | |
JSONExtractFloat | macro | Extracts JSON data as a double from a JSON object | SELECT JSONExtractFloat(json_column, 'user.score'); | |
JSONExtractRaw | macro | Extracts raw JSON data based on key from a JSON object | SELECT JSONExtractRaw(json_column, 'user.address'); | |
JSONHas | macro | Checks if a JSON key exists and is not null | SELECT JSONHas(json_column, 'user.active'); | |
JSONLength | macro | Returns the length of a JSON array | SELECT JSONLength(json_column, 'items'); | |
JSONType | macro | Determines the type of JSON element at the given path | SELECT JSONType(json_column, 'user.data'); | |
JSONExtractKeys | macro | Extracts keys from a JSON object | SELECT JSONExtractKeys(json_column); | |
JSONExtractValues | macro | Extracts all values as text from a JSON object | SELECT JSONExtractValues(json_column); | |
equals | macro | Checks if two values are equal | SELECT equals(column_a, column_b); | |
notEquals | macro | Checks if two values are not equal | SELECT notEquals(column_a, column_b); | |
less | macro | Checks if one value is less than another | SELECT less(column_a, column_b); | |
greater | macro | Checks if one value is greater than another | SELECT greater(column_a, column_b); | |
lessOrEquals | macro | Checks if one value is less than or equal to another | SELECT lessOrEquals(column_a, column_b); | |
greaterOrEquals | macro | Checks if one value is greater than or equal to another | SELECT greaterOrEquals(column_a, column_b); | |
dictGet | macro | Retrieves an attribute from a VARIABLE string or MAP | SELECT dictGet('dictionary_name', 'attribute'); |
Disclaimer
DuckDB ® is a trademark of DuckDB Foundation. ClickHouse® is a trademark of ClickHouse Inc. All trademarks, service marks, and logos mentioned or depicted are the property of their respective owners. The use of any third-party trademarks, brand names, product names, and company names is purely informative or intended as parody and does not imply endorsement, affiliation, or association with the respective owners.