elasticsearch
elasticsearch copied to clipboard
ESQL: Functions!
Description
Description
This is a list of "shovel ready" functions. It's functions we are fairly sure we want and we should be able to start working on them now. This list is not sorted at all, partly because making any one of these functions shouldn't be a huge effort so the cost of having a sorted list is comparatively high.
This list is not sacred. If you need a function to do something, stick it on the list. Maybe even build it yourself, it's fun!
NULL
- [x] COALESCE #98542
IP
- [ ] AUTO_BUCKET for IPs
- [ ] MASK
Math
- [x] PI
- [x] TAU
- [x] ACOS
- [x] ASIN
- [x] ATAN
- [x] ATAN2
- [x] AUTO_BUCKET for numbers
- [x] cube root - java calls this CBRT @ivancea
- [x] CEIL @alex-spies
- [ ] COPY_SIGN
- [x] COS
- [x] COSH
- [ ] EXP
- [x] FLOOR
- [x] GREATEST(a, b, c) - max of columns, not an agg #98630
- [ ] HYPOT
- [x] LEAST(a, b, c) - min of columns, not an agg #98630
- [ ] LOG - base e
- [x] LOG - base 10
- [ ] LOG - any base
- [ ] SCALB
- [x] SIGNUM - https://en.wikipedia.org/wiki/Sign_function @ioanatia https://github.com/elastic/elasticsearch/pull/106866
- [x] SIN
- [x] SINH
- [x] square root - java calls this SQRT #98449
- [ ] SUM(a, b, c) - sum of columns, not an agg
- [x] TAN
- [x] TANH
- [x] RADIANS_TO_DEGREES (
to_degrees) - [x] DEGREES_TO_RADIANS (
to_radians)
String
- [ ] ASCII
- [ ] BIT_LENGTH
- [ ] CHAR - presumably we'd want to emit unicode code points. Not sure how we'd validate or if we have to. It'd just utf-8 encode the code point and stick it a BytesRef
- [ ] INSERT
- [ ] TO_LOWERCASE - SQL calls this LCASE. There are locale issues with this, but I presume we'd use the
ROOTlocale. - [x] LEFT(foo, len) - basically an alias for SUBSTRING(foo, 0, len)
- [x] #106818 @tteofili
- [x] LTRIM #98590
- [ ] BYTES_LENGTH - Number of bytes in the string as encoded in utf-8. SQL calls this OCTET_LENGTH. Not sure this is useful for people though.
- [ ] POSITION
- [ ] REPEAT
- [ ] REPLACE
- [x] RIGHT - See LEFT
- [x] RTRIM #98590
- [ ] SPACE
- [x] TRIM
- [ ] UCASE - Locales will be fun here
Date
- [ ] DATE_ADD/DATEADD/TIMESTAMP_ADD/TIMESTAMPADD
- [ ] DATE_DIFF/DATEDIFF/TIMESTAMP_DIFF/TIMESTAMPDIFF
- [ ] DATE_PART/DATEPART
- [ ] DAY_OF_MONTH/DOM/DAY
- [ ] DAY_OF_WEEK/DAYOFWEEK/DOW
- [ ] DAY_OF_YEAR/DOY
- [ ] DAY_NAME/DAYNAME
- [x] EXTRACT
- [ ] HOUR_OF_DAY/HOUR
- [ ] ISO_DAY_OF_WEEK/ISODAYOFWEEK/ISODOW/IDOW
- [ ] ISO_WEEK_OF_YEAR/ISOWEEKOFYEAR/ISOWEEK/IWOY/IW
- [ ] MINUTE_OF_DAY
- [ ] MINUTE_OF_HOUR/MINUTE
- [ ] MONTH_OF_YEAR/MONTH
- [ ] MONTH_NAME/MONTHNAME
- [ ] SECOND_OF_MINUTE/SECOND
- [ ] QUARTER
- [ ] WEEK_OF_YEAR/WEEK
- [ ] YEAR
Encode/Decode
- [x] BASE64_DECODE/_ENCODE #107135
Secure Hashing
- [ ] SHA_0, SHA_1, SHA_256, SHA_512, SHAKE_128, SHAKE_256
- [ ] MD5
- [ ] GENERIC SECURE_HASH(
) by relying on the underlying MessageDigest.getInstance
Pinging @elastic/es-ql (Team:QL)
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)
I wrote up a guide to making new functions about six months ago that's gone stale. I'll try and build another guide for it on Monday.
@nik9000 I had wrote partial logic of 'left' function. Now I wonder how do i write the manual document about it. Furthermore, the variable named 'length' should be optional or mandatory This is my pull request: https://github.com/elastic/elasticsearch/pull/98660
Here are some instructions for adding functions.
Pinging @elastic/es-analytics-geo (Team:Analytics)
Pinging @elastic/es-analytical-engine (Team:Analytics)
With DATE_EXTRACT being available I wonder if it still make sense to support many of the functions listed under the DATE section here.
With
DATE_EXTRACTbeing available I wonder if it still make sense to support many of the functions listed under theDATEsection here.
Yeah, I think we probably can just zap them. At most they can be aliases to DATE_EXTRACT. But they aren't nearly as important.
@costin, what do you think of just crossing them out of hte list?
Does TYPEOF make sense on this list (comparable)? Union types can come up ~when using CASE (e.g. var0 = CASE(boolField, 1, "foo")). Also,~ potentially when using multiple indices with different types.
I think TYPEOF makes a lot of sense with union types. I'd probably want to develop it with some union type work that uses it. And have a look to see if what we need lines up exactly with the sqllite function. If, say, we decide we should have different handling for null somehow we might not want to name it this. Not that we'd make that decision lightly - conforming to what folks are used to is quite an advantage. But I imagine a world where TYPEOF is always evaluated at query planning time on the data node.