elasticsearch icon indicating copy to clipboard operation
elasticsearch copied to clipboard

ESQL: Functions!

Open nik9000 opened this issue 2 years ago • 7 comments

Description

Description

This is a list of "shovel ready" functions. It's functions we are fairly sure we want and we should be able to start working on them now. This list is not sorted at all, partly because making any one of these functions shouldn't be a huge effort so the cost of having a sorted list is comparatively high.

This list is not sacred. If you need a function to do something, stick it on the list. Maybe even build it yourself, it's fun!

NULL

  • [x] COALESCE #98542

IP

  • [ ] AUTO_BUCKET for IPs
  • [ ] MASK

Math

  • [x] PI
  • [x] TAU
  • [x] ACOS
  • [x] ASIN
  • [x] ATAN
  • [x] ATAN2
  • [x] AUTO_BUCKET for numbers
  • [x] cube root - java calls this CBRT @ivancea
  • [x] CEIL @alex-spies
  • [ ] COPY_SIGN
  • [x] COS
  • [x] COSH
  • [ ] EXP
  • [x] FLOOR
  • [x] GREATEST(a, b, c) - max of columns, not an agg #98630
  • [ ] HYPOT
  • [x] LEAST(a, b, c) - min of columns, not an agg #98630
  • [ ] LOG - base e
  • [x] LOG - base 10
  • [ ] LOG - any base
  • [ ] SCALB
  • [x] SIGNUM - https://en.wikipedia.org/wiki/Sign_function @ioanatia https://github.com/elastic/elasticsearch/pull/106866
  • [x] SIN
  • [x] SINH
  • [x] square root - java calls this SQRT #98449
  • [ ] SUM(a, b, c) - sum of columns, not an agg
  • [x] TAN
  • [x] TANH
  • [x] RADIANS_TO_DEGREES (to_degrees)
  • [x] DEGREES_TO_RADIANS (to_radians)

String

  • [ ] ASCII
  • [ ] BIT_LENGTH
  • [ ] CHAR - presumably we'd want to emit unicode code points. Not sure how we'd validate or if we have to. It'd just utf-8 encode the code point and stick it a BytesRef
  • [ ] INSERT
  • [ ] TO_LOWERCASE - SQL calls this LCASE. There are locale issues with this, but I presume we'd use the ROOT locale.
  • [x] LEFT(foo, len) - basically an alias for SUBSTRING(foo, 0, len)
  • [x] #106818 @tteofili
  • [x] LTRIM #98590
  • [ ] BYTES_LENGTH - Number of bytes in the string as encoded in utf-8. SQL calls this OCTET_LENGTH. Not sure this is useful for people though.
  • [ ] POSITION
  • [ ] REPEAT
  • [ ] REPLACE
  • [x] RIGHT - See LEFT
  • [x] RTRIM #98590
  • [ ] SPACE
  • [x] TRIM
  • [ ] UCASE - Locales will be fun here

Date

  • [ ] DATE_ADD/DATEADD/TIMESTAMP_ADD/TIMESTAMPADD
  • [ ] DATE_DIFF/DATEDIFF/TIMESTAMP_DIFF/TIMESTAMPDIFF
  • [ ] DATE_PART/DATEPART
  • [ ] DAY_OF_MONTH/DOM/DAY
  • [ ] DAY_OF_WEEK/DAYOFWEEK/DOW
  • [ ] DAY_OF_YEAR/DOY
  • [ ] DAY_NAME/DAYNAME
  • [x] EXTRACT
  • [ ] HOUR_OF_DAY/HOUR
  • [ ] ISO_DAY_OF_WEEK/ISODAYOFWEEK/ISODOW/IDOW
  • [ ] ISO_WEEK_OF_YEAR/ISOWEEKOFYEAR/ISOWEEK/IWOY/IW
  • [ ] MINUTE_OF_DAY
  • [ ] MINUTE_OF_HOUR/MINUTE
  • [ ] MONTH_OF_YEAR/MONTH
  • [ ] MONTH_NAME/MONTHNAME
  • [ ] SECOND_OF_MINUTE/SECOND
  • [ ] QUARTER
  • [ ] WEEK_OF_YEAR/WEEK
  • [ ] YEAR

Encode/Decode

  • [x] BASE64_DECODE/_ENCODE #107135

Secure Hashing

  • [ ] SHA_0, SHA_1, SHA_256, SHA_512, SHAKE_128, SHAKE_256
  • [ ] MD5
  • [ ] GENERIC SECURE_HASH() by relying on the underlying MessageDigest.getInstance

nik9000 avatar Aug 16 '23 16:08 nik9000

Pinging @elastic/es-ql (Team:QL)

elasticsearchmachine avatar Aug 16 '23 17:08 elasticsearchmachine

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

elasticsearchmachine avatar Aug 16 '23 17:08 elasticsearchmachine

I wrote up a guide to making new functions about six months ago that's gone stale. I'll try and build another guide for it on Monday.

nik9000 avatar Aug 19 '23 14:08 nik9000

@nik9000 I had wrote partial logic of 'left' function. Now I wonder how do i write the manual document about it. Furthermore, the variable named 'length' should be optional or mandatory This is my pull request: https://github.com/elastic/elasticsearch/pull/98660

dreamquster avatar Aug 21 '23 09:08 dreamquster

Here are some instructions for adding functions.

nik9000 avatar Aug 21 '23 11:08 nik9000

Pinging @elastic/es-analytics-geo (Team:Analytics)

elasticsearchmachine avatar Jan 02 '24 19:01 elasticsearchmachine

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine avatar Feb 01 '24 05:02 elasticsearchmachine

With DATE_EXTRACT being available I wonder if it still make sense to support many of the functions listed under the DATE section here.

ioanatia avatar Jun 27 '24 11:06 ioanatia

With DATE_EXTRACT being available I wonder if it still make sense to support many of the functions listed under the DATE section here.

Yeah, I think we probably can just zap them. At most they can be aliases to DATE_EXTRACT. But they aren't nearly as important.

@costin, what do you think of just crossing them out of hte list?

nik9000 avatar Jun 27 '24 12:06 nik9000

Does TYPEOF make sense on this list (comparable)? Union types can come up ~when using CASE (e.g. var0 = CASE(boolField, 1, "foo")). Also,~ potentially when using multiple indices with different types.

drewdaemon avatar Oct 15 '24 19:10 drewdaemon

I think TYPEOF makes a lot of sense with union types. I'd probably want to develop it with some union type work that uses it. And have a look to see if what we need lines up exactly with the sqllite function. If, say, we decide we should have different handling for null somehow we might not want to name it this. Not that we'd make that decision lightly - conforming to what folks are used to is quite an advantage. But I imagine a world where TYPEOF is always evaluated at query planning time on the data node.

nik9000 avatar Oct 21 '24 17:10 nik9000