dask-sql
dask-sql copied to clipboard
[ENH] Support UNIX_$timeUnit functions
Related to #412
I'm trying to do math on timestamps, but ran into #411 and #412.
The Calcite docs suggest
UNIX_SECONDS(timestamp) | Returns the number of seconds since 1970-01-01 00:00:00
But that fails to parse:
import pandas as pd
from dask_sql import Context
c = Context()
df = pd.DataFrame({'dt0': ['2022-03-01 12:00:00'], 'dt1': ['2022-03-01 13:00:00']})
df['dt0'] = df['dt0'].astype('datetime64[s]')
df['dt1'] = df['dt1'].astype('datetime64[s]')
c.create_table('test_dt', df)
c.sql("SELECT UNIX_SECONDS(dt0) FROM test_dt").compute()
---------------------------------------------------------------------------
ParsingException Traceback (most recent call last)
Input In [123], in <module>
----> 1 c.sql("select UNIX_SECONDS(dt0) FROM test_dt").compute()
File ~/conda/envs/dsql-2-28/lib/python3.9/site-packages/dask_sql/context.py:458, in Context.sql(self, sql, return_futures, dataframes, gpu)
455 for df_name, df in dataframes.items():
456 self.create_table(df_name, df, gpu=gpu)
--> 458 rel, select_names, _ = self._get_ral(sql)
460 dc = RelConverter.convert(rel, context=self)
462 if dc is None:
File ~/conda/envs/dsql-2-28/lib/python3.9/site-packages/dask_sql/context.py:892, in Context._get_ral(self, sql)
887 logger.debug(f"Original exception raised by Java:\n {e}")
888 # We do not want to re-raise an exception here
889 # as this would print the full java stack trace
890 # if debug is not set.
891 # Instead, we raise a nice exception
--> 892 raise ParsingException(sql, str(e.message())) from None
894 # Internal, temporary results of calcite are sometimes
895 # named EXPR$N (with N a number), which is not very helpful
896 # to the user. We replace these cases therefore with
(...)
899 # but so far I did not find such a case.
900 # So please raise an issue if you have found one!
901 if sqlNodeClass == "org.apache.calcite.sql.SqlOrderBy":
ParsingException: Can not parse the given SQL: From line 1, column 8 to line 1, column 24: No match found for function signature UNIX_SECONDS(<TIMESTAMP>)
The problem is probably somewhere here:
select UNIX_SECONDS(dt0) FROM test_dt
^^^^^^^^^^^^^^^^^
cc @ayushdg , @jdye64