Temporal expressions parity with PySpark
Missing temporal functions (a sketch of the PySpark side follows the list):
- [ ] add_months
- [ ] convert_timezone
- [ ] curdate
- [ ] current_date
- [ ] current_timestamp
- [ ] current_timezone
- [ ] date_add
- [ ] date_diff
- [ ] date_format
- [ ] date_from_unix_date
- [ ] date_part
- [ ] date_sub
- [ ] date_trunc
- [ ] dateadd
- [ ] datediff
- [ ] datepart
- [ ] dayofmonth
- [x] dayofyear
- [ ] extract
- [ ] from_unixtime
- [ ] from_utc_timestamp
- [ ] last_day
- [ ] localtimestamp
- [ ] make_date
- [ ] make_dt_interval
- [ ] make_interval
- [ ] make_timestamp
- [ ] make_timestamp_ltz
- [ ] make_timestamp_ntz
- [ ] make_ym_interval
- [ ] months_between
- [ ] next_day
- [ ] now
- [ ] quarter
- [ ] session_window
- [ ] timestamp_micros
- [ ] timestamp_millis
- [ ] timestamp_seconds
- [ ] to_timestamp_ltz
- [ ] to_timestamp_ntz
- [x] to_unix_timestamp
- [ ] to_utc_timestamp
- [ ] trunc
- [ ] try_to_timestamp
- [ ] unix_date
- [ ] unix_micros
- [ ] unix_millis
- [ ] unix_seconds
- [ ] unix_timestamp
- [ ] weekday
- [ ] weekofyear
- [ ] window
- [ ] window_time
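For reference, a hedged sketch of the PySpark side of this parity list, next to the Daft spelling for one item that is already checked off. The PySpark calls below are real `pyspark.sql.functions`; the DataFrames `sdf`/`df` and the column `ts` are placeholders for illustration, and the Daft method name should be verified against your Daft version.

```python
# PySpark: a few of the functions tracked above (real API).
from pyspark.sql import functions as F

out = (
    sdf  # assumed existing Spark DataFrame with a timestamp column "ts"
    .withColumn("today", F.current_date())
    .withColumn("now", F.current_timestamp())
    .withColumn("plus_3m", F.add_months("ts", 3))
    .withColumn("fmt", F.date_format("ts", "yyyy-MM-dd"))
    .withColumn("q", F.quarter("ts"))
)

# Daft: dayofyear (checked off above) maps onto the existing Expression.dt
# namespace (method name assumed from current Daft docs).
from daft import col

df = df.with_column("doy", col("ts").dt.day_of_year())
```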
Additional Context
See https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/functions.html#datetime-functions
@universalmind303 I love it. I think that's also the way to bring Spark features into Daft step by step and make it more attractive.
For me, current_timestamp is a very common one.
Currently I'm coding it like this instead of having a nice clean one-liner:

```python
import datetime as dt

import daft

# df is an existing Daft DataFrame; stamp every row with the driver-side clock.
current_timestamp = dt.datetime.now()
df = df.with_column("DateCol", daft.lit(current_timestamp))
```
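For comparison, the clean one-liner on the PySpark side, plus a hypothetical Daft spelling (the Daft line is a sketch of what parity could look like, not an existing API). Note that the `daft.lit(dt.datetime.now())` workaround bakes in the driver's clock at plan-construction time, whereas PySpark's `current_timestamp()` is evaluated when the query runs.

```python
# PySpark today (real pyspark.sql.functions API):
from pyspark.sql import functions as F

sdf = sdf.withColumn("DateCol", F.current_timestamp())

# Hypothetical Daft equivalent once this issue is implemented; the name
# and namespace below are assumptions, not a shipped Daft API:
# df = df.with_column("DateCol", daft.functions.current_timestamp())
```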