
add LPAD and RPAD

Open refset opened this issue 2 months ago • 3 comments

For example, these functions are often used when formatting data for display:

SELECT relname table_name,
       lpad(to_char(reltuples, 'FM9,999,999,999'), 13) row_count
FROM pg_class
LEFT JOIN pg_namespace
    ON (pg_namespace.oid = pg_class.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND relkind = 'r'
ORDER BY reltuples DESC;

=>

              table_name               |   row_count
---------------------------------------+---------------
 trips                                 | 1,120,559,744
 uber_trips_2015                       |    14,270,497
 spatial_ref_sys                       |         3,911
 central_park_weather_observations_raw |         2,372
 nyct2010                              |         2,167
 uber_taxi_zone_lookups                |           265
 yellow_tripdata_staging               |             0
 cab_types                             |             0
 uber_trips_staging                    |             0
 green_tripdata_staging                |             0

(overview of approximate counts, from https://tech.marksblogg.com/billion-nyc-taxi-rides-redshift.html)

refset, Nov 08 '25 20:11

@refset what's happening with this one? Is it ready to review?

jarohen, Nov 28 '25 12:11

I'm happy with the logic itself, that the result is Postgres-compliant, and that the tests pass. The possible exception is NULL handling, which I've only noticed looking again now :thinking: and which probably needs testing too. I wasn't 100% sure whether the codegen pattern is idiomatic (or whether there's a better option), so a quick review would be great :pray:

It possibly also needs Unicode handling (or at least testing).
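
A minimal sketch of the edge cases mentioned above (NULL arguments, truncation, multi-character fill, Unicode), written in Python and not in XTDB's codegen. The `lpad`/`rpad` functions and the `_padding` helper are hypothetical names for illustration; the behaviour reflects my reading of PostgreSQL's semantics, not XTDB's implementation, and counting characters as code points is an assumption:

```python
from typing import Optional


def _padding(fill: str, missing: int) -> str:
    # Repeat the fill enough times, then cut it to exactly `missing` characters.
    return (fill * (missing // len(fill) + 1))[:missing]


def lpad(s: Optional[str], length: Optional[int], fill: Optional[str] = " ") -> Optional[str]:
    # Assumed strict behaviour: any NULL (None) argument yields NULL.
    if s is None or length is None or fill is None:
        return None
    if length <= 0:
        return ""
    if len(s) >= length:
        # Already long enough: keep only the leftmost `length` characters.
        return s[:length]
    if fill == "":
        # Nothing to pad with, so the input comes back unchanged.
        return s
    return _padding(fill, length - len(s)) + s


def rpad(s: Optional[str], length: Optional[int], fill: Optional[str] = " ") -> Optional[str]:
    # Same rules as lpad, with the padding appended instead of prepended.
    if s is None or length is None or fill is None:
        return None
    if length <= 0:
        return ""
    if len(s) >= length:
        return s[:length]
    if fill == "":
        return s
    return s + _padding(fill, length - len(s))
```

Note that `len` counts code points here, so a decomposed `"e\u0301"` occupies two positions even though it renders as one glyph. Whether that matches the target behaviour is exactly what the Unicode tests would need to pin down.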

refset, Nov 28 '25 16:11

I was thinking further about Unicode handling, and I just spotted that DataFusion has extensive SLT-format tests for its standard library: https://github.com/apache/datafusion/blob/36ec9f1de0aeabca60b8f7ebe07d650b8ef03506/datafusion/sqllogictest/test_files/string/string_query.slt.part#L1385

And for comparison, their vectorized(!) implementation: https://github.com/apache/datafusion/blob/36ec9f1de0aeabca60b8f7ebe07d650b8ef03506/datafusion/functions/src/unicode/lpad.rs

refset, Nov 28 '25 20:11