agate
agate copied to clipboard
best way to convert Date cols to Text after loading?
problem statement
I'm making a plugin to a python package that uses agate (namely: https://github.com/dbt-labs/dbt). The maintainers have made it to easy to plug in functionality after the csv has been loaded into memory as as an agate table, but harder to override the loading behavior.
My goal is to clearly and simply convert Date columns of an agate Table to Text.
date_orders.csv
id,user_id,order,date
1,1,2018-01-01,2018-01-05
2,3,2018-01-02,2018-01-05
3,94,2018-01-04,2018-01-07
current best approach
here's the best way I've thought of to make the Date columns into Text type so far. Is there a cleaner solution?
import agate
import os
fname='data/date_orders.csv'
orders = agate.Table.from_csv(fname)
orders.print_structure()
| column | data_type |
| ------- | --------- |
| id | Text |
| user_id | Text |
| order | Date |
| date | Date |
def cast_col_to_text(col_name):
return agate.Formula(agate.Text(), lambda r: r[col_name])
out = [(col.name, cast_col_to_text(col.name)) for col in orders.columns if 'Date' in str(col.data_type)]
orders = orders.compute(out,replace=True)
orders.print_structure()
| column | data_type |
| ------- | --------- |
| id | Text |
| user_id | Text |
| order | Text |
| date | Text |
easiest way
this way is obviously easiest, but harder for me to make happen
tester = agate.TypeTester(types=[agate.Text()])
orders_txt = agate.Table.from_csv(fname, column_types=tester)
orders_txt.print_structure(),orders_txt.print_table()
| column | data_type |
| ------- | --------- |
| id | Text |
| user_id | Text |
| order | Text |
| date | Text |