agate icon indicating copy to clipboard operation
agate copied to clipboard

best way to convert Date cols to Text after loading?

Open dataders opened this issue 4 years ago • 0 comments

problem statement

I'm making a plugin to a python package that uses agate (namely: https://github.com/dbt-labs/dbt). The maintainers have made it to easy to plug in functionality after the csv has been loaded into memory as as an agate table, but harder to override the loading behavior.

My goal is to clearly and simply convert Date columns of an agate Table to Text.

date_orders.csv

id,user_id,order,date
1,1,2018-01-01,2018-01-05
2,3,2018-01-02,2018-01-05
3,94,2018-01-04,2018-01-07

current best approach

here's the best way I've thought of to make the Date columns into Text type so far. Is there a cleaner solution?

import agate
import os

fname='data/date_orders.csv'

orders = agate.Table.from_csv(fname)
orders.print_structure()
| column  | data_type |
| ------- | --------- |
| id      | Text      |
| user_id | Text      |
| order   | Date      |
| date    | Date      |
def cast_col_to_text(col_name):
    return agate.Formula(agate.Text(), lambda r: r[col_name])

out = [(col.name, cast_col_to_text(col.name)) for col in orders.columns if 'Date' in str(col.data_type)]
orders = orders.compute(out,replace=True)

orders.print_structure()
| column  | data_type |
| ------- | --------- |
| id      | Text      |
| user_id | Text      |
| order   | Text      |
| date    | Text      |

easiest way

this way is obviously easiest, but harder for me to make happen

tester = agate.TypeTester(types=[agate.Text()])
orders_txt = agate.Table.from_csv(fname, column_types=tester)

orders_txt.print_structure(),orders_txt.print_table()
| column  | data_type |
| ------- | --------- |
| id      | Text      |
| user_id | Text      |
| order   | Text      |
| date    | Text      |

dataders avatar Aug 06 '21 23:08 dataders