daru
from_csv should support loading specified columns as date columns
CSVs frequently contain multiple date columns, and these are not necessarily indexes. It would be good if the from_csv function provided an easy way to load one or more columns as dates. As it stands, the CSV has to be loaded first and the columns then converted into dates in a separate step.
For reference, the Pandas read_csv function supports a parse_dates argument for a similar purpose.
At the moment I have a simple wrapper around from_csv, csv_as_dataframe(path, date_columns=[]), that reads the CSV and converts the columns using the naive implementation below:
    date_columns.each do |col|
      df[col] = df[col].map do |v|
        if v
          Date.parse(v)
        else
          nil
        end
      end
    end
It could probably be optimized to handle multiple columns in a single pass instead of making one pass over the data per date column. I'm not sure whether something like this is already supported in Daru, as I am fairly new to it.
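To make the single-pass idea concrete, here is a rough sketch that uses only the Ruby standard library. The helper name columns_with_dates and its date_columns keyword are made up for illustration and are not part of Daru. It assumes each row behaves like a Hash of column name to string value, and it keeps the same nil handling as the snippet above.

```ruby
require 'date'

# Hypothetical helper, not part of Daru: builds one array per column in a
# single pass over the rows, parsing the named columns as dates.
# Date.parse raises on strings it cannot parse; nil values are kept as nil.
def columns_with_dates(rows, date_columns: [])
  columns = Hash.new { |hash, key| hash[key] = [] }
  rows.each do |row|
    row.each do |name, value|
      parsed = date_columns.include?(name) && value ? Date.parse(value) : value
      columns[name] << parsed
    end
  end
  columns
end

# Made-up sample rows standing in for parsed CSV rows.
rows = [
  { 'id' => '1', 'created' => '2016-01-15', 'closed' => '2016-02-01' },
  { 'id' => '2', 'created' => '2016-03-10', 'closed' => nil }
]
cols = columns_with_dates(rows, date_columns: %w[created closed])
```

Rows read with CSV.foreach(path, headers: true) should work here as well, since each CSV::Row yields header and field pairs, and the resulting Hash of arrays could presumably be handed to Daru::DataFrame.new. Whether this is actually faster than converting column by column would need measuring.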
This would be a great feature.