breadroll
Feature request: strongly typed dataframes
The big advantage of using this over Python would be speed, and it could also be made completely type-safe. If the user knows the rows/columns ahead of time (which is most likely the case), I suggest adding the option to pass a type parameter:
```typescript
import Breadroll, { Dataframe } from 'breadroll';

interface Row {
  name: string;
  age: number;
  city: string;
}

const csv: Breadroll = new Breadroll({ header: true, delimiter: ',' });
const df: Dataframe<Row> = await csv.open.local<Row>('./data/input.csv');
```
This would ideally make all functions completely type-safe: accessing a column that does not exist would be a compile-time error, you would get autocomplete for filtering, etc.
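To illustrate what the type parameter could buy, here is a minimal sketch. `TypedFrame`, `filter`, `column`, and `count` are hypothetical names made up for this example, not breadroll's actual API, and the rows are sample data:

```typescript
interface Row {
  name: string;
  age: number;
  city: string;
}

// Hypothetical stand-in for a typed dataframe; NOT breadroll's Dataframe class.
class TypedFrame<R extends object> {
  private readonly rows: readonly R[];

  constructor(rows: readonly R[]) {
    this.rows = rows;
  }

  // `K extends keyof R` restricts `key` to known column names, and the
  // predicate receives a value typed as that column's type.
  filter<K extends keyof R>(key: K, predicate: (value: R[K]) => boolean): TypedFrame<R> {
    return new TypedFrame<R>(this.rows.filter((row) => predicate(row[key])));
  }

  // Returns one column, typed as an array of that column's type.
  column<K extends keyof R>(key: K): R[K][] {
    return this.rows.map((row) => row[key]);
  }

  get count(): number {
    return this.rows.length;
  }
}

// Sample data for illustration only.
const df = new TypedFrame<Row>([
  { name: "Ada", age: 36, city: "London" },
  { name: "Linus", age: 28, city: "Helsinki" },
]);

const over30 = df.filter("age", (age) => age > 30); // `age` is inferred as number
const names: string[] = over30.column("name");

// df.column("salary"); // would not compile: "salary" is not a key of Row
```

The key idea is the `K extends keyof R` constraint: the editor can autocomplete `"name" | "age" | "city"`, and a misspelled column is rejected by the compiler rather than failing at runtime.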
Big thanks, @itsyoboieltr, this is a fantastic feature suggestion! ✨ It's already rocketed its way onto the roadmap while we work out how to implement it. We'll ship this :shipit: as soon as we can.
In addition to the type-level safety, maybe there should also be an option for runtime validation, similar to how tRPC does it with its Input & Output Validators. This would probably be too costly for very large datasets, but for small-to-medium sized ones it could be useful for making sure the data is sane. Interestingly, most of these validators also support some kind of data transformation, encoding/decoding, etc., so allowing their use in some way would immediately increase the range of features this package provides :D
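A rough sketch of what I mean, using a hand-written validator so it has no dependencies. `Validator`, `validateRow`, and `parseRows` are hypothetical names, not part of breadroll, and I am assuming the parser hands over each CSV row as a record of strings:

```typescript
interface Row {
  name: string;
  age: number;
  city: string;
}

// A validator receives one untrusted record (assumed here to be all strings,
// as read from a CSV) and either returns a typed row or throws.
type Validator<T> = (input: Record<string, string>, index: number) => T;

// Hypothetical validator for Row. It also transforms `age` from string to number.
const validateRow: Validator<Row> = (input, index) => {
  const { name, age: rawAge, city } = input;
  if (!name) throw new Error(`row ${index}: "name" is missing`);
  if (!city) throw new Error(`row ${index}: "city" is missing`);
  // Number("") is 0, so reject empty cells explicitly before converting.
  if (rawAge === undefined || rawAge.trim() === "") {
    throw new Error(`row ${index}: "age" is missing`);
  }
  const age = Number(rawAge);
  if (!Number.isFinite(age)) throw new Error(`row ${index}: "age" is not a number`);
  return { name, age, city };
};

// Runs the validator over every record. This is one extra pass over the data,
// which is why it should be opt-in for large datasets.
function parseRows<T>(raw: readonly Record<string, string>[], validate: Validator<T>): T[] {
  return raw.map((record, index) => validate(record, index));
}

// Sample data for illustration only.
const rows: Row[] = parseRows([{ name: "Ada", age: "36", city: "London" }], validateRow);

let failure = "";
try {
  parseRows([{ name: "Bob", age: "n/a", city: "Oslo" }], validateRow);
} catch (error) {
  failure = (error as Error).message;
}
```

Because the validator is just a function from raw record to typed row, a schema from an existing validation library could presumably be plugged in the same way, which is where the transformation features would come for free.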
These are some great insights. Data transformations like encoding and decoding are among the next things we are looking at, since we already have the .apply method. 🚀 Thanks