Deedle
Performance of CSV writer
The CSV export of data frames (via the SaveCsv method) is very slow. An example from my own use (based on v3.0.0): a 50,000 x 100 data frame, producing a 20 MB CSV file, took 1 min 20 s to write on my system. About 2/3 of the columns are numeric (some integer, some double-precision float) and 1/3 are a two-valued discriminated union.
I figured it out: the default implementation of ToString() for DUs is very slow. After overriding it with a custom implementation, the data frame mentioned above serializes in 6 seconds. This is still not impressive (around 3 MB/s), but at least it makes the export usable.
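For anyone hitting the same problem, here is a minimal sketch of the kind of override meant above. The `Side` type and its case names are hypothetical stand-ins for the two-valued union in my data, not the actual type, and the speed-up presumably comes from bypassing the generic structured formatting that the compiler-generated ToString() uses.

```fsharp
// Hypothetical two-valued union standing in for the DU column type.
type Side =
    | Buy
    | Sell
    // An explicit pattern match replaces the compiler-generated ToString().
    override this.ToString() =
        match this with
        | Buy -> "Buy"
        | Sell -> "Sell"

// Anything that formats the value through ToString() now takes the fast path.
let label = string Buy
```

Note that the override changes the textual form of the value everywhere ToString() is used, so the strings returned should match what you want to see in the CSV file.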