DataFrameInternal - Using OrderedCollection over Array2D
DataFrameInternal currently uses Array2D (previously it used Matrix, see https://github.com/PolyMathOrg/DataFrame/issues/44).
Is there a specific reason, such as speed or functionality, for choosing Array2D?
Currently, adding or removing a row re-creates the entire DataFrameInternal. This becomes a problem for large data. For example, reading a CSV file with thousands of rows calls addRow once per row, and DataFrameInternal is re-created on every one of those calls.
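To put a rough number on that cost, here is a small sketch in Python (not the DataFrame code itself). It assumes that each re-creation copies every cell already stored, which is my reading of the current behaviour and not something I have measured. The function name is made up for illustration.

```python
def cells_copied_with_recreation(n_rows: int, n_cols: int) -> int:
    """Count the cells copied if every added row rebuilds the whole table.

    Assumption: rebuilding copies all rows that are already stored.
    """
    copied = 0
    for existing_rows in range(n_rows):
        # Before the next row is added, `existing_rows` rows are copied over.
        copied += existing_rows * n_cols
    return copied


# 1000 rows of 10 columns: 10 * 1000 * 999 / 2 copies in total.
print(cells_copied_with_recreation(1000, 10))  # 4995000
```

Under that assumption the total grows quadratically with the number of rows, while appending in place would touch only the 10 cells of each new row.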
I think using OrderedCollection would be better, since we can add elements at arbitrary indices. Are there any downsides to using OrderedCollection?
The way I was thinking of implementing this is column-oriented: each column is an OrderedCollection, and contents is itself an OrderedCollection that holds these columns.
To access a row, take its index and iterate through contents, fetching the element at that index from every column.
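A minimal sketch of that layout, written in Python with plain lists standing in for OrderedCollection. The class and method names (ColumnStore, add_row, row_at, remove_row_at) are hypothetical and not part of the DataFrame API, and Python indices start at 0 while Pharo collections start at 1.

```python
class ColumnStore:
    """Column-oriented table: `contents` holds one growable list per column."""

    def __init__(self, n_cols: int) -> None:
        # One empty column per position; each list plays the role of
        # an OrderedCollection.
        self.contents = [[] for _ in range(n_cols)]

    def add_row(self, row: list) -> None:
        """Append one value to every column; nothing already stored is copied."""
        if len(row) != len(self.contents):
            raise ValueError("row length must match the number of columns")
        for column, value in zip(self.contents, row):
            column.append(value)

    def row_at(self, index: int) -> list:
        """Build a row by fetching the index-th element of every column."""
        return [column[index] for column in self.contents]

    def remove_row_at(self, index: int) -> None:
        """Delete the index-th element from every column."""
        for column in self.contents:
            del column[index]


store = ColumnStore(2)
store.add_row([1, "a"])
store.add_row([2, "b"])
print(store.row_at(1))  # [2, 'b']
```

Adding a row only appends to each column, so the existing data is left alone. The price is that fetching a row has to visit every column, which is the part that needs measuring.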
I will have to do detailed performance profiling to measure any slowdown in fetching rows and the speed-up in adding rows.