machinelearning
DataFrame LoadCsv improvements
This issue has been moved from a ticket on Developer Community.
The DataFrame.LoadCsv method could be improved in a number of ways:
- Use double when the precision in the data allows it. Currently, float is always used for floating-point data.
- Allow the user to define which values are read as NaN in floating-point columns. In the R code that we also use, NA is typically used.
- Speed improvements if possible.
- I think there is a problem with passing culture info as a parameter. To make floating-point data with decimal points load correctly (in Sweden), I have to set Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US"); before the call to LoadCsv.
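To illustrate the behavior requested above, here is a minimal sketch that uses only the .NET base class library. It is not the LoadCsv implementation. CsvCellParser, ParseDouble, and the list of missing-value tokens are hypothetical names made up for this example. It parses each cell as double with an explicit culture, so the result does not depend on Thread.CurrentThread.CurrentCulture, and it maps R-style NA to NaN.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration only; not part of Microsoft.Data.Analysis.
public static class CsvCellParser
{
    // Assumed set of tokens to treat as missing. "NA" is what R typically writes.
    private static readonly HashSet<string> MissingTokens =
        new HashSet<string>(StringComparer.Ordinal) { "NA", "NaN", "" };

    // Parses one CSV cell as a double using the given culture.
    // Missing-value tokens become double.NaN instead of throwing.
    public static double ParseDouble(string cell, CultureInfo culture)
    {
        string trimmed = cell.Trim();
        if (MissingTokens.Contains(trimmed))
        {
            return double.NaN;
        }
        return double.Parse(trimmed, NumberStyles.Float, culture);
    }
}

public static class Demo
{
    public static void Main()
    {
        CultureInfo invariant = CultureInfo.InvariantCulture;

        // A decimal point is read correctly whatever the thread culture is,
        // because the culture is passed explicitly.
        double value = CsvCellParser.ParseDouble("0.1234567891", invariant);

        // An R-style missing value becomes NaN.
        double missing = CsvCellParser.ParseDouble("NA", invariant);

        // The same text parsed as float loses digits that double keeps.
        float asFloat = float.Parse("0.1234567891", NumberStyles.Float, invariant);

        Console.WriteLine(value.ToString("R", invariant));
        Console.WriteLine(double.IsNaN(missing));
        Console.WriteLine((double)asFloat == value);
    }
}
```

Passing the culture per call, rather than changing the thread culture, keeps the workaround from affecting other code running on the same thread.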
Original Comments
Feedback Bot on 8/1/2024, 10:56 PM:
(private comment, text removed)