
Csv: possibility to treat 0/1 column as number

Open kirsar opened this issue 3 years ago • 3 comments

Is there any way to override the default CSV type inference behavior "Columns with only 0, 1, Yes, No, True, or False will be set to bool" (or maybe inject a callback somewhere) and treat a 0/1 column as a number for an arbitrary CSV (no schema known upfront)? Thanks.

https://fsprojects.github.io/FSharp.Data/library/CsvProvider.html#Controlling-the-column-types

kirsar avatar Dec 08 '21 13:12 kirsar

In general, I don't think so. But:

  1. If the column name is known (e.g., MyColumn) you could specify that the column with that name is integer:

     type Csv =
       CsvProvider<"file.csv",
                   Schema="MyColumn=int",
                   ResolutionFolder=ResolutionFolder>

  2. You could use the CSV parser instead to bypass auto conversion: https://fsprojects.github.io/FSharp.Data/library/CsvFile.html
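
A minimal sketch of the second option, assuming an F# script with the FSharp.Data NuGet package available. The inline sample and the column name Flag are made up for illustration. CsvFile does no type inference, so each cell arrives as a string and the conversion is up to the caller:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Inline sample; the column name "Flag" is hypothetical.
let csv = CsvFile.Parse("Name,Flag\na,0\nb,1\nc,1")

// Every cell is returned as a raw string, so a 0/1 column
// can be converted to int explicitly instead of becoming bool.
let flags =
    csv.Rows
    |> Seq.map (fun row -> int (row.GetColumn "Flag"))
    |> List.ofSeq
```

For a file on disk, CsvFile.Load can be used in place of CsvFile.Parse.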

nhirschey avatar Mar 03 '22 17:03 nhirschey

@nhirschey thanks! Unfortunately, my content is totally dynamic, so no hints can be given upfront. So I introduced some post-processing: if a column's type is inferred as boolean but the column contains only '0' and '1' cells, I forcibly set its type to integer.
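
The post-processing described above could look roughly like this. It is only a sketch: the InferredType union and the refineType function are hypothetical stand-ins, not part of FSharp.Data, and the actual code in question may differ.

```fsharp
// Hypothetical stand-in for whatever type the inference step reports.
type InferredType =
    | Bool
    | Int
    | Other

// Demote a Bool column to Int when it is non-empty and every cell,
// after trimming whitespace, is literally "0" or "1".
let refineType (inferred: InferredType) (cells: string list) =
    let isZeroOrOne (c: string) =
        let t = c.Trim()
        t = "0" || t = "1"
    match inferred with
    | Bool when not (List.isEmpty cells) && List.forall isZeroOrOne cells -> Int
    | other -> other

let a = refineType Bool [ "0"; "1"; "1" ]   // Int
let b = refineType Bool [ "true"; "0" ]     // Bool (mixed tokens stay boolean)
```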

certara-ksorokin avatar Mar 28 '22 14:03 certara-ksorokin

I guess we should interpret this as a feature request for a new static parameter, StrictBooleans, on the CsvProvider, so that the provider only infers a boolean type for samples containing precisely true/false values.

Currently the spec says: "Columns with only 0, 1, Yes, No, True, or False will be set to bool." With this static parameter this would become "Columns with only 0, 1, Yes, No, True, or False will be set to bool. If StrictBooleans is active, then columns with only true or false will be set to bool."
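
To make the proposed rule concrete, here is a small model of it. This is a sketch of the wording above and not the provider's actual inference code. The function name isBoolColumn is hypothetical, and case-insensitive matching is an assumption.

```fsharp
// Tokens taken from the spec sentence quoted above.
let looseTokens = set [ "0"; "1"; "yes"; "no"; "true"; "false" ]
let strictTokens = set [ "true"; "false" ]

// Models the proposed rule only. Case-insensitive comparison is assumed.
let isBoolColumn (strictBooleans: bool) (cells: string list) =
    let allowed = if strictBooleans then strictTokens else looseTokens
    not (List.isEmpty cells)
    && cells |> List.forall (fun c -> allowed.Contains(c.Trim().ToLowerInvariant()))

let loose = isBoolColumn false [ "0"; "1" ]    // true: current behavior
let strict = isBoolColumn true [ "0"; "1" ]    // false: left for numeric inference
```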

dsyme avatar Mar 11 '24 22:03 dsyme