dataset
bug: CSV schema detection assumes empty entries are strings
For example, this CSV (note the empty Rank entry for Babe Ruth):
Name,Home_Runs,Rank
Barry Bonds,762,1
Hank Aaron,755,2
Babe Ruth,714,
Alex Rodriguez,696,4
Willie Mays,660,5
is detected with this schema:
"schema": {
  "items": {
    "items": [
      {
        "title": "name",
        "type": "string"
      },
      {
        "title": "home_runs",
        "type": "integer"
      },
      {
        "title": "rank",
        "type": "string"
      }
    ],
    "type": "array"
  },
  "type": "array"
}
instead of the expected rank field:
{
  "title": "rank",
  "type": "integer"
}
Proposal: have ParseType check for an empty byte slice. If the value is empty, it returns TypeEmpty.
If a field has multiple detected types across rows and one of them is TypeEmpty, we disregard TypeEmpty when determining the schema.
That leaves an open question: what value do we give this field when we read an empty entry into a Go value?
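A rough sketch of the proposed inference, to make the idea concrete. Everything here is an assumption for illustration: the `Type` constants, this simplified `ParseType` (integers and strings only), and the `ColumnType` helper are hypothetical and do not reflect the package's actual code. It also does not answer the open question about which Go value an empty entry should become.

```go
package main

import "strconv"

// Type is a hypothetical stand-in for the package's field type enum.
type Type int

const (
	TypeEmpty   Type = iota // proposed: the entry had no content
	TypeString              // anything that is not an integer
	TypeInteger             // parses as a base-10 int64
)

// ParseType guesses the type of a single CSV entry.
// Simplified on purpose: the real function presumably knows more types.
func ParseType(value []byte) Type {
	// The proposed check: an empty byte slice is its own type.
	if len(value) == 0 {
		return TypeEmpty
	}
	if _, err := strconv.ParseInt(string(value), 10, 64); err == nil {
		return TypeInteger
	}
	return TypeString
}

// ColumnType picks one type for a field from all of its entries,
// disregarding TypeEmpty entries. Hypothetical helper.
func ColumnType(entries [][]byte) Type {
	result := TypeEmpty
	for _, entry := range entries {
		t := ParseType(entry)
		switch {
		case t == TypeEmpty:
			// Empty entries do not influence the schema.
			continue
		case result == TypeEmpty:
			// First non-empty entry sets the candidate type.
			result = t
		case result != t:
			// Conflicting non-empty types: fall back to string.
			return TypeString
		}
	}
	if result == TypeEmpty {
		// Assumption: a field with only empty entries defaults to string.
		return TypeString
	}
	return result
}
```

Under these assumptions, the Rank entries from the example above (1, 2, empty, 4, 5) would resolve to TypeInteger, since the empty entry is skipped rather than forcing the field to string.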