bug: schema detection on CSV, empty entries are assumed to be strings

Open ramfox opened this issue 7 years ago • 1 comments

For example, this CSV:

Name,Home_Runs,Rank
Barry Bonds,762,1
Hank Aaron,755,2
Babe Ruth,714,
Alex Rodriguez,696,4
Willie Mays,660,5

is detected as having this schema:

"schema": {
  "items": {
    "items": [
      {
        "title": "name",
        "type": "string"
      },
      {
        "title": "home_runs",
        "type": "integer"
      },
      {
        "title": "rank",
        "type": "string"
      }
    ],
    "type": "array"
  },
  "type": "array"
}

Instead, the rank field should be detected as an integer:

{
  "title": "rank",
  "type": "integer"
}

Jun 06 '18 17:06 ramfox

Proposal: have ParseType check for an empty byte slice. If the slice is empty, we say the type is TypeEmpty.

If a column has multiple types, but one of them is TypeEmpty, we disregard the TypeEmpty values when we determine the schema.
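
A minimal sketch of how this could look, not the project's actual code. Only ParseType and TypeEmpty are named in this proposal. The other type constants, the ColumnType helper, and the integer-to-float widening rule are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// Type is a hypothetical enum of detected field types.
type Type int

const (
	TypeEmpty Type = iota // the cell held no bytes at all
	TypeInteger
	TypeFloat
	TypeBoolean
	TypeString
)

// String returns a schema-style name for the type.
func (t Type) String() string {
	return [...]string{"empty", "integer", "number", "boolean", "string"}[t]
}

// ParseType guesses the type of one raw CSV cell.
// An empty byte slice is reported as TypeEmpty instead of TypeString.
func ParseType(value []byte) Type {
	if len(value) == 0 {
		return TypeEmpty
	}
	s := string(value)
	if _, err := strconv.ParseInt(s, 10, 64); err == nil {
		return TypeInteger
	}
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return TypeFloat
	}
	if s == "true" || s == "false" {
		return TypeBoolean
	}
	return TypeString
}

// ColumnType resolves a single type for a column, skipping TypeEmpty cells.
// It returns TypeEmpty only when every cell is empty; the caller then has to
// pick a fallback.
func ColumnType(cells [][]byte) Type {
	result := TypeEmpty
	for _, c := range cells {
		t := ParseType(c)
		switch {
		case t == TypeEmpty || t == result:
			continue // empty cells and repeats do not change the result
		case result == TypeEmpty:
			result = t // first non-empty cell seen
		case (result == TypeInteger && t == TypeFloat) ||
			(result == TypeFloat && t == TypeInteger):
			result = TypeFloat // assumed rule: mixed integers and floats widen to float
		default:
			return TypeString // any other mix falls back to string
		}
	}
	return result
}

func main() {
	// The Rank column from the CSV above, with Babe Ruth's empty cell.
	rank := [][]byte{[]byte("1"), []byte("2"), []byte(""), []byte("4"), []byte("5")}
	fmt.Println(ColumnType(rank)) // prints "integer"
}
```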

Except then what value do we give this field when we marshal into Go?
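
One possible answer, sketched under the assumption that rows are held as []interface{}: an empty cell in an integer column becomes nil, which encoding/json writes out as null. The parseIntCell helper is hypothetical and only illustrates the idea. A pointer type such as *int64 would be another option.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// parseIntCell is a hypothetical helper that converts a raw cell from an
// integer column. An empty cell yields nil rather than 0 or "".
func parseIntCell(cell []byte) (interface{}, error) {
	if len(cell) == 0 {
		return nil, nil
	}
	n, err := strconv.ParseInt(string(cell), 10, 64)
	if err != nil {
		return nil, err
	}
	return n, nil
}

func main() {
	// Babe Ruth's row from the CSV above: the Rank cell is empty.
	rank, err := parseIntCell([]byte(""))
	if err != nil {
		panic(err)
	}
	row := []interface{}{"Babe Ruth", int64(714), rank}

	out, err := json.Marshal(row)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints ["Babe Ruth",714,null]
}
```

The trade-off is that consumers of the data must handle nil in a column whose schema type is integer.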

Jun 06 '18 20:06 ramfox