Is there support for addressing worksheets within a spreadsheet?

Open ptsefton opened this issue 2 years ago • 1 comments

The Table Schema spec mentions spreadsheets, but makes no mention of the fact that spreadsheet formats (usually?) include multiple tables/worksheets - is there a way to describe a table in a spreadsheet/workbook?

ptsefton avatar May 31 '22 00:05 ptsefton

TL;DR

If you are using frictionless-py you can use the dialect.sheet property:

{
  "profile": "tabular-data-package",
  "resources": [
    {
      "name": "spreadsheet",
      "path": "data.xlsx",
      "dialect": {
        "sheet": "sheet1"
      }
    }
  ]
}

Not every tool supports this property; frictionless-r, for example, doesn't.
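For tools that don't understand the property, the sheet name can still be read from the descriptor by hand. Here is a minimal sketch using only the Python standard library; `sheet_for` is a hypothetical helper, not part of any frictionless package, and it assumes the descriptor has the shape shown above:

```python
import json

# The descriptor from the example above, embedded as a string for illustration.
DESCRIPTOR = json.loads("""
{
  "profile": "tabular-data-package",
  "resources": [
    {
      "name": "spreadsheet",
      "path": "data.xlsx",
      "dialect": {
        "sheet": "sheet1"
      }
    }
  ]
}
""")


def sheet_for(descriptor, resource_name, default=None):
    """Return the dialect.sheet value of the named resource (hypothetical helper).

    Falls back to `default` when the resource has no dialect or no sheet,
    and raises KeyError when no resource has that name.
    """
    for resource in descriptor.get("resources", []):
        if resource.get("name") == resource_name:
            return resource.get("dialect", {}).get("sheet", default)
    raise KeyError(resource_name)


print(sheet_for(DESCRIPTOR, "spreadsheet"))
```

The returned name could then be passed to whatever spreadsheet reader you use to pick the worksheet.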

Longer version

The Table Schema spec does not address where the data is physically located (e.g. an Excel spreadsheet, a CSV file, an RDBMS table), so I think the mention of spreadsheets in the Table Schema spec is only conceptual.

The Tabular Data Resource spec, on the other hand, says that the data a tabular data resource describes MUST, if non-inline, be a CSV file. Although there is a fine proposal to change this in https://github.com/frictionlessdata/specs/issues/697, I think it is fair to say that the frictionless specs don't support data in spreadsheets as of today.

However, the frictionless-py Python package does have an ExcelDialect that allows you to do this.

fjuniorr avatar Jun 10 '22 21:06 fjuniorr

Thanks! It will be handled in the Table Dialect Spec - https://github.com/frictionlessdata/specs/issues/697

roll avatar Jan 03 '24 12:01 roll