vscode-data-preview
invalid encoding: PLAIN_DICTIONARY
Hi,
The extension fails to load the attached parquet file (zipped because GitHub doesn't accept .parquet attachments). I am able to read the unzipped file with pandas.
The error in "Runtime Status" is "invalid encoding: PLAIN_DICTIONARY".
VS Code version: 1.70.2 (running on Ubuntu 22.04)
Extension version: v2.3.0
Attachment: FJUL.zip
Regards, Eugeniu
@EugeniuZ Data Preview uses this TypeScript library to read parquet data files:
https://github.com/kbajalc/parquets
When Data Preview was created, it was one of the few libraries that could read parquet files without depending on Python tools and toolchain.
It is quite possible that this library doesn't support the plain dictionary encoding (PLAIN_DICTIONARY) used in your parquet file.
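To illustrate how an error like this can arise, here is a minimal sketch of a reader that looks up a decoder by encoding name and fails when none is registered. The encoding names come from the Parquet format specification; the `codecs` table, the `Decoder` type, and the `decodeValues` and `tryDecode` helpers are hypothetical and are not taken from the kbajalc/parquets source.

```typescript
// Encoding names as defined by the Parquet format specification.
type Encoding =
  | "PLAIN"
  | "PLAIN_DICTIONARY"
  | "RLE"
  | "BIT_PACKED"
  | "DELTA_BINARY_PACKED"
  | "DELTA_LENGTH_BYTE_ARRAY"
  | "DELTA_BYTE_ARRAY"
  | "RLE_DICTIONARY"
  | "BYTE_STREAM_SPLIT";

// Hypothetical decoder signature: turns raw page bytes into values.
type Decoder = (bytes: Uint8Array) => number[];

// Hypothetical codec table for a reader that implements only two encodings.
// The decoder bodies are placeholders, not real PLAIN or RLE decoders.
const codecs: Partial<Record<Encoding, Decoder>> = {
  PLAIN: (bytes) => Array.from(bytes),
  RLE: (bytes) => Array.from(bytes),
};

// Dispatch on the encoding recorded in the page header; reject anything
// the codec table does not cover.
function decodeValues(encoding: Encoding, bytes: Uint8Array): number[] {
  const decoder = codecs[encoding];
  if (decoder === undefined) {
    throw new Error(`invalid encoding: ${encoding}`);
  }
  return decoder(bytes);
}

// Small helper so both outcomes can be printed without crashing the script.
function tryDecode(encoding: Encoding, bytes: Uint8Array): string {
  try {
    return `ok: ${decodeValues(encoding, bytes).length} values`;
  } catch (err) {
    return (err as Error).message;
  }
}

console.log(tryDecode("PLAIN", new Uint8Array([1, 2, 3])));
console.log(tryDecode("PLAIN_DICTIONARY", new Uint8Array([1, 2, 3])));
```

Under these assumptions the first call succeeds and the second reports `invalid encoding: PLAIN_DICTIONARY`, matching the message shown in "Runtime Status". A file written with only the encodings the reader covers would load, which is why the same file can open fine in pandas yet fail here.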
The newer parquet-wasm library looks promising. To resolve this issue, and to enable loading of compressed parquet files as well, I would need to switch the parquet data provider to a better TS/JS parquet library.
More info at: https://github.com/RandomFractals/vscode-data-preview/issues/316#issuecomment-1277766785