parquetjs
Cannot write a parquet file having a comma in one of its headers
I came across a very unfortunate problem using this library.
Whenever I try to read a parquet file that was created with this same tool and contains a comma (,) in any of its headers, I get this error while calling await parquetReader.getCursor().next():
TypeError: Cannot read property 'fields' of undefined
Stack trace:
at ParquetSchema.findField (../../.yarn/cache/parquetjs-npm-0.11.2-9df3a54481-63137e17bc.zip/node_modules/parquetjs/lib/schema.js:35:22)
at Object.exports.materializeRecords (../../.yarn/cache/parquetjs-npm-0.11.2-9df3a54481-63137e17bc.zip/node_modules/parquetjs/lib/shred.js:164:26)
at ParquetCursor.next (../../.yarn/cache/parquetjs-npm-0.11.2-9df3a54481-63137e17bc.zip/node_modules/parquetjs/lib/reader.js:62:40)
I guess this is caused by a rather "unsafe" operation in the parquetjs/lib/schema.js file on line 28: path.split(",")
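To illustrate what I mean, here is a minimal standalone sketch of how splitting a path on commas could produce this exact error. This is not the actual parquetjs code: the findField function below and the field definitions are hypothetical stand-ins I wrote only to show the mechanism, assuming the lookup walks nested fields one path segment at a time.

```javascript
// Hypothetical stand-in for a schema lookup; NOT the actual parquetjs code.
function findField(fields, path) {
  // A string path is split on commas, so a comma inside a column name
  // cannot be told apart from a separator between nesting levels.
  if (typeof path === 'string') {
    path = path.split(',');
  } else {
    path = path.slice(0); // copy so the caller's array is not mutated
  }

  let current = fields;
  for (; path.length > 1; path.shift()) {
    // Throws a TypeError when path[0] is not a known field name.
    current = current[path[0]].fields;
  }
  return current[path[0]];
}

// Hypothetical schema with one column whose name contains a comma.
const fields = {
  'price,usd': { type: 'DOUBLE' },
  name: { type: 'UTF8' },
};

// String path: 'price,usd' becomes ['price', 'usd'], and 'price' is unknown.
let failure = null;
try {
  findField(fields, 'price,usd');
} catch (err) {
  failure = err;
}

// Array path: the name is kept as a single segment, so the lookup succeeds.
const viaArray = findField(fields, ['price,usd']);
```

In this sketch the string lookup fails with a TypeError because 'price' is treated as a parent field that does not exist, while passing the name as a single array element avoids the split entirely. Whether the library can be changed to always pass array paths internally is something I have not verified.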
I would be very grateful for any help with this problem. Thank you!
Hi! I have written a solution for it! Look below :-)