parquetjs
Can it write a file to AWS S3?
I am using parquetjs in Meteor.js and want to create a Parquet data file.
When I call ParquetWriter.openFile(schema, filePath), I get the error below.
W20191226-23:56:11.534(5.5)? (STDERR) (node:6898) UnhandledPromiseRejectionWarning: missing required field: assets
W20191226-23:56:11.534(5.5)? (STDERR) (node:6898) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 14).
It seems related to a path or permission issue. But instead of creating a local file, is it possible to upload the Parquet file to AWS S3?
@staronline1985 Hi, did you find a solution for uploading Parquet to AWS S3?
Yes, I found a solution for uploading Parquet to AWS S3, but I am running into an issue with large files. For example, I read a JSON or CSV file and convert it to Parquet format. It keeps all the data in memory until the Parquet writer is closed, so it does not work for me with large files. My job is to read a JSON file from S3, convert it to Parquet format, and upload it back to S3.
I think it should be stream based: read the data as a stream, convert it, and upload the stream to S3.
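A minimal sketch of the "read data as a stream" part of this idea, using only Node's standard library. It assumes the input is newline-delimited JSON (one object per line), which the thread does not confirm. The helper name parseNdjson is hypothetical and is not part of parquetjs.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: yields one parsed object per line of a
// newline-delimited JSON stream, holding only one partial line in memory.
async function* parseNdjson(readable) {
  const decoder = new StringDecoder('utf8'); // handles multi-byte chars split across chunks
  let pending = '';
  for await (const chunk of readable) {
    pending += typeof chunk === 'string' ? chunk : decoder.write(chunk);
    const lines = pending.split('\n');
    pending = lines.pop(); // last piece may be an incomplete line
    for (const line of lines) {
      if (line.trim() !== '') yield JSON.parse(line);
    }
  }
  pending += decoder.end();
  if (pending.trim() !== '') yield JSON.parse(pending);
}
```

Because this is an async generator, rows are produced only as fast as the consumer asks for them, so a large file never has to be fully loaded before conversion starts.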
@staronline1985 I have the same task. For now, I just need to convert a local CSV file to Parquet and upload it to S3. But that requires creating a local Parquet file and then reading it with readFileSync as a buffer to upload. I want to upload to S3 directly without saving a local file. How can I do that?
I am doing the same and waiting to see whether parquetjs can support this. Otherwise I will go with another repo.
You need to use ParquetTransformer, as mentioned in #76.
Do you have an example of doing it?
@govthamreddy do you have a working example of pushing it to S3?
Example from #76
https://github.com/ironSource/parquetjs/issues/76#issuecomment-1312158235
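For readers who want the general shape without following the link, here is a minimal sketch of the streaming pattern. It is not the code from #76. The helper streamRowsToUpload is hypothetical, and the transformer and the upload function are passed in so the wiring can be tried with stand-ins. The commented usage assumes parquetjs's ParquetTransformer and the AWS SDK v2 s3.upload call; the bucket and key names are placeholders.

```javascript
const { Readable, PassThrough, pipeline } = require('stream');

// Hypothetical helper: pipes row objects through a transform stream into an
// upload function, so no local file and no full in-memory buffer is needed.
//   rows        - iterable or async iterable of row objects
//   transformer - Transform stream: objects in, bytes out
//   upload      - function(bodyStream) returning a Promise for the upload result
function streamRowsToUpload(rows, transformer, upload) {
  const body = new PassThrough();
  const uploaded = upload(body); // start consuming before data flows
  const piped = new Promise((resolve, reject) => {
    pipeline(Readable.from(rows), transformer, body, (err) =>
      err ? reject(err) : resolve()
    );
  });
  return Promise.all([piped, uploaded]).then(([, result]) => result);
}

// Assumed usage with parquetjs and AWS SDK v2 (names below are placeholders):
// const parquet = require('parquetjs');
// const AWS = require('aws-sdk');
// const s3 = new AWS.S3();
// streamRowsToUpload(
//   rows,
//   new parquet.ParquetTransformer(schema),
//   (body) => s3.upload({ Bucket: 'my-bucket', Key: 'data.parquet', Body: body }).promise()
// ).catch(console.error);
```

Keeping the transformer and uploader as parameters is a design choice for this sketch: the same wiring works whether the destination is S3 or a test double.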