incubator-gluten

#349 Bug fix for the double-free memory error when converting Parquet to DWRF

Open KevinyhZou opened this issue 2 years ago • 6 comments

What changes were proposed in this pull request?

Bug fix for the double-free memory error when converting Parquet to DWRF, as described in #349.

How was this patch tested?

manual tests

  1. Run tpch_convert_parquet_dwrf.sh; it causes the error described in #349.
  2. Modify jni_common.h as in this PR and re-run the script: no errors occur, and the DWRF file is produced.

KevinyhZou avatar Sep 01 '22 13:09 KevinyhZou

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[Gluten-${ISSUES_ID}] ${detailed message}

See also:

github-actions[bot] avatar Sep 01 '22 13:09 github-actions[bot]

@KevinyhZou Is this error related to the size of the dataset?

JkSelf avatar Sep 06 '22 02:09 JkSelf

> @KevinyhZou Is this error related to the size of the dataset?

The original Parquet file is only about 11KB, which is not big. Do you mean that a small data size may cause this error? @JkSelf

KevinyhZou avatar Sep 06 '22 03:09 KevinyhZou

> > @KevinyhZou Is this error related to the size of the dataset?

> The original Parquet file is only about 11KB, which is not big. Do you mean that a small data size may cause this error? @JkSelf

I re-ran the tpch_convert_parquet_dwrf.sh script and cannot reproduce this issue. I tested the 448M lineitem file and the 3K nation file. Here is my test code: [image]

JkSelf avatar Sep 06 '22 03:09 JkSelf

That is strange. Below is my test Scala code: [image]

Sometimes it passes and sometimes it does not. Here is tpch_convert_parquet_dwrf.sh; I have modified some parameters as shown below: [image]

@JkSelf Could these modifications cause the error?

KevinyhZou avatar Sep 06 '22 03:09 KevinyhZou

@KevinyhZou It seems this issue is not related to the configuration. I used your configuration and still cannot reproduce it.

[image]

JkSelf avatar Sep 09 '22 03:09 JkSelf

Since we cannot reproduce this case and we now support Parquet, you can try the Parquet format. If there is new information, you can reopen this.

jinchengchenghh avatar Nov 23 '22 01:11 jinchengchenghh

I ran into the same issue and used this solution. Thanks for your contribution. https://github.com/oap-project/gluten/pull/604/commits/2b0f6e422dc503d022aefdb03ba2980ae224c445

jinchengchenghh avatar Jan 16 '23 06:01 jinchengchenghh

Can you resolve the conflict? Or should it just be solved in my PR https://github.com/oap-project/gluten/pull/604?

jinchengchenghh avatar Jan 16 '23 07:01 jinchengchenghh

> I ran into the same issue and used this solution. Thanks for your contribution. 2b0f6e4

OK

KevinyhZou avatar Jan 16 '23 07:01 KevinyhZou

https://github.com/oap-project/gluten/issues/349

github-actions[bot] avatar Jan 16 '23 08:01 github-actions[bot]