pg_bulkload
invalid memory alloc request size 1074789376
pg_bulkload sometimes asks palloc for more memory than PostgreSQL's 1 GB allocation limit allows, causing the error shown in the title. What's worse, the earlier error message is overwritten. See below.
I tried to load a CSV file containing 500,000 lines with some parse errors, and the error message was shown correctly (no invalid memory alloc request here):
$ pg_bulkload /tmp/p.ctl
NOTICE: BULK LOAD START
WARNING: Parse error Record 1: Input Record 1: Rejected - column 2. unterminated CSV quoted field
WARNING: Maximum parse error count exceeded - 1 error(s) found in input file
NOTICE: BULK LOAD END
0 Rows skipped.
0 Rows successfully loaded.
1 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
WARNING: some rows were not loaded due to errors.
But with 10,000,000 lines, the parse error message was replaced by the alloc error:
$ pg_bulkload /tmp/p.ctl
NOTICE: BULK LOAD START
WARNING: Parse error Record 1: Input Record 1: Rejected - column 2. invalid memory alloc request size 1074789376
WARNING: Maximum parse error count exceeded - 1 error(s) found in input file
NOTICE: BULK LOAD END
0 Rows skipped.
0 Rows successfully loaded.
1 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
pg_bulkload shouldn't make palloc requests of unreasonable size. :)
A likely culprit is the request sizes calculated in lib/parser_csv.c.
This seems to occur only when the input contains parse errors.
Re-opened to discuss a better memory management solution within CSVParserRead(). See issue #36 for why I reverted the commit that implemented a poor solution (in short, that solution disabled a useful feature).
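To make the discussion concrete, here is a minimal sketch of a size check that could run before a record buffer is grown. It is not the code in lib/parser_csv.c or CSVParserRead(). The helper names `alloc_size_is_valid` and `next_buffer_size` are hypothetical, and the idea that the buffer grows by doubling is an assumption for illustration. The only fact taken from PostgreSQL is the palloc limit of 0x3fffffff bytes (MaxAllocSize), which the reported request of 1074789376 bytes exceeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as PostgreSQL's MaxAllocSize: 1 GiB - 1 byte. */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/* Hypothetical helper: would palloc() accept a request of this size? */
bool
alloc_size_is_valid(size_t size)
{
    return size <= MAX_ALLOC_SIZE;
}

/*
 * Hypothetical helper: pick the next buffer size for a record that needs
 * `needed` bytes, given a buffer of `current` bytes.
 *
 * Doubles the size until it fits, clamping at MAX_ALLOC_SIZE.
 * Returns 0 when `needed` itself is over the limit, so the caller can
 * report the original parse error instead of calling the allocator.
 */
size_t
next_buffer_size(size_t current, size_t needed)
{
    size_t size = current > 0 ? current : 1024;

    /* The existing buffer is assumed to be a valid allocation. */
    assert(alloc_size_is_valid(size));

    if (!alloc_size_is_valid(needed))
        return 0;

    while (size < needed)
    {
        if (size > MAX_ALLOC_SIZE / 2)
        {
            /* Doubling would pass the limit; clamp instead. */
            size = MAX_ALLOC_SIZE;
            break;
        }
        size *= 2;
    }
    return size;
}
```

With a check like this, a caller would treat a return value of 0 as "record too large" and keep whatever parse error it had already recorded (for example the unterminated quoted field), rather than letting the allocation failure replace it. Whether that fits the actual control flow in CSVParserRead() still needs to be checked against the source.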