pgloader
pgloader copied to clipboard
CSV header with newline breaks import
The included CSV file has a newline in one of the header fields. This causes the import to fail with
FATAL non whitespace after quoted data [...]
When removing the newline in the header, the import succeeds. I expect the import to succeed in both cases.
- [x] pgloader --version
pgloader version "3.6.2c52da1"
compiled with SBCL 2.1.11.debian
-
[x] did you test a fresh compile from the source tree?
-
[x] did you search for other similar issues?
-
[x] how can I reproduce the bug?
load csv
from inline
(
noc,
occupation_title,
esdc_wage_rate_low_2021,
esdc_wage_rate_median_2021,
esdc_wage_rate_high_2021,
calculated_median_annual_salary_2021,
source_information
)
into postgresql:///pgloader?test_table
with
skip header = 1,
fields optionally enclosed by '"',
fields terminated by ','
before load do
$$ drop table if exists test_table; $$,
$$ create table if not exists test_table (
noc text primary key,
occupation_title text,
esdc_wage_rate_low_2021 text,
esdc_wage_rate_median_2021 text,
esdc_wage_rate_high_2021 text,
calculated_median_annual_salary_2021 text,
source_information text
);
$$
;
NOC,"Occupation Title","ESDC Wage Rate Low
2021","ESDC Wage Rate Median 2021","ESDC Wage Rate High 2021","Calculated Median Annual Salary 2021","Source Information"
0000,Title,1,2,3,4,"Data field
with newline is ok"
- [x] pgloader output you obtain
2022-07-02T17:16:32.004001-07:00 LOG pgloader version "3.6.2c52da1"
2022-07-02T17:16:32.004001-07:00 LOG Data errors in '/tmp/pgloader/'
2022-07-02T17:16:32.004001-07:00 LOG Parsing commands from file #P"/home/kratib/src/quartech/workbc-ssot/migration/newline-header.load"
2022-07-02T17:16:32.180009-07:00 WARNING Target table "\"public\".\"test_table\"" has 1 indexes defined against it.
2022-07-02T17:16:32.180009-07:00 WARNING That could impact loading performance badly.
2022-07-02T17:16:32.180009-07:00 WARNING Consider the option 'drop indexes'.
2022-07-02T17:16:32.204010-07:00 ERROR non whitespace after quoted data #<CSV-READER LINE-IDX:0 CHARACTER-LINE-IDX:8 CHARACTER-IDX:8 "2021," {1009727C23}> E
2022-07-02T17:16:32.204010-07:00 FATAL non whitespace after quoted data #<CSV-READER LINE-IDX:0 CHARACTER-LINE-IDX:8 CHARACTER-IDX:8 "2021," {1009727C23}> E
2022-07-02T17:16:32.208010-07:00 LOG report summary reset
table name errors rows bytes total time
----------------------- --------- --------- --------- --------------
fetch 0 0 0.000s
before load 0 2 0.048s
----------------------- --------- --------- --------- --------------
"public"."test_table" 2 0 0.016s
----------------------- --------- --------- --------- --------------
Files Processed 0 1 0.008s
COPY Threads Completion 0 2 0.016s
----------------------- --------- --------- --------- --------------
Total import time 2 0 0.024s
- [x] data that is being loaded, if relevant
No data is loaded.
- [x] How the data is different from what you expected, if relevant
I expect the newline in the header to be processed without causing an error.