pgloader icon indicating copy to clipboard operation
pgloader copied to clipboard

Pgloader is erroring out for PG URI with IPV6 Host

Open WazzaRD10 opened this issue 6 years ago • 4 comments

  • [x] pgloader --version

    pgloader version "3.6.1"
      compiled with SBCL 1.4.0-1.el7
    
  • [x] did you test a fresh compile from the source tree? Yes

  • [x] did you search for other similar issues? Yes. Bug #837 Seems to be related to the parsing of IpV6 address.

  • [x] how can I reproduce the bug? pgloader b.load



LOAD CSV
     FROM '/u02/netcracker/rbm/rbmvm/infinys_root/a.csv' WITH ENCODING UTF8 (COLUMN1,COLUMN2)
    INTO postgresql://netcrk:crknet@[fd95:ff55:7fb8:f1e5:f816:3eff:fe5d:96b7]:5432/postgres
    TARGET TABLE TESTACCOUNT(COLUMN1,COLUMN2)

      WITH  TRUNCATE,
       SKIP HEADER = 0,
	   FIELDS OPTIONALLY ENCLOSED BY '"',
	   FIELDS TERMINATED BY ','

      SET CLIENT_ENCODING TO 'UTF8,
            work_mem to '128MB',
          standard_conforming_strings to 'on'

    BEFORE LOAD DO $$ CREATE SCHEMA IF NOT EXISTS PUBLIC; $$,
 $$ CREATE TABLE IF NOT EXISTS PUBLIC.TESTACCOUNT(COLUMN1 INTEGER,COLUMN2 INTEGER); 
 $$;
  • [x] pgloader output you obtain
2019-07-26T17:28:53.045000+01:00 LOG pgloader version "3.6.1"
KABOOM!
FATAL error: At

   LOAD CSV FROM '/u02/netcracker/rbm/rbmvm/infinys_root/a.csv' WITH ENCODING UTF8 (COLUMN1,COLUMN2) INTO postgresql://netcrk:crknet@[fd95:ff55:7fb8:f1e5:f81
                                                                                                                                     ^ (Line 1, Column 131, Position 131)

In context PGSQL-URI:

While parsing DSN-DBNAME. Expected:

     the character / (SOLIDUS)

While parsing DSN-PORT. Expected:

     the character : (COLON)

While parsing NAMESTRING. Problem:

  The production

    #\[

  does not satisfy the predicate ALPHA-CHAR-P.

Expected:

     the character _ (LOW_LINE)
  or any character satisfying ALPHA-CHAR-P

While parsing SOCKET-DIRECTORY. Expected:

     the string "unix:"

While parsing IPV4-PART. Problem:

  The production

    #\[

  does not satisfy the predicate DIGIT-CHAR-P.

Expected:

     any character satisfying DIGIT-CHAR-P
An unhandled error condition has been signalled: At

   LOAD CSV FROM '/u02/netcracker/rbm/rbmvm/infinys_root/a.csv' WITH ENCODING UTF8 (COLUMN1,COLUMN2) INTO postgresql://netcrk:crknet@[fd95:ff55:7fb8:f1e5:f81
                                                                                                                                     ^ (Line 1, Column 131, Position 131)

In context PGSQL-URI:

While parsing DSN-DBNAME. Expected:

     the character / (SOLIDUS)

While parsing DSN-PORT. Expected:

     the character : (COLON)

While parsing NAMESTRING. Problem:

  The production

    #\[

  does not satisfy the predicate ALPHA-CHAR-P.

Expected:

     the character _ (LOW_LINE)
  or any character satisfying ALPHA-CHAR-P

While parsing SOCKET-DIRECTORY. Expected:

     the string "unix:"

While parsing IPV4-PART. Problem:

  The production

    #\[

  does not satisfy the predicate DIGIT-CHAR-P.

Expected:

     any character satisfying DIGIT-CHAR-P




What I am doing here?

At

   LOAD CSV FROM '/u02/netcracker/rbm/rbmvm/infinys_root/a.csv' WITH ENCODING UTF8 (COLUMN1,COLUMN2) INTO postgresql://netcrk:crknet@[fd95:ff55:7fb8:f1e5:f81
                                                                                                                                     ^ (Line 1, Column 131, Position 131)

In context PGSQL-URI:

While parsing DSN-DBNAME. Expected:

     the character / (SOLIDUS)

While parsing DSN-PORT. Expected:

     the character : (COLON)

While parsing NAMESTRING. Problem:

  The production

    #\[

  does not satisfy the predicate ALPHA-CHAR-P.

Expected:

     the character _ (LOW_LINE)
  or any character satisfying ALPHA-CHAR-P

While parsing SOCKET-DIRECTORY. Expected:

     the string "unix:"

While parsing IPV4-PART. Problem:

  The production

    #\[

  does not satisfy the predicate DIGIT-CHAR-P.

Expected:

     any character satisfying DIGIT-CHAR-P


  • [x] data that is being loaded, if relevant
1,1
1,2
1,3
  • [x] How the data is different from what you expected, if relevant Pgloader should be parsing the URI.

WazzaRD10 avatar Jul 26 '19 16:07 WazzaRD10

I can confirm this issue using version:

pgloader version "3.6.9"
compiled with SBCL 2.4.4

stackcoder avatar Oct 03 '24 22:10 stackcoder

The issue might be that we currently don't match hexadecimal digits (which include the characters 0-9 and A-F/a-f).

pgloader/src/parsers/parse-pgpass.lisp I think we should use:

(digit-char-p character 16)

This tells digit-char-p to check for characters valid in base 16 (hexadecimal), instead of just the default base 10 (which only includes 0-9). And this then would allow for all valid characters in an IPv6 address.

j7i avatar May 24 '25 19:05 j7i

I'm not sure how to contribute a pull request or how the workflow regarding permissions should be.

But it would be awesome if you could confirm my assumption, @dimitri, and either introduce the change or point out how I can help further more with this issue.

j7i avatar May 24 '25 20:05 j7i

It's correct that IPv6 host parsing is broken there. Thank you for finding that.

The fix is a bit different though, as what you're seeing there is not function application but a parser rule. Specifically, the symbol character is a terminal that always succeeds and consumes and produces a single character. The surrounding predicate form is then a rule that specifies that the given predicate (of arity 1) applied to the successful result of the inner expression must return true, for the outer to succeed.

In other words: we need to define a predicate of arity 1 to use here, or reformulate the rule.

There are two places where this bug appears (the one you found is not the one affecting your load file).

svantevonerichsen6906 avatar May 24 '25 22:05 svantevonerichsen6906