
Handle '\n', '\r' and '\t' in s3 URLs

Open jingwen-yang-yjw opened this issue 3 years ago • 0 comments

This PR fixes query-time failures caused by '\n', '\r' or '\t' in S3 URLs.

When we run the following commands:

$ createdb testdb
$ psql testdb
> CREATE READABLE EXTERNAL TABLE test_table (date text, time text, open float, high float,low float, volume int) LOCATION('s3://s3-us-west-2.amazonaws.com/@read_prefix@/oneline/
config=/home/gpadmin/s3.conf') format 'csv';
> SELECT count(*) FROM test_table;

The query fails with the error "Fail to parse URL".

The failure is caused by the '\n' between "s3://s3-us-west-2.amazonaws.com/@read_prefix@/oneline/" and "config=/home/gpadmin/s3.conf" in the S3 URL.

This PR adds support for handling '\n', '\r' and '\t' in S3 URLs. The changes affect only S3 URLs.
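As a rough illustration of the idea, and not the code in this PR, the sketch below normalizes a location string before it is parsed. It assumes that '\n', '\r' and '\t' can be treated like the space that normally separates the URL from its `config=` option; the helper name `normalize_s3_url` is hypothetical.

```python
# Hypothetical helper, not the PR's actual implementation.
# Assumption: newline, carriage return and tab are treated the same as the
# space that normally separates the URL from options such as "config=".
_WHITESPACE_TO_SPACE = str.maketrans({"\n": " ", "\r": " ", "\t": " "})


def normalize_s3_url(url: str) -> str:
    # Replace every newline, carriage return or tab with a single space.
    return url.translate(_WHITESPACE_TO_SPACE)


# The LOCATION string from the example above, with the embedded newline.
location = (
    "s3://s3-us-west-2.amazonaws.com/@read_prefix@/oneline/\n"
    "config=/home/gpadmin/s3.conf"
)
print(normalize_s3_url(location))
# prints: s3://s3-us-west-2.amazonaws.com/@read_prefix@/oneline/ config=/home/gpadmin/s3.conf
```

Replacing the characters with a space, rather than deleting them, keeps the URL and the `config=` option as two separate tokens in this sketch.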

Another PR addresses the same issue, but its changes affect much more than S3 URLs: https://github.com/greenplum-db/gpdb/pull/8694

jingwen-yang-yjw, Aug 01 '22 03:08