mysql_fdw
mysql_fdw copied to clipboard
Running in PL/pgSQL, WHERE clause push-down stops after 5 queries
Hi,
TL;DR: mysql_fdw queries can become dramatically slower within PL/pgSQL FOR loops, due to the query optimizer eliminating the WHERE clause push-down.
I'm trying to import MySQL data into Postgres 9.5, using PL/pgSQL. The MySQL tables are large, and so the WHERE clause push-down is crucial -- otherwise select * from bigtable where id=123
effectively becomes select * from bigtable
, which takes forever.
Unfortunately it seems the WHERE clause push-down is not happening after 5 iterations of a PL/pgSQL FOR loop.
For instance, say I have a MySQL table:
MariaDB [testdb]> select * from users ;
+------+--------+--------+
| id | name | photo |
+------+--------+--------+
| 1 | Bob | xxxx |
| 2 | Fred | yyyyy |
| 3 | Sally | zzzz |
| 4 | Mary | qzqzqz |
| 5 | Jimmy | abcabc |
| 6 | User6 | 12345 |
| 7 | User7 | 12346 |
| 8 | User8 | 12347 |
| 9 | User9 | 12348 |
| 10 | User10 | 12349 |
+------+--------+--------+
10 rows in set (0.00 sec)
This is accessible from a Postgres database:
testdb=# \d
List of relations
┌────────┬───────┬───────────────┬──────────┐
│ Schema │ Name │ Type │ Owner │
├────────┼───────┼───────────────┼──────────┤
│ public │ users │ foreign table │ postgres │
└────────┴───────┴───────────────┴──────────┘
(1 row)
testdb=# select * from users;
INFO: Successfully connected to MySQL database testdb at server 127.0.0.1 via TCP/IP with cipher <none> (server version: 5.5.5-10.0.27-MariaDB-0ubuntu0.16.04.1, protocol version: 10)
┌────┬────────┬────────────────┐
│ id │ name │ photo │
├────┼────────┼────────────────┤
│ 1 │ Bob │ \x78787878 │
│ 2 │ Fred │ \x7979797979 │
│ 3 │ Sally │ \x7a7a7a7a │
│ 4 │ Mary │ \x717a717a717a │
│ 5 │ Jimmy │ \x616263616263 │
│ 6 │ User6 │ \x3132333435 │
│ 7 │ User7 │ \x3132333436 │
│ 8 │ User8 │ \x3132333437 │
│ 9 │ User9 │ \x3132333438 │
│ 10 │ User10 │ \x3132333439 │
└────┴────────┴────────────────┘
(10 rows)
The WHERE clause push-down for plain SQL (not PL/pgSQL) queries work fine. For instance:
$ cat testqueries.sql
SELECT name FROM users WHERE id=1;
SELECT name FROM users WHERE id=2;
SELECT name FROM users WHERE id=3;
SELECT name FROM users WHERE id=4;
SELECT name FROM users WHERE id=5;
SELECT name FROM users WHERE id=6;
SELECT name FROM users WHERE id=7;
SELECT name FROM users WHERE id=8;
SELECT name FROM users WHERE id=9;
SELECT name FROM users WHERE id=10;
$ psql -tAq testdb < testqueries.sql
INFO: Successfully connected to MySQL database testdb at server 127.0.0.1 via TCP/IP with cipher <none> (server version: 5.5.5-10.0.27-MariaDB-0ubuntu0.16.04.1, protocol version: 10)
Bob
Fred
Sally
Mary
Jimmy
User6
User7
User8
User9
User10
I can tell the WHERE clause is passing through by watching /var/log/mysql/mysql.log:
94 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
94 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
...
94 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
94 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
...
94 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
94 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
...
94 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 10))
94 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 10))
But say I run an equivalent loop in PL/pgSQL:
$ cat slowloop.sql
create or replace function test() returns void as $$
DECLARE
n varchar;
BEGIN
FOR x IN 1..10 LOOP
select name into n from users where id=x;
raise notice 'Found user %', n;
end loop;
return;
END
$$ LANGUAGE plpgsql;
SELECT test();
$ psql -tAq testdb < slowloop.sql
INFO: Successfully connected to MySQL database testdb at server 127.0.0.1 via TCP/IP with cipher <none> (server version: 5.5.5-10.0.27-MariaDB-0ubuntu0.16.04.1, protocol version: 10)
CONTEXT: SQL statement "select name from users where id=x"
PL/pgSQL function test() line 6 at SQL statement
NOTICE: Found user Bob
NOTICE: Found user Fred
NOTICE: Found user Sally
NOTICE: Found user Mary
NOTICE: Found user Jimmy
NOTICE: Found user User6
NOTICE: Found user User7
NOTICE: Found user User8
NOTICE: Found user User9
NOTICE: Found user User10
The /var/log/mysql/mysql.log queries are (in full):
95 Query SET sql_mode='ANSI_QUOTES'
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
95 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
95 Close stmt
95 Query SET sql_mode='ANSI_QUOTES'
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
95 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
95 Close stmt
95 Query SET sql_mode='ANSI_QUOTES'
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
95 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
95 Close stmt
95 Query SET sql_mode='ANSI_QUOTES'
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 4))
95 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 4))
95 Close stmt
95 Query SET sql_mode='ANSI_QUOTES'
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 5))
95 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 5))
95 Close stmt
95 Query SET sql_mode='ANSI_QUOTES'
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `id`, `name` FROM `testdb`.`users`
95 Execute SELECT `id`, `name` FROM `testdb`.`users`
95 Close stmt
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `id`, `name` FROM `testdb`.`users`
95 Execute SELECT `id`, `name` FROM `testdb`.`users`
95 Close stmt
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `id`, `name` FROM `testdb`.`users`
95 Execute SELECT `id`, `name` FROM `testdb`.`users`
95 Close stmt
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `id`, `name` FROM `testdb`.`users`
95 Execute SELECT `id`, `name` FROM `testdb`.`users`
95 Close stmt
95 Query SET time_zone = '+00:00'
95 Query SET sql_mode='ANSI_QUOTES'
95 Prepare SELECT `id`, `name` FROM `testdb`.`users`
95 Execute SELECT `id`, `name` FROM `testdb`.`users`
95 Close stmt
95 Quit
Notice that the WHERE clause disappears after 5 queries.
Searching the web, I found other references to PL/pgSQL slowing down after 5 queries:
https://www.postgresql.org/message-id/CALgmQD9ebXKPgbK5jjbaJ9kCWtOywjjpv-X9+eN0hn5gfGJOjQ@mail.gmail.com
I don't quite understand the replies - something about query plan optimizers - but wrapping the select in an EXECUTE does indeed make it fast again:
$ cat fastloop.sql
create or replace function test() returns void as $$
DECLARE
n varchar;
BEGIN
FOR x IN 1..10 LOOP
EXECUTE 'select name from users where id=$1' INTO n USING x;
raise notice 'Found user %', n;
end loop;
return;
END
$$ LANGUAGE plpgsql;
SELECT test();
If this is a query optimizer problem, can anything be done at the mysql_fdw layer to tell the optimizer to stop messing with it? I've tried use_remote_estimate 'true'
but it made no difference.
Thanks, Jeff
PS: the script to create the MySQL testdb
database:
$ cat testdb_mysql.sql
DROP DATABASE if exists testdb ;
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE users (
id integer,
name text,
photo blob
);
INSERT INTO `users` VALUES (1,'Bob','xxxx');
INSERT INTO `users` VALUES (2,'Fred','yyyyy');
INSERT INTO `users` VALUES (3,'Sally','zzzz');
INSERT INTO `users` VALUES (4,'Mary','qzqzqz');
INSERT INTO `users` VALUES (5,'Jimmy','abcabc');
INSERT INTO `users` VALUES (6,'User6','12345');
INSERT INTO `users` VALUES (7,'User7','12346');
INSERT INTO `users` VALUES (8,'User8','12347');
INSERT INTO `users` VALUES (9,'User9','12348');
INSERT INTO `users` VALUES (10,'User10','12349');
and to create a linked Postgres testdb
database:
CREATE DATABASE testdb;
GRANT usage ON SCHEMA PUBLIC TO jturner;
\c testdb;
CREATE EXTENSION mysql_fdw;
CREATE SERVER mysql_server FOREIGN DATA WRAPPER mysql_fdw OPTIONS (host '127.0.0.1', port '3306', use_remote_estimate 'true');
CREATE USER MAPPING FOR postgres SERVER mysql_server OPTIONS (username 'root', password 'secret');
CREATE USER MAPPING FOR jturner SERVER mysql_server OPTIONS (username 'root', password 'secret');
GRANT USAGE ON FOREIGN SERVER mysql_server to jturner;
IMPORT FOREIGN SCHEMA testdb FROM SERVER mysql_server INTO public;
select * from users;
Hi Jeff,
Thanks for sharing the test cases, the where clause should get pushed down for every iteration. We will try and investigate this issue asap and revert back on this thread...
-- Ahsan
Let me add to this. I have noted the same behavior on Postgres 9.6. It is not restricted to for loops, but simple invocations also cause this. For instance, the following function returns quickly 5 times, but on the 6th time and subsequently runs slowly.
Example 1
CREATE OR REPLACE function ess.get_mysql_max_prid(_min_prid bigint) RETURNS bigint AS $$
DECLARE
max_prid bigint;
BEGIN
IF _min_prid IS NOT NULL THEN
RAISE NOTICE 'prid is %',_min_prid;
max_prid := max(prid) FROM mysql_extservice_all WHERE prid >= _min_prid;
ELSE
RAISE NOTICE 'min_prid is null';
SELECT max(prid) FROM mysql_extservice_all INTO max_prid;
END IF;
RETURN max_prid;
END;
$$ language 'plpgsql';
Push Down works
Immediately after creation, the following all return in less that 1 second
ess=> select * from ess.get_mysql_max_prid(6328025);
NOTICE: prid is 6328025
get_mysql_max_prid
--------------------
6348494
(1 row)
ess=> select * from ess.get_mysql_max_prid(6328025);
NOTICE: prid is 6328025
get_mysql_max_prid
--------------------
6348495
(1 row)
ess=> select * from ess.get_mysql_max_prid(6328025);
NOTICE: prid is 6328025
get_mysql_max_prid
--------------------
6348498
(1 row)
ess=> select * from ess.get_mysql_max_prid(6328025);
NOTICE: prid is 6328025
get_mysql_max_prid
--------------------
6348500
(1 row)
ess=> select * from ess.get_mysql_max_prid(6328025);
NOTICE: prid is 6328025
get_mysql_max_prid
--------------------
6348509
(1 row)
Problematic
Eventually, returns, but takes approximately 40 seconds. All subsequent invocations also take this long. Re-defining the stored procedure resets the behavior to working 5 times quickly.
ess=> select * from ess.get_mysql_max_prid(6328025);
NOTICE: prid is 6328025
--------------------
6348513
(1 row)
@waTeim @jefft Giant kudos for figuring out this behavior! 🎆
I just ended up using materialized views and only issuing simple SELECT
statements with mysql_fdw
because it was so buggy with pushing quals down causing queries to sporadically fail (JOIN
s were very problematic).
This is a huge issue and needs to be fixed. I pity the fool using mysql_fdw
in production without catching this in development!
This is an important issue, we will try and fix this on priority...
On Sun, Jan 8, 2017 at 4:30 AM, Jeffrey [email protected] wrote:
@waTeim https://github.com/waTeim @jefft https://github.com/jefft Giant kudos for figuring out this behavior! 🎆
I just ended up using MATERIALIZED VIEWS and only issuing simple SELECT statements with mysql_fdw because it was so buggy with pushing quals down causing queries to sporadically fail.
This is a huge issue and needs to be fixed. I pity the fool using mysql_fdw in production without catching this in development!
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/EnterpriseDB/mysql_fdw/issues/118#issuecomment-271117676, or mute the thread https://github.com/notifications/unsubscribe-auth/AHmqpLtKg_H5E9u7hxDCA_HX9_jxIdgfks5rQCAfgaJpZM4KEly9 .
-- Ahsan Hadi Snr Director Product Development EnterpriseDB Corporation The Enterprise Postgres Company
Phone: +92-51-8358874 Mobile: +92-333-5162114
Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
Hi, We are finally getting around to looking at this issue. It seems that the planner stops pushing down the where clause after a certain number of iterations. The same behaviour is reproducible with postgres_fdw using a postgres foreign server. We will continue to look at this problem and keep your posted...
@jefft Push down support after 5 calls is supported now, can you please check.
@jefft: Can you check or close this issue? I reviewed the fix and it looks like the test cases cover the reported issue.
@ibrarahmad, @jmealo, using latest from git, instead of slow queries I now get outright breakage after the fifth query:
postgres@jturner-desktop:~$ psql -tAq testdb < slowloop.sql
NOTICE: Found user Bob
NOTICE: Found user Fred
NOTICE: Found user Sally
NOTICE: Found user Mary
NOTICE: Found user Jimmy
ERROR: failed to execute the MySQL query:
You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '?))' at line 1
CONTEXT: SQL statement "select name from users where id=x"
PL/pgSQL function test() line 6 at SQL statement
postgres@jturner-desktop:~$
In mysql.log I see:
170504 11:41:45 2 Connect root@localhost as anonymous on testdb
2 Query SET sql_mode='ANSI_QUOTES'
2 Query EXPLAIN SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
2 Query SET time_zone = '+00:00'
2 Query SET sql_mode='ANSI_QUOTES'
2 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
2 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 1))
2 Close stmt
2 Query SET sql_mode='ANSI_QUOTES'
2 Query EXPLAIN SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
2 Query SET time_zone = '+00:00'
2 Query SET sql_mode='ANSI_QUOTES'
2 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
2 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 2))
2 Close stmt
2 Query SET sql_mode='ANSI_QUOTES'
2 Query EXPLAIN SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
2 Query SET time_zone = '+00:00'
2 Query SET sql_mode='ANSI_QUOTES'
2 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
2 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 3))
2 Close stmt
2 Query SET sql_mode='ANSI_QUOTES'
2 Query EXPLAIN SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 4))
2 Query SET time_zone = '+00:00'
2 Query SET sql_mode='ANSI_QUOTES'
2 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 4))
2 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 4))
2 Close stmt
2 Query SET sql_mode='ANSI_QUOTES'
2 Query EXPLAIN SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 5))
2 Query SET time_zone = '+00:00'
2 Query SET sql_mode='ANSI_QUOTES'
2 Prepare SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 5))
2 Execute SELECT `name` FROM `testdb`.`users` WHERE ((`id` = 5))
2 Close stmt
2 Query SET sql_mode='ANSI_QUOTES'
2 Query EXPLAIN SELECT `name` FROM `testdb`.`users` WHERE ((`id` = ?))
2 Quit
Here is the full series of commands for replicating, tested in a clean Ubuntu 16.04.2 (full console output in transcript.txt):
apt update
apt install -y postgresql-9.5 mariadb-server mariadb-client libmysqlclient-dev postgresql-server-dev-9.5 make gcc
perl -i -pe 's/^#general_log/general_log/g' /etc/mysql/mariadb.conf.d/50-server.cnf; systemctl restart mysql # turn on mysql.log
# Change 'root' auth from socket to password so it works from psql.
echo "grant all privileges on *.* to 'root'@'localhost' identified by 'secret'" | mysql
cat <<'EOF' > ~/.my.cnf
[mysql]
user=root
password=secret
EOF
mkdir -p ~/src/github.com/EnterpriseDB
git clone https://github.com/EnterpriseDB/mysql_fdw.git ~/src/github.com/EnterpriseDB/mysql_fdw # At commit 86ad3af
cd src/github.com/EnterpriseDB/mysql_fdw/
sudo make USE_PGXS=1 install
cat <<'EOF' > /tmp/testdb_mysql.sql
DROP DATABASE if exists testdb ;
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE users (
id integer,
name text,
photo blob
);
INSERT INTO `users` VALUES (1,'Bob','xxxx');
INSERT INTO `users` VALUES (2,'Fred','yyyyy');
INSERT INTO `users` VALUES (3,'Sally','zzzz');
INSERT INTO `users` VALUES (4,'Mary','qzqzqz');
INSERT INTO `users` VALUES (5,'Jimmy','abcabc');
INSERT INTO `users` VALUES (6,'User6','12345');
INSERT INTO `users` VALUES (7,'User7','12346');
INSERT INTO `users` VALUES (8,'User8','12347');
INSERT INTO `users` VALUES (9,'User9','12348');
INSERT INTO `users` VALUES (10,'User10','12349');
EOF
cat <<'EOF' > /tmp/testdb_psql.sql
CREATE DATABASE testdb;
GRANT usage ON SCHEMA PUBLIC TO postgres;
\c testdb;
CREATE EXTENSION mysql_fdw;
CREATE SERVER mysql_server FOREIGN DATA WRAPPER mysql_fdw OPTIONS (host '127.0.0.1', port '3306', use_remote_estimate 'true');
CREATE USER MAPPING FOR postgres SERVER mysql_server OPTIONS (username 'root', password 'secret');
GRANT USAGE ON FOREIGN SERVER mysql_server to postgres;
IMPORT FOREIGN SCHEMA testdb FROM SERVER mysql_server INTO public;
select * from users;
EOF
su - postgres -c psql < /tmp/testdb_psql.sql
cat <<'EOF' > /tmp/slowloop.sql
create or replace function test() returns void as $$
DECLARE
n varchar;
BEGIN
FOR x IN 1..10 LOOP
select name into n from users where id=x;
raise notice 'Found user %', n;
end loop;
return;
END
$$ LANGUAGE plpgsql;
SELECT test();
EOF
su - postgres -c 'psql testdb' < /tmp/slowloop.sql
Hi @ahsanhadi @jefft , Has this issue been fixed? I'm the fool using mysql_fdw in production without catching this issue in development!
I use postgres12.6 & mysql_fdw 2.5.5
This is my table definition: database diagram
--PG
CREATE TABLE DDTINTO_MASTER (
AC_DATE int NOT NULL DEFAULT 0,
AC_NO varchar(12) NOT NULL DEFAULT '',
CCY varchar(3) NOT NULL DEFAULT '',
QUALIFIER char(1) NOT NULL DEFAULT '',
DEP decimal(16, 2) NOT NULL DEFAULT 0.00
....
TS TIMESTAMP WITHOUT TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP
) PARTITION BY RANGE (AC_DATE);
CREATE TABLE ddtinto_201909 PARTITION OF DDTINTO_MASTER FOR VALUES FROM (10000101) TO (20191001);
CREATE TABLE ddtinto_201910 PARTITION OF DDTINTO_MASTER FOR VALUES FROM (20191001) TO (20191101);
...
CREATE INDEX ddtinto_pk on DDTINTO_MASTER (ac_no, ccy, qualifier, ac_date);
CREATE VIEW DDTINTO AS
SELECT * FROM DDTINTO_CURRENT
UNION ALL
SELECT * FROM DDTINTO_MASTER;
--MySQL
create view view_ddtinto_current
as
select * from ddtinto
where ac_date >= 20210401;
Sometimes , FDW can push all where clauses to MySQL, sometimes can only push part of the where caluses. Here is the latter's explain plan, Remote query lack 2 where clauses: ac_no and ccy
2021-06-04 00:22:10.311 CST [2695] LOG: duration: 928091.882 ms execute S_8: SELECT TRIM(AC_NO) AC_NO
, TRIM(CCY) CCY
, AC_DATE
, DEP
, QUALIFIER
FROM DDTINTO
WHERE QUALIFIER IN ('A', 'C')
and AC_NO = $1
and CCY = $2
and AC_DATE >= $3
and AC_DATE <= $4
ORDER BY AC_DATE DESC
limit $5
2021-06-04 00:22:10.311 CST [2695] DETAIL: parameters: $1 = '889005309160', $2 = 'HKD', $3 = '20210306', $4 = '20210604', $5 = '180'
2021-06-04 00:22:10.312 CST [2695] LOG: duration: 928091.872 ms plan:
Query Text: SELECT TRIM(AC_NO) AC_NO
, TRIM(CCY) CCY
, AC_DATE
, DEP
, QUALIFIER
FROM DDTINTO
WHERE QUALIFIER IN ('A', 'C')
and AC_NO = $1
and CCY = $2
and AC_DATE >= $3
and AC_DATE <= $4
ORDER BY AC_DATE DESC
limit $5
Limit (cost=1273.68..1273.94 rows=102 width=94) (actual time=928091.796..928091.825 rows=88 loops=1)
Output: (btrim((ddtinto_current.ac_no)::text)), (btrim((ddtinto_current.ccy)::text)), ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Buffers: shared hit=64
-> Sort (cost=1273.68..1276.23 rows=1020 width=94) (actual time=928091.794..928091.806 rows=88 loops=1)
Output: (btrim((ddtinto_current.ac_no)::text)), (btrim((ddtinto_current.ccy)::text)), ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Sort Key: ddtinto_current.ac_date DESC
Sort Method: quicksort Memory: 31kB
Buffers: shared hit=64
-> Result (cost=25.00..1222.71 rows=1020 width=94) (actual time=231239.514..928091.626 rows=88 loops=1)
Output: btrim((ddtinto_current.ac_no)::text), btrim((ddtinto_current.ccy)::text), ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Buffers: shared hit=64
-> Append (cost=25.00..1207.41 rows=1020 width=87) (actual time=231239.509..928091.454 rows=88 loops=1)
Buffers: shared hit=64
Subplans Removed: 18
-> Foreign Scan on public.ddtinto_current (cost=25.00..1025.00 rows=1000 width=88) (actual time=231239.507..928091.234 rows=33 loops=1)
Output: ddtinto_current.ac_no, ddtinto_current.ccy, ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Filter: (((ddtinto_current.ac_no)::text = ($1)::text) AND ((ddtinto_current.ccy)::text = ($2)::text))
Rows Removed by Filter: 34667775
Remote server startup cost: 25
Remote query: SELECT `ac_date`, `ac_no`, `ccy`, `qualifier`, `dep` FROM `ocbs387`.`view_ddtinto_current` WHERE (`qualifier` IN ('A', 'C')) AND ((`ac_date` >= ?)) AND ((`ac_date` <= ?))
-> Index Scan using ddtinto_202103_ac_no_ccy_qualifier_ac_date_idx on public.ddtinto_202103 (cost=0.56..9.10 rows=1 width=26) (actual time=0.051..0.089 rows=25 loops=1)
Output: ddtinto_202103.ac_no, ddtinto_202103.ccy, ddtinto_202103.ac_date, ddtinto_202103.dep, ddtinto_202103.qualifier
Index Cond: (((ddtinto_202103.ac_no)::text = ($1)::text) AND ((ddtinto_202103.ccy)::text = ($2)::text) AND (ddtinto_202103.ac_date >= $3) AND (ddtinto_202103.ac_date <= $4))
Filter: (ddtinto_202103.qualifier = ANY ('{A,C}'::bpchar[]))
Buffers: shared hit=29
-> Index Scan using ddtinto_202104_ac_no_ccy_qualifier_ac_date_idx on public.ddtinto_202104 (cost=0.56..9.08 rows=1 width=26) (actual time=0.022..0.063 rows=30 loops=1)
Output: ddtinto_202104.ac_no, ddtinto_202104.ccy, ddtinto_202104.ac_date, ddtinto_202104.dep, ddtinto_202104.qualifier
Index Cond: (((ddtinto_202104.ac_no)::text = ($1)::text) AND ((ddtinto_202104.ccy)::text = ($2)::text) AND (ddtinto_202104.ac_date >= $3) AND (ddtinto_202104.ac_date <= $4))
Filter: (ddtinto_202104.qualifier = ANY ('{A,C}'::bpchar[]))
Rows Removed by Filter: 1
Buffers: shared hit=35
Hi @MondayLiu,
The original issue reported above seems to be fixed however, I suspect some other issue by looking at your test case. I could not reproduce this exact issue at my end. It would be good if you share exact DDLs from PG and MySQL with sample data from PG and MySQL to reproduce the issue.
Moreover, can you please confirm on below points:
- Are you observing this issue after 5th consecutive execution? or what is the pattern you observed for this particular issue?
- Did you check the execution time difference in case of complete where clause pushdown and partial where clause pushdown? It might be possible that the planner is choosing the cheapest plan in this case.
- By Looking at explain plan, it looks like index scan is happening on ac_no, ccy and ac_date columns. What if you disable the index scan for a particular query? What plan you are getting after disabling the same? Use below command to disable the index scan: set enable_indexscan to off;
Thanks and Regards, Suraj Kharage
Hi @surajkharage19 , Thanks for response! After I reproduce in development! I'll new another issue. So far, only the following replies can be made:
- The probability of 97/1020 leads to partial where clause pushdown.
Here is complete where clause pushdown:
2021-06-04 00:06:28.015 CST [2617] LOG: duration: 741.972 ms plan:
Query Text: SELECT TRIM(AC_NO) AC_NO
, TRIM(CCY) CCY
, AC_DATE
, DEP
, QUALIFIER
FROM DDTINTO
WHERE QUALIFIER IN ('A', 'C')
and AC_NO = $1
and CCY = $2
and AC_DATE >= $3
and AC_DATE <= $4
ORDER BY AC_DATE DESC
limit $5
Limit (cost=1366.36..1366.81 rows=180 width=93) (actual time=741.891..741.921 rows=89 loops=1)
Output: (btrim((ddtinto_current.ac_no)::text)), (btrim((ddtinto_current.ccy)::text)), ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Buffers: shared hit=22 read=43 dirtied=38
-> Sort (cost=1366.36..1369.02 rows=1065 width=93) (actual time=741.889..741.902 rows=89 loops=1)
Output: (btrim((ddtinto_current.ac_no)::text)), (btrim((ddtinto_current.ccy)::text)), ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Sort Key: ddtinto_current.ac_date DESC
Sort Method: quicksort Memory: 31kB
Buffers: shared hit=22 read=43 dirtied=38
-> Result (cost=25.00..1321.14 rows=1065 width=93) (actual time=45.109..741.696 rows=89 loops=1)
Output: btrim((ddtinto_current.ac_no)::text), btrim((ddtinto_current.ccy)::text), ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Buffers: shared hit=22 read=43 dirtied=38
-> Append (cost=25.00..1305.17 rows=1065 width=84) (actual time=45.104..741.483 rows=89 loops=1)
Buffers: shared hit=22 read=43 dirtied=38
-> Foreign Scan on public.ddtinto_current (cost=25.00..1025.00 rows=1000 width=88) (actual time=45.102..45.129 rows=33 loops=1)
Output: ddtinto_current.ac_no, ddtinto_current.ccy, ddtinto_current.ac_date, ddtinto_current.dep, ddtinto_current.qualifier
Remote server startup cost: 25
Remote query: SELECT `ac_date`, `ac_no`, `ccy`, `qualifier`, `dep` FROM `ocbs387`.`view_ddtinto_current` WHERE (`qualifier` IN ('A', 'C')) AND ((`ac_date` >= 20210306)) AND ((`ac_date` <= 20210604)) AND ((`ac_no` = '883005072877')) AND ((`ccy` = 'HKD'))
-> Index Scan using ddtinto_202103_ac_no_ccy_qualifier_ac_date_idx on public.ddtinto_202103 (cost=0.56..129.39 rows=31 width=26) (actual time=8.999..356.128 rows=26 loops=1)
Output: ddtinto_202103.ac_no, ddtinto_202103.ccy, ddtinto_202103.ac_date, ddtinto_202103.dep, ddtinto_202103.qualifier
Index Cond: (((ddtinto_202103.ac_no)::text = '883005072877'::text) AND ((ddtinto_202103.ccy)::text = 'HKD'::text) AND (ddtinto_202103.ac_date >= 20210306) AND (ddtinto_202103.ac_date <= 20210604))
Filter: (ddtinto_202103.qualifier = ANY ('{A,C}'::bpchar[]))
Buffers: shared hit=5 read=25 dirtied=23
-> Index Scan using ddtinto_202104_ac_no_ccy_qualifier_ac_date_idx on public.ddtinto_202104 (cost=0.56..145.46 rows=34 width=26) (actual time=30.783..340.156 rows=30 loops=1)
Output: ddtinto_202104.ac_no, ddtinto_202104.ccy, ddtinto_202104.ac_date, ddtinto_202104.dep, ddtinto_202104.qualifier
Index Cond: (((ddtinto_202104.ac_no)::text = '883005072877'::text) AND ((ddtinto_202104.ccy)::text = 'HKD'::text) AND (ddtinto_202104.ac_date >= 20210306) AND (ddtinto_202104.ac_date <= 20210604))
Filter: (ddtinto_202104.qualifier = ANY ('{A,C}'::bpchar[]))
Rows Removed by Filter: 1
Buffers: shared hit=17 read=18 dirtied=15
-
> 500 milliseconds SQL.
2021-06-04 00:02:40.887 CST [2581] LOG: duration: 258338.636 ms
2021-06-04 00:06:28.015 CST [2581] LOG: duration: 741.972 ms
2021-06-04 00:06:41.625 CST [2581] LOG: duration: 502.321 ms
2021-06-04 00:07:11.666 CST [2581] LOG: duration: 527.322 ms
2021-06-04 00:07:21.340 CST [2564] LOG: duration: 233966.558 ms
2021-06-04 00:08:24.679 CST [2620] LOG: duration: 592.067 ms
2021-06-04 00:08:24.679 CST [2946] LOG: duration: 525.514 ms
2021-06-04 00:08:45.064 CST [2618] LOG: duration: 500.426 ms
2021-06-04 00:08:47.226 CST [2620] LOG: duration: 533.847 ms
2021-06-04 00:08:48.200 CST [2618] LOG: duration: 847.036 ms
2021-06-04 00:09:05.815 CST [2946] LOG: duration: 604.529 ms
2021-06-04 00:09:05.815 CST [2945] LOG: duration: 531.357 ms
2021-06-04 00:09:13.763 CST [2947] LOG: duration: 511.709 ms
2021-06-04 00:09:17.470 CST [2620] LOG: duration: 583.923 ms
2021-06-04 00:09:17.470 CST [2947] LOG: duration: 527.431 ms
2021-06-04 00:09:59.000 CST [2621] LOG: duration: 514.290 ms
2021-06-04 00:10:03.128 CST [2946] LOG: duration: 544.570 ms
2021-06-04 00:10:44.414 CST [2635] LOG: duration: 515.128 ms
2021-06-04 00:10:51.638 CST [2636] LOG: duration: 530.766 ms
2021-06-04 00:11:05.345 CST [2636] LOG: duration: 689.052 ms
2021-06-04 00:11:05.345 CST [2996] LOG: duration: 621.094 ms
2021-06-04 00:11:27.960 CST [6838] LOG: duration: 550.326 ms
2021-06-04 00:11:44.759 CST [6908] LOG: duration: 547.756 ms
2021-06-04 00:11:49.326 CST [6871] LOG: duration: 836.809 ms
2021-06-04 00:11:49.326 CST [2983] LOG: duration: 787.258 ms
2021-06-04 00:11:49.792 CST [6908] LOG: duration: 670.352 ms
2021-06-04 00:11:49.793 CST [6838] LOG: duration: 601.234 ms
2021-06-04 00:12:07.626 CST [2949] LOG: duration: 1065.761 ms
2021-06-04 00:12:51.691 CST [6908] LOG: duration: 575.303 ms
2021-06-04 00:13:22.716 CST [2996] LOG: duration: 831.634 ms
2021-06-04 00:13:25.306 CST [6944] LOG: duration: 549.983 ms
2021-06-04 00:13:34.876 CST [6944] LOG: duration: 941.696 ms
...
...
2021-06-04 00:22:10.311 CST [2695] LOG: duration: 928091.882 ms
2021-06-04 00:23:04.557 CST [7526] LOG: duration: 507.980 ms
2021-06-04 00:23:16.398 CST [7562] LOG: duration: 525.549 ms
2021-06-04 00:23:16.523 CST [7510] LOG: duration: 592.123 ms
2021-06-04 00:23:23.602 CST [7527] LOG: duration: 505.949 ms
2021-06-04 00:23:31.135 CST [7510] LOG: duration: 671.336 ms
2021-06-04 00:23:31.136 CST [7578] LOG: duration: 594.377 ms
2021-06-04 00:23:42.087 CST [7649] LOG: duration: 904.054 ms
2021-06-04 00:23:49.156 CST [7563] LOG: duration: 527.979 ms
2021-06-04 00:23:50.266 CST [2580] LOG: duration: 1013959.478 ms
2021-06-04 00:24:24.573 CST [7526] LOG: duration: 645.630 ms
2021-06-04 00:24:24.573 CST [7563] LOG: duration: 711.659 ms
2021-06-04 00:24:34.395 CST [7563] LOG: duration: 637.527 ms
2021-06-04 00:24:34.396 CST [7611] LOG: duration: 536.465 ms
2021-06-04 00:24:36.734 CST [7650] LOG: duration: 678.372 ms
2021-06-04 00:24:47.554 CST [7611] LOG: duration: 813.259 ms
2021-06-04 00:24:47.554 CST [7735] LOG: duration: 733.719 ms
2021-06-04 00:25:06.778 CST [7649] LOG: duration: 529.749 ms
2021-06-04 00:25:06.779 CST [7701] LOG: duration: 503.374 ms
2021-06-04 00:25:24.519 CST [7650] LOG: duration: 540.833 ms
2021-06-04 00:25:24.519 CST [7649] LOG: duration: 614.708 ms
2021-06-04 00:25:30.435 CST [7650] LOG: duration: 503.289 ms
2021-06-04 00:25:32.233 CST [7578] LOG: duration: 593.404 ms
2021-06-04 00:25:32.233 CST [7701] LOG: duration: 530.180 ms
2021-06-04 00:25:44.033 CST [7578] LOG: duration: 831.246 ms
2021-06-04 00:25:44.033 CST [7735] LOG: duration: 910.863 ms
2021-06-04 00:26:00.344 CST [7735] LOG: duration: 656.323 ms
2021-06-04 00:26:00.344 CST [7824] LOG: duration: 604.716 ms
2021-06-04 00:26:03.642 CST [7702] LOG: duration: 572.389 ms
2021-06-04 00:26:03.642 CST [7858] LOG: duration: 634.322 ms
2021-06-04 00:26:06.218 CST [7824] LOG: duration: 515.326 ms
2021-06-04 00:26:10.056 CST [7701] LOG: duration: 503.993 ms
2021-06-04 00:26:14.506 CST [7701] LOG: duration: 536.509 ms
2021-06-04 00:26:46.010 CST [7858] LOG: duration: 541.532 ms
2021-06-04 00:26:57.437 CST [7824] LOG: duration: 569.747 ms
2021-06-04 00:26:57.934 CST [2942] LOG: duration: 1186707.760 ms
2021-06-04 00:27:01.335 CST [7751] LOG: duration: 1141.671 ms
2021-06-04 00:27:01.335 CST [7702] LOG: duration: 886.833 ms
2021-06-04 00:27:01.335 CST [7751] LOG: duration: 1141.649 ms
2021-06-04 00:27:12.155 CST [2564] LOG: duration: 1178133.430 ms
2021-06-04 00:27:12.636 CST [7736] LOG: duration: 509.778 ms
2021-06-04 00:27:12.636 CST [7860] LOG: duration: 613.868 ms
...
...
- PG holds historical data and MySQL holds current data.There are 80 million records in total. Two-thirds are in PG.
- All 5 where caluse in SQL are required.
Thanks and Regards, Monday Liu
Thanks for sharing the information.
This issue seems occurring randomly, not following any pattern. There might be chances that the planner is doing some optimization based on parameter values. I would appreciate it if you could provide the exact test case to reproduce this issue and that will help us with further troubleshooting. Also, are you observing the same behavior when you execute the above query (with exact values) on psql?