mongo_fdw icon indicating copy to clipboard operation
mongo_fdw copied to clipboard

Filtering Condition doesn't push down

Open ljy013 opened this issue 8 years ago • 16 comments

I'm testing with mongo_fdw on PG9.5. And I experienced issue. It seems that Filetering condidion doesn't push down.

Firstly, I queried on mongodb shell like below. And it looks filter correctly.

db.test2.find({code:"A", ymd:"20150101"})

2016-07-22T14:58:37.528+0900 D SHARDING [conn27] command: fdw02.$cmd { find: "test2", filter: { code: "A", ymd: "20150101" } } ntoreturn: 1 options: 0 2016-07-22T14:58:37.529+0900 D EXECUTOR [conn27] Scheduling remote command request: RemoteCommand 1789626 -- target:...: db:fdw02 cmd:{ find: "test2", filter: { code: "A", acrs_ymd: "20150101" }, shardVersion: [ Timestamp 0|0, ObjectId('000000000000000000000000') ] }

But When I queried on PG through FDW, there's no filter condition.

select * from tbm_prgs_fdw_temp where code = 'A' and ymd = '20150101';

2016-07-22T14:58:52.761+0900 D SHARDING [conn29] command: fdw02.$cmd { count: "test2" } ntoreturn: 1 options: 0 2016-07-22T14:58:52.761+0900 D NETWORK [conn29] creating pcursor over QSpec { ns: "fdw02.$cmd", n2skip: 0, n2return: 1, options: 0, query: { count: "test2" }, fields: {} } and CInfo { v_ns: "fdw02.test2", filter: {} }

ljy013 avatar Jul 22 '16 06:07 ljy013

Can you please share the explain plan of the query you are executing using mongodb_fdw...?

ahsanhadi avatar Sep 26 '16 15:09 ahsanhadi

mongo_fdw need two count commands for full table and one isMaster command before you exec a select sql with filter.

lifubang avatar Sep 27 '16 02:09 lifubang

It looks like filtering is not being pushed down for me too:

explain (analyze) select * from events where event_id = '00002950-55e7-4b16-aa82-cb86f7a6aa31';

Foreign Scan on events (cost=0.00..0.00 rows=1000 width=1508) (actual time=0.288..3628.348 rows=11 loops=1) Filter: ((event_id)::text = '00002950-55e7-4b16-aa82-cb86f7a6aa31'::text) Rows Removed by Filter: 2455023 Foreign Namespace: db.events Planning time: 1.033 ms Execution time: 3628.448 ms (6 rows)

Any idea @lifubang @ahsanhadi? I have postgresql-9.6.3 and mongodb 3.4.6

chinhngt avatar Jul 27 '17 04:07 chinhngt

@chinhngt I think the event_id is not ObjectId type. If you use ObjectId type column in filter, It will be improved(in PG, the type is "NAME"). The string type in where clause is very slow when use mongo_fdw. If you have no ObjectId type column, you can use numeric type column. For example, you can add a column id, the type of it is long.

lifubang avatar Jul 28 '17 05:07 lifubang

Hi,

I ran a query on the sample warehouse table

"select * from warehouse where warehouse_name='UPS';"

and i can see the filter condition in the mongodb logs. It seems that explain analyze not working with mongdb_fdw but the where clause is getting pushed down and it can checked in the log...

Can you also check your mongo log...

-- Ahsan

ahsanhadi avatar Aug 10 '17 15:08 ahsanhadi

Hi @lifubang,

select * from tbm_prgs_fdw_temp where code = 'A' and ymd = '20150101';

Currently, this condition in WHERE clause is not going to pushdown because boolean expression or nested operator expression not yet supported. We're currently working on supporting pushdown of nested operator expression and boolean expression as well in WHERE condition.

vaibhavdalvi93 avatar Jun 21 '22 14:06 vaibhavdalvi93

What is the status of this. Was this ever addressed?

esatterwhite avatar May 30 '23 20:05 esatterwhite

@esatterwhite, support of pushing down boolean expression or nested operator expression in WHERE has been already added in the latest version of mongo_fdw i.e. 5.5.0. Please use the latest version of mongo_fdw and check. Kindly, let us know your valuable feedback.

vaibhavdalvi93 avatar May 31 '23 04:05 vaibhavdalvi93

@vaibhavdalvi93 mostly unrelated, but the fdw always wants to connect to localhost.


CREATE SERVER mongo_server                                                                        
        FOREIGN DATA WRAPPER mongo_fdw
        OPTIONS (address 'mongo', port '27017', authentication_database 'admin', replica_set 'test-mongo-repl', read_preference 'secondaryPreferred');

CREATE USER MAPPING for CURRENT_USER SERVER mongo_server OPTIONS(username 'admin', password 'password');

CREATE FOREIGN TABLE mg_accounts (_id name, account text, billing json, billingemail text, company text, key text[]) SERVER mongo_server OPTIONS(database 'db', collection 'accounts');

select * from mg_accounts;

\des+
                                                                                                          List of foreign servers
     Name     |  Owner   | Foreign-data wrapper | Access privileges | Type | Version |                                                              FDW options                                                              | Description 
--------------+----------+----------------------+-------------------+------+---------+---------------------------------------------------------------------------------------------------------------------------------------+-------------
 mongo_server | postgres | mongo_fdw            |                   |      |         | (address 'mongo', port '27017', authentication_database 'admin', replica_set 'test-mongo-repl', read_preference 'secondaryPreferred') | 

ERROR:  could not connect to server mongo_server
HINT:  Mongo error: "No suitable servers found (`serverSelectionTryOnce` set): [connection refused calling ismaster on 'localhost:27017']"

esatterwhite avatar May 31 '23 17:05 esatterwhite

Oh, interesting, If I alter the port to non-default, it tries the server, if i set it back to the default port, it goes back to the default host

logdna=# alter SERVER mongo_server OPTIONS(SET port '27018');
ALTER SERVER
db=# \des+
                                                                                                          List of foreign servers
     Name     |  Owner   | Foreign-data wrapper | Access privileges | Type | Version |                                                              FDW options                                                              | Description 
--------------+----------+----------------------+-------------------+------+---------+---------------------------------------------------------------------------------------------------------------------------------------+-------------
 mongo_server | postgres | mongo_fdw            |                   |      |         | (address 'mongo', port '27018', authentication_database 'admin', replica_set 'test-mongo-repl', read_preference 'secondaryPreferred') | 
(1 row)

db=# select * from mg_accounts;
ERROR:  could not connect to server mongo_server
HINT:  Mongo error: "No suitable servers found (`serverSelectionTryOnce` set): [connection refused calling ismaster on 'mongo:27018']"

db=# alter SERVER mongo_server OPTIONS(SET port '27017');
ALTER SERVER

db=# select * from mg_accounts;
ERROR:  could not connect to server mongo_server
HINT:  Mongo error: "No suitable servers found (`serverSelectionTryOnce` set): [connection refused calling ismaster on 'localhost:27017']"

esatterwhite avatar May 31 '23 17:05 esatterwhite

I figured out the problem, nevermind me 😆

esatterwhite avatar May 31 '23 20:05 esatterwhite

@esatterwhite, support of pushing down boolean expression or nested operator expression in WHERE has been already added in the latest version of mongo_fdw i.e. 5.5.0. Please use the latest version of mongo_fdw and check. Kindly, let us know your valuable feedback.

@vaibhavdalvi93 I think I'm still seeing a problem with booleans.

Using pgcli, I'm showing an extension version of 1.1 and a mongo_fdw_version of 50500 which I think is the latest version:

dev_v3_platform> \dx
+-----------+---------+------------+------------------------------------------------------------+
| Name      | Version | Schema     | Description                                                |
|-----------+---------+------------+------------------------------------------------------------|
| citext    | 1.6     | public     | data type for case-insensitive character strings           |
| mongo_fdw | 1.1     | public     | foreign data wrapper for MongoDB access                    |
...

dev_v3_platform> select mongo_fdw_version();
+-------------------+
| mongo_fdw_version |
|-------------------|
| 50500             |
+-------------------+

If I run the following query I will get back records where active is true:

dev_v3_platform> SELECT "customerId", "active" FROM mongo_devices_prod WHERE "active" = false LIMIT 5;
+-------------------+--------+
| customerId        | active |
|-------------------+--------|
| x8t8K4NpvqBQftMKo | True   |
| gaQ3wAmk8zaJ3dFa5 | True   |
| x8t8K4NpvqBQftMKo | True   |
| x8t8K4NpvqBQftMKo | True   |
| gaQ3wAmk8zaJ3dFa5 | True   |
+-------------------+--------+

I can get it to work if I change the WHERE clause to this:

dev_v3_platform> SELECT "customerId", "active" FROM mongo_devices_prod WHERE "active" is false LIMIT 5;
+-------------------+--------+
| customerId        | active |
|-------------------+--------|
| mariCyaD5YzAvF2rj | False  |
| pTTcPKDG4Zzrx2org | False  |
| mariCyaD5YzAvF2rj | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
+-------------------+--------+

I'm assuming that is not the expected situation.

simon-wolf avatar Jun 27 '23 05:06 simon-wolf

Thanks, @simon-wolf for reporting an issue.

This is a bug. The boolean expression in WHERE clause is getting push-down but wrong remote query i.e MongoDB query document is getting formed and this is the reason for wrong output.

I could solve this issue. If possible, you can use attached patch to solve this issue. Please share your valuable feedback on this. v1-Fix-bool-expr-in-WHERE.txt

We will open internal issue and try to commit it in near future.

Thanks, Vaibhav

vaibhavdalvi93 avatar Jun 28 '23 05:06 vaibhavdalvi93

Thank you @vaibhavdalvi93. I'm using the extension via CruncyBridge but I will update my support ticket with them to see if they can test it. If not then my apologies and hopefully they will update to the version containing the eventual fix once it has been released.

Thank you for your help with this and for all of you hard work on the extension.

simon-wolf avatar Jun 28 '23 07:06 simon-wolf

@vaibhavdalvi93 CrunchyBridge installed the patch on my instance and I can confirm that it resolved the issue:

dev_v3_platform> SELECT "customerId", "active" FROM mongo_devices_prod WHERE "active" = false LIMIT 5;
+-------------------+--------+
| customerId        | active |
|-------------------+--------|
| mariCyaD5YzAvF2rj | False  |
| pTTcPKDG4Zzrx2org | False  |
| mariCyaD5YzAvF2rj | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
+-------------------+--------+

simon-wolf avatar Jun 30 '23 07:06 simon-wolf

Thanks, @simon-wolf for acknowledgement and glad to hear that your issue got resolved.

vaibhavdalvi93 avatar Jun 30 '23 09:06 vaibhavdalvi93