pg_stat_plans icon indicating copy to clipboard operation
pg_stat_plans copied to clipboard

RFC: IN list compaction for normalized_query

Open terrorobe opened this issue 12 years ago • 0 comments

We noticed that many applications generate variable-sized lists for bulk modification of result sets which can lead to many separate normalized_queries although they belong to the same statement (class).

 DELETE FROM "post_taggings" WHERE post_id = ? AND tag_id IN (?)
 DELETE FROM "post_taggings" WHERE post_id = ? AND tag_id IN (?,?)
 DELETE FROM "post_taggings" WHERE post_id = ? AND tag_id IN (?,?,?)
 DELETE FROM "post_taggings" WHERE post_id = ? AND tag_id IN (?,?,?,?)
[..31 total variations of this query were recorded..]

A quick test showed that this seems to be related to the fact that the cases "has one list member" and "has many list members" are hashed to different plans though the normalized_query is recorded for the exact number of elements the first "has many list members" query used. This causes problems on databases where pg_stat_plans are reset on a regular basis.

plantest=# select normalized_query, calls from pg_stat_plans_queries where dbid=152414 and normalized_query like 'select id%';
 normalized_query | calls 
------------------+-------
(0 rows)

plantest=# select id from foo where id in (10, 100, 1000, 10000, 10000, 43, 543, 62, 745, 124);
  id   
-------
    10
    43
    62
   100
   124
   543
   745
  1000
 10000
(9 rows)

plantest=# select id from foo where id in (10, 100);
 id  
-----
  10
 100
(2 rows)

plantest=# select id from foo where id in (10, 100, 1000);
  id  
------
   10
  100
 1000
(3 rows)

plantest=# select id from foo where id in (10, 100);
 id  
-----
  10
 100
(2 rows)

plantest=# select id from foo where id in (10, 100, 1000);
  id  
------
   10
  100
 1000
(3 rows)

plantest=# select id from foo where id in (10);
 id 
----
 10
(1 row)


plantest=# select normalized_query, calls from pg_stat_plans_queries where dbid=152414 and normalized_query like 'select id%';
                        normalized_query                        | calls 
----------------------------------------------------------------+-------
 select id from foo where id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?); |     5
 select id from foo where id in (?);                            |     1
(2 rows)

plantest=# select pg_stat_plans_reset();
 pg_stat_plans_reset 
---------------------

(1 row)

plantest=# select normalized_query, calls from pg_stat_plans_queries where dbid=152414 and normalized_query like 'select id%';
 normalized_query | calls 
------------------+-------
(0 rows)

plantest=# select id from foo where id in (10, 100);
 id  
-----
  10
 100
(2 rows)

plantest=# select id from foo where id in (10, 100, 1000);
  id  
------
   10
  100
 1000
(3 rows)

plantest=# select id from foo where id in (10, 100, 1000, 10000);
  id   
-------
    10
   100
  1000
 10000
(4 rows)

plantest=# select id from foo where id in (10, 100, 1000, 10000, 10000, 43, 543, 62, 745, 124);
  id   
-------
    10
    43
    62
   100
   124
   543
   745
  1000
 10000
(9 rows)

plantest=# select id from foo where id in (10);
 id 
----
 10
(1 row)

plantest=# select id from foo where id in (10);
 id 
----
 10
(1 row)

plantest=# select normalized_query, calls from pg_stat_plans_queries where dbid=152414 and normalized_query like 'select id%';
            normalized_query            | calls 
----------------------------------------+-------
 select id from foo where id in (?, ?); |     4
 select id from foo where id in (?);    |     2
(2 rows)

plantest=# 

Would it be feasible to compact lists with more than two placeholders to just two in the normalized_query?

If not, it'd be a good if we could add this to a "Limitations" section in the documentation.

terrorobe avatar Sep 24 '13 12:09 terrorobe