puppetdb icon indicating copy to clipboard operation
puppetdb copied to clipboard

Reduce "extract" cardinality with DISTINCT select

Open exo-cedric opened this issue 5 years ago • 4 comments

We've been experiencing very bad PostreSQL performances lately, related to facts <-> node queries such as:

SELECT
  paged_results.*
FROM
  (
    SELECT
      facts.certname, facts.environment, facts.name, facts.value, facts.path, facts.type, facts.depth, facts.value_float, facts.value_integer
    FROM
      (
        SELECT
          facts.value_integer, facts.certname, facts.path, facts.name, facts.value, facts.value_float, facts.type, facts.environment, facts.depth
        FROM
          (
            SELECT
              fs.certname, fp.path as path, fp.name as name, fp.depth as depth, fv.value_integer as value_integer, fv.value_float as value_float, COALESCE(fv.value_string, fv.value_json, cast(fv.value_integer as text), cast(fv.value_float as text), cast(fv.value_boolean as text)) as value, vt.type as type, env.name as environment
            FROM
              factsets fs
              INNER JOIN facts as f on fs.id = f.factset_id
              INNER JOIN fact_values as fv on f.fact_value_id = fv.id
              INNER JOIN fact_paths as fp on f.fact_path_id = fp.id
              INNER JOIN value_types as vt on vt.id=fv.value_type_id
              LEFT OUTER JOIN environments as env on fs.environment_id = env.id
            WHERE
              depth = 0
          ) AS facts
        WHERE
          (
            facts.certname IN (SELECT name FROM certnames WHERE deactivated IS NULL)
          )
        AND
          (
            (
              certname IN
                (
                  SELECT
                    r1.certname
                  FROM
                    (
                      SELECT
                        facts.value_integer, facts.certname, facts.path, facts.name, facts.value, facts.value_float, facts.type, facts.environment, facts.depth
                      FROM
                        (
                          SELECT
                            fs.certname, fp.path as path, fp.name as name, fp.depth as depth, fv.value_integer as value_integer, fv.value_float as value_float, COALESCE(fv.value_string, fv.value_json, cast(fv.value_integer as text), cast(fv.value_float as text), cast(fv.value_boolean as text)) as value, vt.type as type, env.name as environment
                          FROM
                            factsets fs
                            INNER JOIN facts as f on fs.id = f.factset_id
                            INNER JOIN fact_values as fv on f.fact_value_id = fv.id
                            INNER JOIN fact_paths as fp on f.fact_path_id = fp.id
                            INNER JOIN value_types as vt on vt.id=fv.value_type_id
                            LEFT OUTER JOIN environments as env on fs.environment_id = env.id
                          WHERE
                            depth = 0
                        ) AS facts
                      WHERE
                        (
                          certname IN
                            (
                              SELECT
                                r1.certname
                              FROM
                                (
                                  SELECT
                                    facts.value_integer, facts.certname, facts.path, facts.name, facts.value, facts.value_float, facts.type, facts.environment, facts.depth
                                  FROM
                                    (
                                      SELECT
                                        fs.certname, fp.path as path, fp.name as name, fp.depth as depth, fv.value_integer as value_integer, fv.value_float as value_float, COALESCE(fv.value_string, fv.value_json, cast(fv.value_integer as text), cast(fv.value_float as text), cast(fv.value_boolean as text)) as value, vt.type as type, env.name as environment
                                      FROM
                                        factsets fs
                                        INNER JOIN facts as f on fs.id = f.factset_id
                                        INNER JOIN fact_values as fv on f.fact_value_id = fv.id
                                        INNER JOIN fact_paths as fp on f.fact_path_id = fp.id
                                        INNER JOIN value_types as vt on vt.id=fv.value_type_id
                                        LEFT OUTER JOIN environments as env on fs.environment_id = env.id
                                      WHERE
                                        depth = 0
                                    ) AS facts
                                  WHERE
                                    (facts.name = $1)
                                  AND
                                    (facts.value = $2 and depth = 0)
                                ) AS r1
                            )
                        )
                      AND
                        (
                          certname IN
                            (
                              SELECT
                                r1.certname
                              FROM
                                (
                                  SELECT
                                    catalog_resources.resource, catalogs.certname, catalog_resources.tags, catalog_resources.exported, catalog_resources.line, catalog_resources.title, catalog_resources.type, catalog_resources.environment, catalog_resources.file, catalog_resources.catalog
                                  FROM
                                    (
                                      SELECT
                                        c.hash as catalog, e.name as environment, catalog_id, resource, type, title, tags, exported, file, line
                                      FROM
                                        catalog_resources AS cr,
                                        catalogs AS c
                                        LEFT OUTER JOIN environments AS e on c.environment_id = e.id
                                      WHERE
                                        c.id = cr.catalog_id
                                    ) AS catalog_resources
                                    JOIN catalogs ON catalog_resources.catalog_id = catalogs.id
                                  WHERE
                                    (catalog_resources.exported = $3)
                                  AND
                                    (catalog_resources.type = $4)
                                  AND
                                    (catalog_resources.title = $5)
                                ) AS r1
                            )
                        )
                      AND
                        (
                          certname IN
                            (
                              SELECT
                                r1.certname
                              FROM
                                (
                                  SELECT
                                    facts.value_integer, facts.certname, facts.path, facts.name, facts.value, facts.value_float, facts.type, facts.environment, facts.depth
                                  FROM
                                    (
                                      SELECT
                                        fs.certname, fp.path as path, fp.name as name, fp.depth as depth, fv.value_integer as value_integer, fv.value_float as value_float, COALESCE(fv.value_string, fv.value_json, cast(fv.value_integer as text), cast(fv.value_float as text), cast(fv.value_boolean as text)) as value, vt.type as type, env.name as environment
                                      FROM
                                        factsets fs
                                        INNER JOIN facts as f on fs.id = f.factset_id
                                        INNER JOIN fact_values as fv on f.fact_value_id = fv.id
                                        INNER JOIN fact_paths as fp on f.fact_path_id = fp.id
                                        INNER JOIN value_types as vt on vt.id=fv.value_type_id
                                        LEFT OUTER JOIN environments as env on fs.environment_id = env.id
                                      WHERE
                                        depth = 0
                                    ) AS facts
                                  WHERE
                                    (facts.name = $6)
                                  AND
                                    (facts.value = $7 and depth = 0)
                                ) AS r1
                            )
                        )
                    ) AS r1
                )
            )
          AND
            (
              (facts.name = $8)
              OR (facts.name = $9)
            )
          )
      ) AS facts
  ) AS paged_results

With correspdonding EXPLAIN ANALYZE plan:

                                                                                           QUERY PLAN                                                                                            
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=4535.36..30365.04 rows=1 width=208) (actual time=3368.252..53367.631 rows=164 loops=1)
   Join Filter: (fs.certname = fs_1.certname)
   Rows Removed by Join Filter: 129446332
   ->  Hash Semi Join  (cost=68.28..1460.61 rows=166 width=241) (actual time=1.413..135.446 rows=2824 loops=1)
         Hash Cond: (fs.certname = certnames.name)
         ->  Hash Left Join  (cost=20.62..1410.68 rows=166 width=208) (actual time=0.573..126.424 rows=2830 loops=1)
               Hash Cond: (fs.environment_id = env.id)
               ->  Nested Loop  (cost=19.24..1407.01 rows=166 width=210) (actual time=0.542..119.632 rows=2830 loops=1)
                     Join Filter: (fv.value_type_id = vt.id)
                     Rows Removed by Join Filter: 14150
                     ->  Nested Loop  (cost=19.24..1391.00 rows=166 width=212) (actual time=0.529..104.421 rows=2830 loops=1)
                           ->  Nested Loop  (cost=18.82..1297.55 rows=166 width=89) (actual time=0.521..63.425 rows=2830 loops=1)
                                 ->  Nested Loop  (cost=18.55..1246.86 rows=166 width=56) (actual time=0.513..24.865 rows=2830 loops=1)
                                       ->  Bitmap Heap Scan on fact_paths fp  (cost=8.58..15.48 rows=2 width=48) (actual time=0.024..0.030 rows=2 loops=1)
                                             Recheck Cond: (((name)::text = $8::text) OR ((name)::text = $9::text))
                                             Filter: (depth = 0)
                                             Heap Blocks: exact=1
                                             ->  BitmapOr  (cost=8.58..8.58 rows=2 width=0) (actual time=0.016..0.016 rows=0 loops=1)
                                                   ->  Bitmap Index Scan on fact_paths_name  (cost=0.00..4.29 rows=1 width=0) (actual time=0.010..0.010 rows=1 loops=1)
                                                         Index Cond: ((name)::text = $8::text)
                                                   ->  Bitmap Index Scan on fact_paths_name  (cost=0.00..4.29 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
                                                         Index Cond: ((name)::text = $9::text)
                                       ->  Bitmap Heap Scan on facts f  (cost=9.96..613.70 rows=199 width=24) (actual time=0.451..10.651 rows=1415 loops=2)
                                             Recheck Cond: (fact_path_id = fp.id)
                                             Heap Blocks: exact=2278
                                             ->  Bitmap Index Scan on facts_fact_path_id_idx  (cost=0.00..9.91 rows=199 width=0) (actual time=0.261..0.261 rows=1417 loops=2)
                                                   Index Cond: (fact_path_id = fp.id)
                                 ->  Index Scan using factsets_pkey on factsets fs  (cost=0.28..0.30 rows=1 width=49) (actual time=0.009..0.010 rows=1 loops=2830)
                                       Index Cond: (id = f.factset_id)
                           ->  Index Scan using fact_values_pkey on fact_values fv  (cost=0.41..0.55 rows=1 width=139) (actual time=0.009..0.012 rows=1 loops=2830)
                                 Index Cond: (id = f.fact_value_id)
                     ->  Materialize  (cost=0.00..1.09 rows=6 width=14) (actual time=0.000..0.002 rows=6 loops=2830)
                           ->  Seq Scan on value_types vt  (cost=0.00..1.06 rows=6 width=14) (actual time=0.003..0.004 rows=6 loops=1)
               ->  Hash  (cost=1.17..1.17 rows=17 width=14) (actual time=0.016..0.016 rows=17 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 9kB
                     ->  Seq Scan on environments env  (cost=0.00..1.17 rows=17 width=14) (actual time=0.004..0.007 rows=17 loops=1)
         ->  Hash  (cost=29.53..29.53 rows=1450 width=33) (actual time=0.823..0.823 rows=1450 loops=1)
               Buckets: 2048  Batches: 1  Memory Usage: 109kB
               ->  Seq Scan on certnames  (cost=0.00..29.53 rows=1450 width=33) (actual time=0.007..0.367 rows=1450 loops=1)
                     Filter: (deactivated IS NULL)
                     Rows Removed by Filter: 3
   ->  Materialize  (cost=4467.09..28901.92 rows=1 width=132) (actual time=0.028..10.405 rows=45838 loops=2824)
         ->  Nested Loop Semi Join  (cost=4467.09..28901.92 rows=1 width=132) (actual time=65.357..3061.500 rows=48581 loops=1)
               ->  Nested Loop Semi Join  (cost=4465.56..28777.73 rows=6 width=99) (actual time=65.284..2102.333 rows=50566 loops=1)
                     ->  Hash Semi Join  (cost=4464.45..26490.56 rows=245 width=66) (actual time=65.229..791.577 rows=92175 loops=1)
                           Hash Cond: (fs_1.certname = fs_2.certname)
                           ->  Hash Join  (cost=3771.17..24859.96 rows=356035 width=33) (actual time=54.832..663.668 rows=356065 loops=1)
                                 Hash Cond: (f_1.fact_path_id = fp_1.id)
                                 ->  Hash Join  (cost=3608.12..19801.43 rows=356035 width=41) (actual time=52.560..495.905 rows=356065 loops=1)
                                       Hash Cond: (f_1.factset_id = fs_1.id)
                                       ->  Hash Join  (cost=3551.43..14849.26 rows=356035 width=16) (actual time=51.724..340.207 rows=356065 loops=1)
                                             Hash Cond: (f_1.fact_value_id = fv_1.id)
                                             ->  Seq Scan on facts f_1  (cost=0.00..6402.35 rows=356035 width=24) (actual time=0.006..72.073 rows=356065 loops=1)
                                             ->  Hash  (cost=2815.46..2815.46 rows=58877 width=8) (actual time=51.443..51.443 rows=58894 loops=1)
                                                   Buckets: 65536  Batches: 1  Memory Usage: 2813kB
                                                   ->  Hash Join  (cost=1.14..2815.46 rows=58877 width=8) (actual time=0.026..37.389 rows=58894 loops=1)
                                                         Hash Cond: (fv_1.value_type_id = vt_1.id)
                                                         ->  Seq Scan on fact_values fv_1  (cost=0.00..2004.77 rows=58877 width=16) (actual time=0.005..12.692 rows=58894 loops=1)
                                                         ->  Hash  (cost=1.06..1.06 rows=6 width=8) (actual time=0.007..0.007 rows=6 loops=1)
                                                               Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                               ->  Seq Scan on value_types vt_1  (cost=0.00..1.06 rows=6 width=8) (actual time=0.001..0.003 rows=6 loops=1)
                                       ->  Hash  (cost=38.53..38.53 rows=1453 width=49) (actual time=0.815..0.815 rows=1453 loops=1)
                                             Buckets: 2048  Batches: 1  Memory Usage: 132kB
                                             ->  Seq Scan on factsets fs_1  (cost=0.00..38.53 rows=1453 width=49) (actual time=0.007..0.381 rows=1453 loops=1)
                                 ->  Hash  (cost=109.53..109.53 rows=4282 width=8) (actual time=2.221..2.221 rows=4271 loops=1)
                                       Buckets: 8192  Batches: 1  Memory Usage: 231kB
                                       ->  Seq Scan on fact_paths fp_1  (cost=0.00..109.53 rows=4282 width=8) (actual time=0.006..1.215 rows=4271 loops=1)
                                             Filter: (depth = 0)
                           ->  Hash  (cost=693.27..693.27 rows=1 width=33) (actual time=10.358..10.358 rows=338 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 30kB
                                 ->  Nested Loop  (cost=10.94..693.27 rows=1 width=33) (actual time=0.520..10.144 rows=338 loops=1)
                                       Join Filter: (fv_2.value_type_id = vt_2.id)
                                       Rows Removed by Join Filter: 1690
                                       ->  Nested Loop  (cost=10.94..692.13 rows=1 width=41) (actual time=0.515..9.060 rows=338 loops=1)
                                             ->  Nested Loop  (cost=10.66..691.83 rows=1 width=16) (actual time=0.506..7.912 rows=338 loops=1)
                                                   ->  Nested Loop  (cost=10.25..643.86 rows=83 width=16) (actual time=0.468..2.846 rows=1452 loops=1)
                                                         ->  Index Scan using fact_paths_name on fact_paths fp_2  (cost=0.28..8.30 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
                                                               Index Cond: ((name)::text = $1::text)
                                                               Filter: (depth = 0)
                                                         ->  Bitmap Heap Scan on facts f_2  (cost=9.96..633.57 rows=199 width=24) (actual time=0.456..2.437 rows=1452 loops=1)
                                                               Recheck Cond: (fact_path_id = fp_2.id)
                                                               Heap Blocks: exact=1106
                                                               ->  Bitmap Index Scan on facts_fact_path_id_idx  (cost=0.00..9.91 rows=199 width=0) (actual time=0.271..0.271 rows=1452 loops=1)
                                                                     Index Cond: (fact_path_id = fp_2.id)
                                                   ->  Index Scan using fact_values_pkey on fact_values fv_2  (cost=0.41..0.57 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=1452)
                                                         Index Cond: (id = f_2.fact_value_id)
                                                         Filter: (COALESCE(value_string, value_json, (value_integer)::text, (value_float)::text, (value_boolean)::text) = $2::text)
                                                         Rows Removed by Filter: 1
                                             ->  Index Scan using factsets_pkey on factsets fs_2  (cost=0.28..0.30 rows=1 width=49) (actual time=0.002..0.003 rows=1 loops=338)
                                                   Index Cond: (id = f_2.factset_id)
                                       ->  Seq Scan on value_types vt_2  (cost=0.00..1.06 rows=6 width=8) (actual time=0.001..0.001 rows=6 loops=338)
                     ->  Nested Loop  (cost=1.11..9.33 rows=1 width=33) (actual time=0.014..0.014 rows=1 loops=92175)
                           ->  Nested Loop  (cost=0.56..0.90 rows=1 width=49) (actual time=0.007..0.007 rows=1 loops=92175)
                                 ->  Index Scan using catalogs_certname_key on catalogs  (cost=0.28..0.46 rows=1 width=41) (actual time=0.003..0.003 rows=1 loops=92175)
                                       Index Cond: (certname = fs_1.certname)
                                 ->  Index Scan using catalogs_pkey on catalogs c  (cost=0.28..0.43 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=92175)
                                       Index Cond: (id = catalogs.id)
                           ->  Index Scan using catalog_resources_pkey on catalog_resources cr  (cost=0.55..8.42 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=92175)
                                 Index Cond: ((catalog_id = c.id) AND (type = $4::text) AND (title = $5::text))
                                 Filter: (NOT exported)
               ->  Nested Loop  (cost=1.53..17.95 rows=1 width=33) (actual time=0.018..0.018 rows=1 loops=50566)
                     ->  Nested Loop  (cost=1.40..17.75 rows=1 width=41) (actual time=0.015..0.015 rows=1 loops=50566)
                           ->  Index Scan using fact_paths_name on fact_paths fp_3  (cost=0.28..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=50566)
                                 Index Cond: ((name)::text = $6::text)
                                 Filter: (depth = 0)
                           ->  Nested Loop  (cost=1.11..9.44 rows=1 width=49) (actual time=0.012..0.012 rows=1 loops=50566)
                                 ->  Nested Loop  (cost=0.70..8.86 rows=1 width=49) (actual time=0.008..0.008 rows=1 loops=50566)
                                       ->  Index Scan using factsets_certname_idx on factsets fs_3  (cost=0.28..0.41 rows=1 width=49) (actual time=0.003..0.003 rows=1 loops=50566)
                                             Index Cond: (certname = fs_1.certname)
                                       ->  Index Scan using facts_factset_id_fact_path_id_fact_key on facts f_3  (cost=0.42..8.44 rows=1 width=24) (actual time=0.004..0.004 rows=1 loops=50566)
                                             Index Cond: ((factset_id = fs_3.id) AND (fact_path_id = fp_3.id))
                                 ->  Index Scan using fact_values_pkey on fact_values fv_3  (cost=0.41..0.57 rows=1 width=16) (actual time=0.003..0.003 rows=1 loops=50566)
                                       Index Cond: (id = f_3.fact_value_id)
                                       Filter: (COALESCE(value_string, value_json, (value_integer)::text, (value_float)::text, (value_boolean)::text) = $7::text)
                                       Rows Removed by Filter: 0
                     ->  Index Only Scan using value_types_pkey on value_types vt_3  (cost=0.13..0.19 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=48581)
                           Index Cond: (id = fv_3.value_type_id)
                           Heap Fetches: 48581
 Planning time: 13.054 ms
 Execution time: 53369.740 ms
(120 rows)

By replacing all SELECT r1.* clauses by SELECT DISTINCT r1.*, I have been able to significantly improve performances - almost two orders of magnitude - as shown by corresponding EXPLAIN ANALYZE:

                                                                                           QUERY PLAN                                                                                            
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=13714.97..13771.89 rows=1 width=208) (actual time=92.938..806.151 rows=164 loops=1)
   ->  Nested Loop Left Join  (cost=13714.69..13771.48 rows=1 width=241) (actual time=92.924..805.231 rows=164 loops=1)
         ->  Nested Loop  (cost=13714.55..13771.31 rows=1 width=243) (actual time=92.915..804.644 rows=164 loops=1)
               ->  Nested Loop  (cost=13714.42..13771.15 rows=1 width=245) (actual time=92.909..804.121 rows=164 loops=1)
                     ->  Nested Loop  (cost=13714.01..13770.58 rows=1 width=122) (actual time=92.899..803.095 rows=164 loops=1)
                           ->  Nested Loop  (cost=13705.00..13738.18 rows=1 width=82) (actual time=92.849..800.398 rows=82 loops=1)
                                 ->  Unique  (cost=13704.72..13729.86 rows=1 width=33) (actual time=92.833..799.694 rows=82 loops=1)
                                       ->  Merge Semi Join  (cost=13704.72..13729.86 rows=1 width=33) (actual time=92.831..788.967 rows=48199 loops=1)
                                             Merge Cond: (fs_1.certname = fs_3.certname)
                                             ->  Nested Loop  (cost=13011.84..13036.95 rows=6 width=99) (actual time=38.166..743.452 rows=50163 loops=1)
                                                   ->  Nested Loop  (cost=13011.71..13035.99 rows=6 width=107) (actual time=38.157..598.485 rows=50163 loops=1)
                                                         ->  Nested Loop  (cost=13011.43..13034.10 rows=6 width=115) (actual time=38.144..375.697 rows=50163 loops=1)
                                                               ->  Nested Loop  (cost=13011.01..13030.71 rows=6 width=115) (actual time=38.136..93.860 rows=50163 loops=1)
                                                                     ->  Merge Semi Join  (cost=13010.59..13010.80 rows=1 width=107) (actual time=38.117..38.543 rows=88 loops=1)
                                                                           Merge Cond: (fs_1.certname = catalogs.certname)
                                                                           ->  Sort  (cost=701.18..701.19 rows=1 width=74) (actual time=24.172..24.309 rows=334 loops=1)
                                                                                 Sort Key: fs_1.certname
                                                                                 Sort Method: quicksort  Memory: 72kB
                                                                                 ->  Nested Loop  (cost=693.13..701.17 rows=1 width=74) (actual time=22.322..23.684 rows=338 loops=1)
                                                                                       ->  HashAggregate  (cost=692.85..692.86 rows=1 width=33) (actual time=22.311..22.394 rows=338 loops=1)
                                                                                             Group Key: fs_2.certname
                                                                                             ->  Nested Loop  (cost=10.94..692.85 rows=1 width=33) (actual time=0.538..22.050 rows=338 loops=1)
                                                                                                   Join Filter: (fv_2.value_type_id = vt_2.id)
                                                                                                   Rows Removed by Join Filter: 1690
                                                                                                   ->  Nested Loop  (cost=10.94..691.71 rows=1 width=41) (actual time=0.533..20.899 rows=338 loops=1)
                                                                                                         ->  Nested Loop  (cost=10.66..691.41 rows=1 width=16) (actual time=0.526..19.777 rows=338 loops=1)
                                                                                                               ->  Nested Loop  (cost=10.25..643.86 rows=82 width=16) (actual time=0.487..2.475 rows=1452 loops=1)
                                                                                                                     ->  Index Scan using fact_paths_name on fact_paths fp_2  (cost=0.28..8.30 rows=1 width=8) (actual time=0.014..0.016 rows=1 loops=1)
                                                                                                                           Index Cond: ((name)::text = $1::text)
                                                                                                                           Filter: (depth = 0)
                                                                                                                     ->  Bitmap Heap Scan on facts f_2  (cost=9.96..633.57 rows=199 width=24) (actual time=0.468..2.077 rows=1452 loops=1)
                                                                                                                           Recheck Cond: (fact_path_id = fp_2.id)
                                                                                                                           Heap Blocks: exact=1106
                                                                                                                           ->  Bitmap Index Scan on facts_fact_path_id_idx  (cost=0.00..9.91 rows=199 width=0) (actual time=0.278..0.278 rows=1453 loops=1)
                                                                                                                                 Index Cond: (fact_path_id = fp_2.id)
                                                                                                               ->  Index Scan using fact_values_pkey on fact_values fv_2  (cost=0.41..0.57 rows=1 width=16) (actual time=0.003..0.011 rows=0 loops=1452)
                                                                                                                     Index Cond: (id = f_2.fact_value_id)
                                                                                                                     Filter: (COALESCE(value_string, value_json, (value_integer)::text, (value_float)::text, (value_boolean)::text) = $2::text)
                                                                                                                     Rows Removed by Filter: 1
                                                                                                         ->  Index Scan using factsets_pkey on factsets fs_2  (cost=0.28..0.30 rows=1 width=49) (actual time=0.002..0.003 rows=1 loops=338)
                                                                                                               Index Cond: (id = f_2.factset_id)
                                                                                                   ->  Seq Scan on value_types vt_2  (cost=0.00..1.06 rows=6 width=8) (actual time=0.001..0.002 rows=6 loops=338)
                                                                                       ->  Index Scan using factsets_certname_idx on factsets fs_1  (cost=0.28..8.29 rows=1 width=49) (actual time=0.003..0.003 rows=1 loops=338)
                                                                                             Index Cond: (certname = fs_2.certname)
                                                                           ->  Sort  (cost=12309.41..12309.50 rows=38 width=33) (actual time=13.856..13.985 rows=434 loops=1)
                                                                                 Sort Key: catalogs.certname
                                                                                 Sort Method: quicksort  Memory: 58kB
                                                                                 ->  HashAggregate  (cost=12307.65..12308.03 rows=38 width=33) (actual time=13.443..13.530 rows=434 loops=1)
                                                                                       Group Key: catalogs.certname
                                                                                       ->  Nested Loop  (cost=0.83..12307.55 rows=38 width=33) (actual time=0.041..13.094 rows=434 loops=1)
                                                                                             ->  Nested Loop  (cost=0.55..12290.94 rows=38 width=16) (actual time=0.034..11.461 rows=434 loops=1)
                                                                                                   ->  Seq Scan on catalogs c  (cost=0.00..53.52 rows=1452 width=12) (actual time=0.005..0.333 rows=1452 loops=1)
                                                                                                   ->  Index Scan using catalog_resources_pkey on catalog_resources cr  (cost=0.55..8.42 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=1452)
                                                                                                         Index Cond: ((catalog_id = c.id) AND (type = $4::text) AND (title = $5::text))
                                                                                                         Filter: (NOT exported)
                                                                                             ->  Index Scan using catalogs_pkey on catalogs  (cost=0.28..0.43 rows=1 width=41) (actual time=0.003..0.003 rows=1 loops=434)
                                                                                                   Index Cond: (id = c.id)
                                                                     ->  Index Scan using facts_factset_id_fact_path_id_fact_key on facts f_1  (cost=0.42..17.43 rows=248 width=24) (actual time=0.010..0.445 rows=570 loops=88)
                                                                           Index Cond: (factset_id = fs_1.id)
                                                               ->  Index Scan using fact_values_pkey on fact_values fv_1  (cost=0.41..0.55 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=50163)
                                                                     Index Cond: (id = f_1.fact_value_id)
                                                         ->  Index Scan using fact_paths_pkey on fact_paths fp_1  (cost=0.28..0.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=50163)
                                                               Index Cond: (id = f_1.fact_path_id)
                                                               Filter: (depth = 0)
                                                   ->  Index Only Scan using value_types_pkey on value_types vt_1  (cost=0.13..0.15 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=50163)
                                                         Index Cond: (id = fv_1.value_type_id)
                                                         Heap Fetches: 50163
                                             ->  Sort  (cost=692.88..692.88 rows=1 width=33) (actual time=22.500..22.703 rows=1211 loops=1)
                                                   Sort Key: fs_3.certname
                                                   Sort Method: quicksort  Memory: 148kB
                                                   ->  HashAggregate  (cost=692.85..692.86 rows=1 width=33) (actual time=21.199..21.438 rows=1270 loops=1)
                                                         Group Key: fs_3.certname
                                                         ->  Nested Loop  (cost=10.94..692.85 rows=1 width=33) (actual time=0.500..20.291 rows=1270 loops=1)
                                                               Join Filter: (fv_3.value_type_id = vt_3.id)
                                                               Rows Removed by Join Filter: 6350
                                                               ->  Nested Loop  (cost=10.94..691.71 rows=1 width=41) (actual time=0.494..15.919 rows=1270 loops=1)
                                                                     ->  Nested Loop  (cost=10.66..691.41 rows=1 width=16) (actual time=0.484..9.746 rows=1270 loops=1)
                                                                           ->  Nested Loop  (cost=10.25..643.86 rows=82 width=16) (actual time=0.473..3.028 rows=1452 loops=1)
                                                                                 ->  Index Scan using fact_paths_name on fact_paths fp_3  (cost=0.28..8.30 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
                                                                                       Index Cond: ((name)::text = $6::text)
                                                                                       Filter: (depth = 0)
                                                                                 ->  Bitmap Heap Scan on facts f_3  (cost=9.96..633.57 rows=199 width=24) (actual time=0.457..2.626 rows=1452 loops=1)
                                                                                       Recheck Cond: (fact_path_id = fp_3.id)
                                                                                       Heap Blocks: exact=1103
                                                                                       ->  Bitmap Index Scan on facts_fact_path_id_idx  (cost=0.00..9.91 rows=199 width=0) (actual time=0.274..0.274 rows=1452 loops=1)
                                                                                             Index Cond: (fact_path_id = fp_3.id)
                                                                           ->  Index Scan using fact_values_pkey on fact_values fv_3  (cost=0.41..0.57 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=1452)
                                                                                 Index Cond: (id = f_3.fact_value_id)
                                                                                 Filter: (COALESCE(value_string, value_json, (value_integer)::text, (value_float)::text, (value_boolean)::text) = $7::text)
                                                                                 Rows Removed by Filter: 0
                                                                     ->  Index Scan using factsets_pkey on factsets fs_3  (cost=0.28..0.30 rows=1 width=49) (actual time=0.004..0.004 rows=1 loops=1270)
                                                                           Index Cond: (id = f_3.factset_id)
                                                               ->  Seq Scan on value_types vt_3  (cost=0.00..1.06 rows=6 width=8) (actual time=0.001..0.002 rows=6 loops=1270)
                                 ->  Index Scan using factsets_certname_idx on factsets fs  (cost=0.28..8.29 rows=1 width=49) (actual time=0.005..0.006 rows=1 loops=82)
                                       Index Cond: (certname = fs_1.certname)
                           ->  Nested Loop  (cost=9.00..32.39 rows=2 width=56) (actual time=0.021..0.028 rows=2 loops=82)
                                 ->  Bitmap Heap Scan on fact_paths fp  (cost=8.58..15.48 rows=2 width=48) (actual time=0.009..0.010 rows=2 loops=82)
                                       Recheck Cond: (((name)::text = $8::text) OR ((name)::text = $9::text))
                                       Filter: (depth = 0)
                                       Heap Blocks: exact=82
                                       ->  BitmapOr  (cost=8.58..8.58 rows=2 width=0) (actual time=0.007..0.007 rows=0 loops=82)
                                             ->  Bitmap Index Scan on fact_paths_name  (cost=0.00..4.29 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=82)
                                                   Index Cond: ((name)::text = $8::text)
                                             ->  Bitmap Index Scan on fact_paths_name  (cost=0.00..4.29 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=82)
                                                   Index Cond: ((name)::text = $9::text)
                                 ->  Index Scan using facts_factset_id_fact_path_id_fact_key on facts f  (cost=0.42..8.44 rows=1 width=24) (actual time=0.007..0.007 rows=1 loops=164)
                                       Index Cond: ((factset_id = fs.id) AND (fact_path_id = fp.id))
                     ->  Index Scan using fact_values_pkey on fact_values fv  (cost=0.41..0.55 rows=1 width=139) (actual time=0.004..0.005 rows=1 loops=164)
                           Index Cond: (id = f.fact_value_id)
               ->  Index Scan using value_types_pkey on value_types vt  (cost=0.13..0.15 rows=1 width=14) (actual time=0.002..0.002 rows=1 loops=164)
                     Index Cond: (id = fv.value_type_id)
         ->  Index Scan using environments_pkey on environments env  (cost=0.14..0.16 rows=1 width=14) (actual time=0.002..0.002 rows=1 loops=164)
               Index Cond: (fs.environment_id = id)
   ->  Index Scan using certnames_pkey on certnames  (cost=0.28..0.39 rows=1 width=33) (actual time=0.004..0.004 rows=1 loops=164)
         Index Cond: (name = fs.certname)
         Filter: (deactivated IS NULL)
 Planning time: 10.041 ms
 Execution time: 806.752 ms
(118 rows)

I have tracked this to the compile-extract function, which I believe would benefit from the following single-liner PR.

Does it make sense ?

exo-cedric avatar Mar 03 '20 16:03 exo-cedric

Can one of the admins verify this patch?

puppetlabs-jenkins avatar Mar 03 '20 16:03 puppetlabs-jenkins

Waiting for CLA signature by @exo-cedric

@exo-cedric - We require a Contributor License Agreement (CLA) for people who contribute to Puppet, but we have an easy click-through license with instructions, which is available at https://cla.puppet.com/

Note: if your contribution is trivial and you think it may be exempt from the CLA, please post a short reply to this comment with details. http://docs.puppet.com/community/trivial_patch_exemption.html

puppetcla avatar Mar 04 '20 05:03 puppetcla

Note: if your contribution is trivial and you think it may be exempt from the CLA, please post a short reply to this comment with details. http://docs.puppet.com/community/trivial_patch_exemption.html

According to https://puppet.com/community/trivial-patch-exemption-policy/ (mark corrected link), I believe this one-liner is trivial, as per:

Puppet Definition of Trivial: A trivial patch is a pull request that does not contain creative work. As a rule of thumb, changes are trivial patches if they:
- are fewer than 10 lines, and
- introduce no new functionality.

(Is it ?)

exo-cedric avatar Mar 04 '20 08:03 exo-cedric

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Feb 24 '21 17:02 CLAassistant