foreman
foreman copied to clipboard
Fixes #35689 - Use better fact search query
Issues: #35689
Can you use Benchmark.measure to determine which implementation is faster? It would also be interesting to run "EXPLAIN ANALYZE SELECT ....` on the DB to see which one is better.
I don't have a box with lots of facts right now and it will take me a bit to get one so I can't really benchmark it, however here are the explains:
Before the changes
Foreman
> Host.search_for('facts.architecture=x86_64').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" WHERE "hosts"."type" = $1 AND ((hosts.id in (SELECT "fact_values"."host_id" FROM "fact_values" INNER JOIN "fact_names" ON "fact_names"."id" = "fact_values"."fact_name_id" WHERE (fact_names.name = 'architecture') AND (fact_values.value = 'x86_64')))) ORDER BY "hosts"."name" ASC [["type", "Host::Managed"]]
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=27266.33..27266.50 rows=67 width=3199)
Sort Key: hosts.name
-> Nested Loop (cost=27236.52..27264.30 rows=67 width=3199)
-> HashAggregate (cost=27236.23..27236.90 rows=67 width=4)
Group Key: fact_values.host_id
-> Nested Loop (cost=231.73..27236.06 rows=67 width=4)
-> Index Scan using index_fact_names_on_name_and_type on fact_names (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Bitmap Heap Scan on fact_values (cost=231.45..27226.78 rows=99 width=8)
Recheck Cond: (fact_name_id = fact_names.id)
Filter: (value = 'x86_64'::text)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names.id)
-> Index Scan using hosts_pkey on hosts (cost=0.29..0.41 rows=1 width=3199)
Index Cond: (id = fact_values.host_id)
Filter: ((type)::text = 'Host::Managed'::text)
(16 rows)
Discovery
> Host::Discovered.search_for('facts.architecture=x86_64').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" INNER JOIN "fact_values" fact_values_634325 ON (hosts.id = fact_values_634325.host_id)
INNER JOIN "fact_names" fact_names_634325 ON (fact_names_634325.id = fact_values_634325.fact_name_id) WHERE "hosts"."type" = $1 AND ((fact_names_634325."name" = 'architecture') AND ("hosts"."id" IN (SELECT "host_id" FROM "fact_values" WHERE "fact_values_634325"."value" = 'x86_64' ))) ORDER BY "hosts"."created_at" DESC NULLS LAST [["type", "Host::Discovered"]]
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=27336.40..27336.40 rows=1 width=3199)
Sort Key: hosts.created_at DESC NULLS LAST
-> Hash Join (cost=240.27..27336.39 rows=1 width=3199)
Hash Cond: (fact_values_634325.host_id = hosts.id)
Join Filter: (SubPlan 1)
-> Nested Loop (cost=234.20..27312.54 rows=6774 width=24)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_634325 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Bitmap Heap Scan on fact_values fact_values_634325 (cost=233.92..27204.26 rows=9999 width=28)
Recheck Cond: (fact_name_id = fact_names_634325.id)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names_634325.id)
-> Hash (cost=6.05..6.05 rows=1 width=3199)
-> Index Scan using index_hosts_on_type_and_location_id on hosts (cost=0.29..6.05 rows=1 width=3199)
Index Cond: ((type)::text = 'Host::Discovered'::text)
SubPlan 1
-> Result (cost=0.00..150863.99 rows=8358799 width=4)
One-Time Filter: (fact_values_634325.value = 'x86_64'::text)
-> Seq Scan on fact_values (cost=0.00..150863.99 rows=8358799 width=4)
(19 rows)
After the changes
Foreman
> Host.search_for('facts.architecture=x86_64').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" LEFT JOIN fact_values AS fact_values_128255 ON fact_values_128255.host_id = hosts.id LEFT JOIN fact_names AS fact_names_128255 ON fact_names_128255.id = fact_values_128255.fact_name_id WHERE "hosts"."type" = $1 AND ((fact_names_128255.name = 'architecture' AND (fact_values_128255.value = 'x86_64'))) ORDER BY "hosts"."name" ASC [["type", "Host::Managed"]]
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=27259.29..27259.46 rows=67 width=3199)
Sort Key: hosts.name
-> Nested Loop (cost=232.01..27257.26 rows=67 width=3199)
-> Nested Loop (cost=231.73..27236.06 rows=67 width=4)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_128255 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Bitmap Heap Scan on fact_values fact_values_128255 (cost=231.45..27226.78 rows=99 width=8)
Recheck Cond: (fact_name_id = fact_names_128255.id)
Filter: (value = 'x86_64'::text)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names_128255.id)
-> Index Scan using hosts_pkey on hosts (cost=0.29..0.32 rows=1 width=3199)
Index Cond: (id = fact_values_128255.host_id)
Filter: ((type)::text = 'Host::Managed'::text)
(14 rows)
Discovery
> Host::Discovered.search_for('facts.architecture=x86_64').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" LEFT JOIN fact_values AS fact_values_a84d6e ON fact_values_a84d6e.host_id = hosts.id LEFT JOIN fact_names AS fact_names_a84d6e ON fact_names_a84d6e.id = fact_values_a84d6e.fact_name_id WHERE "hosts"."type" = $1 AND ((fact_names_a84d6e.name = 'architecture' AND (fact_values_a84d6e.value = 'x86_64'))) ORDER BY "hosts"."created_at" DESC NULLS LAST [["type", "Host::Discovered"]]
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=22.83..22.84 rows=1 width=3199)
Sort Key: hosts.created_at DESC NULLS LAST
-> Nested Loop (cost=1.00..22.82 rows=1 width=3199)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_a84d6e (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Nested Loop (cost=0.72..14.52 rows=1 width=3203)
-> Index Scan using index_hosts_on_type_and_location_id on hosts (cost=0.29..6.05 rows=1 width=3199)
Index Cond: ((type)::text = 'Host::Discovered'::text)
-> Index Scan using index_fact_values_on_fact_name_id_and_host_id on fact_values fact_values_a84d6e (cost=0.43..8.45 rows=1 width=8)
Index Cond: ((fact_name_id = fact_names_a84d6e.id) AND (host_id = hosts.id))
Filter: (value = 'x86_64'::text)
(11 rows)
With selective joins and searching on multiple facts
Before
Foreman
> ::Host.search_for('facts.architecture=x86_64 and facts.chassistype=Other').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" WHERE "hosts"."type" = $1 AND (((hosts.id in (SELECT "fact_values"."host_id" FROM "fact_values" INNER JOIN "fact_names" ON "fact_names"."id" = "fact_values"."fact_name_id" WHERE (fact_names.name = 'architecture') AND (fact_values.value = 'x86_64'))) AND (hosts.id in (SELECT "fact_values"."host_id" FROM "fact_values" INNER JOIN "fact_names" ON "fact_names"."id" = "fact_values"."fact_name_id" WHERE (fact_names.name = 'chassistype') AND (fact_values.value = 'Other'))))) ORDER BY "hosts"."name" ASC [["type", "Host::Managed"]]
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=27682.51..27682.52 rows=1 width=3199)
Sort Key: hosts.name
-> Nested Loop Semi Join (cost=27236.51..27682.50 rows=1 width=3199)
-> Nested Loop (cost=27235.80..27246.48 rows=26 width=3203)
-> HashAggregate (cost=27235.51..27235.77 rows=26 width=4)
Group Key: fact_values_1.host_id
-> Nested Loop (cost=231.71..27235.45 rows=26 width=4)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_1 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'chassistype'::text)
-> Bitmap Heap Scan on fact_values fact_values_1 (cost=231.43..27226.76 rows=39 width=8)
Recheck Cond: (fact_name_id = fact_names_1.id)
Filter: (value = 'Other'::text)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names_1.id)
-> Index Scan using hosts_pkey on hosts (cost=0.29..0.41 rows=1 width=3199)
Index Cond: (id = fact_values_1.host_id)
Filter: ((type)::text = 'Host::Managed'::text)
-> Nested Loop (cost=0.71..16.76 rows=1 width=4)
-> Index Scan using index_fact_names_on_name_and_type on fact_names (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Index Scan using index_fact_values_on_fact_name_id_and_host_id on fact_values (cost=0.43..8.45 rows=1 width=8)
Index Cond: ((fact_name_id = fact_names.id) AND (host_id = hosts.id))
Filter: (value = 'x86_64'::text)
(23 rows)
Discovery
> ::Host::Discovered.search_for('facts.architecture=x86_64 and facts.chassistype=Other').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" INNER JOIN "fact_values" fact_values_253030 ON (hosts.id = fact_values_253030.host_id)
INNER JOIN "fact_names" fact_names_253030 ON (fact_names_253030.id = fact_values_253030.fact_name_id) INNER JOIN "fact_values" fact_values_22845 ON (hosts.id = fact_values_22845.host_id)
INNER JOIN "fact_names" fact_names_22845 ON (fact_names_22845.id = fact_values_22845.fact_name_id) WHERE "hosts"."type" = $1 AND ((fact_names_253030."name" = 'architecture') AND (fact_names_22845."name" = 'chassistype') AND (("hosts"."id" IN (SELECT "host_id" FROM "fact_values" WHERE "fact_values_253030"."value"
= 'x86_64' )) AND ("hosts"."id" IN (SELECT "host_id" FROM "fact_values" WHERE "fact_values_22845"."value" = 'Other' )))) ORDER BY "hosts"."created_at" DESC NULLS LAST [["type", "Host::Discovered"]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=54832.55..54832.56 rows=1 width=3199)
Sort Key: hosts.created_at DESC NULLS LAST
-> Hash Join (cost=27637.48..54832.54 rows=1 width=3199)
Hash Cond: (fact_values_253030.host_id = hosts.id)
Join Filter: ((SubPlan 1) AND (SubPlan 2))
-> Hash Join (cost=27631.42..54812.77 rows=5221 width=48)
Hash Cond: (fact_values_253030.host_id = fact_values_22845.host_id)
-> Nested Loop (cost=234.20..27312.54 rows=6774 width=24)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_253030 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Bitmap Heap Scan on fact_values fact_values_253030 (cost=233.92..27204.26 rows=9999 width=28)
Recheck Cond: (fact_name_id = fact_names_253030.id)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names_253030.id)
-> Hash (cost=27312.54..27312.54 rows=6774 width=24)
-> Nested Loop (cost=234.20..27312.54 rows=6774 width=24)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_22845 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'chassistype'::text)
-> Bitmap Heap Scan on fact_values fact_values_22845 (cost=233.92..27204.26 rows=9999 width=28)
Recheck Cond: (fact_name_id = fact_names_22845.id)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names_22845.id)
-> Hash (cost=6.05..6.05 rows=1 width=3199)
-> Index Scan using index_hosts_on_type_and_location_id on hosts (cost=0.29..6.05 rows=1 width=3199)
Index Cond: ((type)::text = 'Host::Discovered'::text)
SubPlan 1
-> Result (cost=0.00..150863.99 rows=8358799 width=4)
One-Time Filter: (fact_values_253030.value = 'x86_64'::text)
-> Seq Scan on fact_values (cost=0.00..150863.99 rows=8358799 width=4)
SubPlan 2
-> Result (cost=0.00..150863.99 rows=8358799 width=4)
One-Time Filter: (fact_values_22845.value = 'Other'::text)
-> Seq Scan on fact_values fact_values_1 (cost=0.00..150863.99 rows=8358799 width=4)
(33 rows)
After
Foreman
> ::Host.search_for('facts.architecture=x86_64 and facts.chassistype=Other').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" LEFT JOIN fact_values AS fact_values_55e851 ON fact_values_55e851.host_id = hosts.id LEFT JOIN fact_names AS fact_names_55e851 ON fact_names_55e851.id = fact_values_55e851.fact_name_id AND fact_names_55e851.name = 'architecture' LEFT JOIN fact_values AS fact_values_76baee ON fact_values_76baee.host_id = hosts.id LEFT JOIN fact_names AS fact_names_76baee ON fact_names_76baee.id = fact_values_76baee.fact_name_id AND fact_names_76baee.name = 'chassistype' WHERE "hosts"."type" = $1 AND (((fact_names_55e851.name = 'architecture' AND (fact_values_55e851.value = 'x86_64')) AND (fact_names_76baee.name = 'chassistype' AND (fact_values_76baee.value = 'Other')))) ORDER BY "hosts"."name" ASC [["type", "Host::Managed"]]
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=27465.15..27465.16 rows=1 width=3199)
Sort Key: hosts.name
-> Nested Loop (cost=232.71..27465.14 rows=1 width=3199)
-> Nested Loop (cost=232.42..27464.82 rows=1 width=8)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_55e851 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Nested Loop (cost=232.14..27455.54 rows=99 width=12)
-> Nested Loop (cost=231.71..27235.45 rows=26 width=4)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_76baee (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'chassistype'::text)
-> Bitmap Heap Scan on fact_values fact_values_76baee (cost=231.43..27226.76 rows=39 width=8)
Recheck Cond: (fact_name_id = fact_names_76baee.id)
Filter: (value = 'Other'::text)
-> Bitmap Index Scan on index_fact_values_on_fact_name_id_and_host_id (cost=0.00..231.43 rows=9999 width=0)
Index Cond: (fact_name_id = fact_names_76baee.id)
-> Index Scan using index_fact_values_on_fact_name_id_and_host_id on fact_values fact_values_55e851 (cost=0.43..8.45 rows=1 width=8)
Index Cond: ((fact_name_id = fact_names_55e851.id) AND (host_id = fact_values_76baee.host_id))
Filter: (value = 'x86_64'::text)
-> Index Scan using hosts_pkey on hosts (cost=0.29..0.32 rows=1 width=3199)
Index Cond: (id = fact_values_55e851.host_id)
Filter: ((type)::text = 'Host::Managed'::text)
(21 rows)
Discovery
> ::Host::Discovered.search_for('facts.architecture=x86_64 and facts.chassistype=Other').explain
=>
EXPLAIN for: SELECT "hosts".* FROM "hosts" LEFT JOIN fact_values AS fact_values_9dcff5 ON fact_values_9dcff5.host_id = hosts.id LEFT JOIN fact_names AS fact_names_9dcff5 ON fact_names_9dcff5.id = fact_values_9dcff5.fact_name_id AND fact_names_9dcff5.name = 'architecture' LEFT JOIN fact_values AS fact_values_47b143 ON fact_values_47b143.host_id = hosts.id LEFT JOIN fact_names AS fact_names_47b143 ON fact_names_47b143.id = fact_values_47b143.fact_name_id AND fact_names_47b143.name = 'chassistype' WHERE "hosts"."type" = $1 AND (((fact_names_9dcff5.name = 'architecture' AND (fact_values_9dcff5.value = 'x86_64')) AND (fact_names_47b143.name = 'chassistype' AND (fact_values_47b143.value = 'Other')))) ORDER BY "hosts"."created_at" DESC NULLS LAST [["type", "Host::Discovered"]]
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=39.60..39.61 rows=1 width=3199)
Sort Key: hosts.created_at DESC NULLS LAST
-> Nested Loop (cost=1.71..39.59 rows=1 width=3199)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_47b143 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'chassistype'::text)
-> Nested Loop (cost=1.43..31.29 rows=1 width=3203)
-> Nested Loop (cost=1.00..22.82 rows=1 width=3203)
-> Index Scan using index_fact_names_on_name_and_type on fact_names fact_names_9dcff5 (cost=0.28..8.29 rows=1 width=4)
Index Cond: ((name)::text = 'architecture'::text)
-> Nested Loop (cost=0.72..14.52 rows=1 width=3207)
-> Index Scan using index_hosts_on_type_and_location_id on hosts (cost=0.29..6.05 rows=1 width=3199)
Index Cond: ((type)::text = 'Host::Discovered'::text)
-> Index Scan using index_fact_values_on_fact_name_id_and_host_id on fact_values fact_values_9dcff5 (cost=0.43..8.45 rows=1 width=8)
Index Cond: ((fact_name_id = fact_names_9dcff5.id) AND (host_id = hosts.id))
Filter: (value = 'x86_64'::text)
-> Index Scan using index_fact_values_on_fact_name_id_and_host_id on fact_values fact_values_47b143 (cost=0.43..8.45 rows=1 width=8)
Index Cond: ((fact_name_id = fact_names_47b143.id) AND (host_id = fact_values_9dcff5.host_id))
Filter: (value = 'Other'::text)
(18 rows)