Large radio_transmission table or unit ID tagging results in extremely high query time with COUNT query
Hi folks,
I spun up a trunk-player instance in recent months and noticed a notable decline in performance in Trunk Player's PostgreSQL database on certain queries. In particular, the query below is the major culprit behind the slow render time on the /scan/ sub-pages of the web front end. Changing the age cutoff makes a minimal difference in speed, and changing the max listening age makes no difference either; the cutoff below is six or so months back (I think).
SELECT COUNT(*) AS "__count"
FROM "radio_transmission"
WHERE ("radio_transmission"."talkgroup_info_id" IN (SELECT U0."id" AS Col1 FROM "radio_talkgroup" U0 INNER JOIN "radio_scanlist_talkgroups" U1 ON (U0."id" = U1."talkgroup_id") WHERE U1."scanlist_id" = 2)
AND "radio_transmission"."start_datetime" > '2019-11-07T06:47:43.917344+00:00'::timestamptz)
An EXPLAIN VERBOSE of the query returns the following plan:
QUERY PLAN
--
Aggregate (cost=7130.31..7130.32 rows=1 width=8)
  Output: count(*)
  ->  Nested Loop (cost=45.17..7015.20 rows=46043 width=0)
        ->  HashAggregate (cost=44.74..44.82 rows=8 width=8)
              Output: u0.id, u1.talkgroup_id
              Group Key: u0.id
              ->  Hash Join (cost=17.13..44.72 rows=8 width=8)
                    Output: u0.id, u1.talkgroup_id
                    Hash Cond: (u0.id = u1.talkgroup_id)
                    ->  Seq Scan on public.radio_talkgroup u0 (cost=0.00..21.72 rows=772 width=4)
                          Output: u0.id, u0.dec_id, u0.alpha_tag, u0.description, u0.slug, u0.common_name, u0.public, u0.comments, u0.system_id, u0.mode, u0.priority, u0._home_site_id, u0._service_type_id, u0.last_transmission, u0.recent_usage, u0.play_source_id
                    ->  Hash (cost=17.03..17.03 rows=8 width=4)
                          Output: u1.talkgroup_id
                          ->  Bitmap Heap Scan on public.radio_scanlist_talkgroups u1 (cost=4.34..17.03 rows=8 width=4)
                                Output: u1.talkgroup_id
                                Recheck Cond: (u1.scanlist_id = 2)
                                ->  Bitmap Index Scan on radio_scanlist_talkgroups_scanlist_id_23533624 (cost=0.00..4.34 rows=8 width=0)
                                      Index Cond: (u1.scanlist_id = 2)
        ->  Index Scan using radio_transmission_talkgroup_info_id_d3982633 on public.radio_transmission (cost=0.43..813.75 rows=5755 width=4)
              Output: radio_transmission.id, radio_transmission.start_datetime, radio_transmission.audio_file, radio_transmission.talkgroup, radio_transmission.freq, radio_transmission.emergency, radio_transmission.talkgroup_info_id, radio_transmission.play_length, radio_transmission.slug, radio_transmission.source_id, radio_transmission.system_id, radio_transmission.audio_file_url_path, radio_transmission.audio_file_type, radio_transmission.end_datetime, radio_transmission.from_default_source, radio_transmission.has_audio
              Index Cond: (radio_transmission.talkgroup_info_id = u0.id)
              Filter: (radio_transmission.start_datetime > '2019-11-07 06:47:43.917344+00'::timestamp with time zone)
I tested the database on magnetic, SSD, and M.2 drives; all of them result in the same slow render/query time. From testing, this query ends up being single-CPU bound, and unfortunately I do not see how the above query could benefit from the parallel query feature in PostgreSQL 9.x+.
I also attempted the above with bitmap heap scans disabled and enabled; it made no difference.
It would appear that radio IDs that have been tagged with text (so that instead of a displayed Radio ID of 123456 we see 'AgencyName 123456') may be the culprit of the slowdown?
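One thing I may experiment with is a composite index covering both the join key and the datetime cutoff, since the plan above walks radio_transmission_talkgroup_info_id_d3982633 and then applies the start_datetime filter row by row. A rough, untested sketch of how that could be declared (assuming a Django version with Meta.indexes; model and field names are guessed from the column names):

# Sketch only: composite index on (talkgroup_info, start_datetime) so the
# datetime cutoff can be answered from the index instead of a per-row Filter.
# Model/field names are assumptions based on the columns in the plan.
from django.db import models

class Transmission(models.Model):
    talkgroup_info = models.ForeignKey('TalkGroup', on_delete=models.CASCADE)
    start_datetime = models.DateTimeField()

    class Meta:
        indexes = [
            models.Index(
                fields=['talkgroup_info', 'start_datetime'],
                name='radio_trans_tg_start_idx',
            ),
        ]

If that looks promising, the same index could also be created directly with CREATE INDEX CONCURRENTLY while testing, to avoid locking the table.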
What if you remove the limit logic altogether?
If you just comment out these two lines (and restart the web service): https://github.com/ScanOC/trunk-player/blob/master/radio/views.py#L321-L322
that would skip all limits in the scan lists, as a test. There was a HUGE performance hit when I added the limits; I fixed it a few times to make it a little bit better, but it was never that great.
Thanks, I will poke around with that segment, along with the item above it regarding the query limits.
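For reference, my rough understanding of that test, as a sketch only (the actual code at radio/views.py#L321-L322 may look quite different):

# Illustrative only -- not the actual trunk-player code. The idea of the test,
# as I understand it, is to drop the age/limit filter so the expensive COUNT
# over the scan list goes away.
def scan_page_transmissions(scanlist):
    qs = Transmission.objects.filter(
        talkgroup_info__in=scanlist.talkgroups.all()
    ).order_by('-start_datetime')
    # The limit/cutoff that would be commented out for the test:
    # cutoff = timezone.now() - timedelta(days=MAX_LISTEN_AGE_DAYS)  # hypothetical setting
    # qs = qs.filter(start_datetime__gt=cutoff)
    return qs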