steampipe icon indicating copy to clipboard operation
steampipe copied to clipboard

Cache errors with queries with large domain lists in `net_dns_record`

Open tinder-tder opened this issue 1 year ago • 9 comments

Describe the bug Caching errors/corruption(?) causing plugin to exit or not complete queries. It seems to happen only when doing a large amount of queries for a domain list > 2k. I have attempted to throttle the plugin with the limiter block config but it doesn't not fix it.

  • turning off caching on the database options, the query seems to only work if i run the query via steampipe query but not if submitted via redash (redash -> steampipe (in service mode) its getting an error Error running query: rpc error: code = Unknown desc = IterateSet called for callId net-1699322677353 but there is no in-progress ‘set’ operation which is also logged in the plugin log)
  • turning off caching using select from steampipe_internal.meta_cache(‘off’); before running the query prevents errors/warnings in the plugin and seems to run faster when it does complete (works in redash)

sample of logs when caching is ON:

==> plugin-2023-11-07.log <==
2023-11-07 02:04:37.306 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677933: EndSet called for callId net-1699322677933 but there is no in progress set operation
2023-11-07 02:04:37.306 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677933: cache EndSet failed: EndSet called for callId net-1699322677933 but there is no in progress set operation
2023-11-07 02:04:37.326 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677107: EndSet called for callId net-1699322677107 but there is no in progress set operation
2023-11-07 02:04:37.326 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677107: cache EndSet failed: EndSet called for callId net-1699322677107 but there is no in progress set operation
2023-11-07 02:04:37.387 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677683: EndSet called for callId net-1699322677683 but there is no in progress set operation
2023-11-07 02:04:37.387 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677683: cache EndSet failed: EndSet called for callId net-1699322677683 but there is no in progress set operation
2023-11-07 02:04:37.547 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677353: streamRows execution has failed: net-1699322677353 - calling queryCache.AbortSet (IterateSet called for callId net-1699322677353 but there is no in-progress 'set' operation)
2023-11-07 02:04:37.547 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677353: queryData.streamRows returned error: IterateSet called for callId net-1699322677353 but there is no in-progress 'set' operation
2023-11-07 02:04:37.547 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677353: executeForConnection net returned error IterateSet called for callId net-1699322677353 but there is no in-progress 'set' operation, writing to CHAN
2023-11-07 02:04:37.547 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699322677353: error channel received IterateSet called for callId net-1699322677353 but there is no in-progress 'set' operation

and

2023-11-06 22:39:03.684 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699310333993: getRowsSince failed to read page 0 key net_net_dns_record_--domain-=-tXxXXXXXXXX.---domain-=-tXxXXXXXXXX.--domain-=-tXxXXXXXXXX.-type-=-A-domain-=-tXxXXXXXXXX.-type-=-A_ttl,tag,expire,target,ip,priority,serial,refresh,type,dns_server,value,minimum,retry,domain,_ctx_-1-0 after 10 retries: Entry not found (net-1699310333993)
2023-11-06 22:39:03.685 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699310333993: readRowsAsync failed to read previous rows from cache: Entry not found publisher net-1699310333993 (net-1699310333993)
2023-11-06 22:39:03.685 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699310333993: waitUntilDone - setRequestSubscriber received an error from setRequest net-1699310333993: Entry not found (net-1699310333993)
2023-11-06 22:39:03.698 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699310343795: executeForConnection recover from panic: callId: net-1699310343795 table: net_dns_record error: runtime error: slice bounds out of range [16:4]
2023-11-06 22:39:03.698 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699310343795: executeForConnection net returned error runtime error: slice bounds out of range [16:4], writing to CHAN
2023-11-06 22:39:03.698 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1699310343795: error channel received runtime error: slice bounds out of range [16:4]

Steampipe version (steampipe -v) v0.21.1

Plugin version (steampipe plugin list) 0.11.1

To reproduce

create a large list of domains to scan and run a query against the net_dns_record table with all domains in the list

	--- Scan a list of dns A, CNAME records
	select
	 domain,
	 type,
	 ip,
	 target,
	 ttl
	into public.r53_query_dump
	from
	 net.net_dns_record
	where
	 domain in (
	   --- get all A and CNAME public records
	   select
	     name
	   from
	     public.route53_dump
	   where
	     type in ('A', 'CNAME')
	     and private_zone = false
	 )
	 and type = 'A';

Expected behavior query to complete

Additional context slack thread

tinder-tder avatar Nov 09 '23 20:11 tinder-tder

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] avatar Jan 08 '24 23:01 github-actions[bot]

Hey @tinder-tder, my apologies for the late response. Could you please try the above using the latest version(0.21.3) of Steampipe?

bigdatasourav avatar Jan 22 '24 16:01 bigdatasourav

@bigdatasourav it looks like those warnings no longer showup with the latest steampipe version and latest plugin version (0.12) but now the query doesnt complete.

2024-01-22 20:32:14.156 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: doList callHydrateWithRetries (net-1705955491598) returned err read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout
2024-01-22 20:32:14.156 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: QueryData StreamError read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout (net-1705955491598)
2024-01-22 20:32:14.156 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: streamRows execution has failed: net-1705955491598 - calling queryCache.AbortSet (net: read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout)
2024-01-22 20:32:14.159 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: QueryCache AbortSet - aborting request  with error net: read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout (1 subscriber) (net-1705955491598)
2024-01-22 20:32:14.159 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: queryData.streamRows returned error: net: read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout
2024-01-22 20:32:14.159 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: executeForConnection net returned error net: read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout, writing to CHAN
2024-01-22 20:32:14.159 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: error channel received net: read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout
2024-01-22 20:32:14.180 UTC [WARN]  steampipe-plugin-net.plugin: [WARN]  1705955491598: readAndStreamAsync failed to read previous rows from cache: net: read udp 10.24.166.252:39462->1.1.1.1:53: i/o timeout publisher net-1705955491598 (net-1705955491598)

tinder-tder avatar Jan 22 '24 21:01 tinder-tder

@tinder-tder, When you said that the query doesn't complete, does this mean that query execution was not completed in the terminal, or are you getting an error in the terminal?

bigdatasourav avatar Jan 25 '24 07:01 bigdatasourav

@bigdatasourav no errors it just hangs, i have tried to change the query around and it looks like it just takes longer now. Connecting in service mode via redash it used to finish but now its timing out (client connection dies and the transaction gets rolled back). I can connect locally and run the same query via psql and it will eventually finish.

tinder-tder avatar Jan 26 '24 01:01 tinder-tder

@kaidaguerre, as discussed on the above Slack thread, could you please look into it when you get a chance?

bigdatasourav avatar Feb 06 '24 05:02 bigdatasourav

Hi @tinder-tder are you still seeing this issue?

kaidaguerre avatar Mar 21 '24 12:03 kaidaguerre

@kaidaguerre Yes, we are basically splitting up queries now into smaller chunks to get around it for now. The cache erros are not showing up in the logs. we can close this if there isnt anything we can do

tinder-tder avatar Apr 01 '24 20:04 tinder-tder

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] avatar Jun 01 '24 08:06 github-actions[bot]

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] avatar Aug 07 '24 08:08 github-actions[bot]