Finch HTTP1 pool metrics aren't removed when the pool is terminated
As usual, thanks a ton for all your work 💚
Problem
It seems to me that PoolMetrics aren't removed when the pool is terminated. That may make it seem like there are more pools than there actually are.
The following script demonstrates the problem. It:
- configures the default pools with a count of 3 and an idle timeout of 1 second
- makes requests to 3 URLs, creating 3 x 3 = 9 pools
- looks at the pool status and at the actual registry in which Finch stores the pools; these match right after the requests are made
- sleeps for 5 seconds so the idle timeout kills off 3 pools (empty pools aren't shut down right now, see #311)
- prints the pool metrics and the registry again; the pool metrics still show 3 pools per origin while the registry only has 2 per origin
finch_name = Tobi.Tries
_finch = Finch.start_link(name: finch_name, pools: %{default: [count: 3, pool_max_idle_time: to_timeout(second: 1), start_pool_metrics?: true]})
urls = ["http://www.google.com", "https://www.screenversemedia.com/", "http://hex.pm"]
urls
|> Enum.map(fn url ->
Task.async(fn -> Req.get!(url, finch: finch_name) end)
end)
|> Task.await_many()
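# Pool metrics and registry entries match right after the requests: 3 pools per origin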
urls
|> Enum.map(&Finch.get_pool_status(finch_name, &1))
|> IO.inspect()
urls
|> Enum.map(&Finch.Request.parse_url/1)
|> Enum.map(fn {s, h, p, _, _} -> {{s, h, p}, length(Registry.lookup(finch_name, {s, h, p}))} end)
|> IO.inspect()
IO.puts("\n\n\nSLEEEEEPPPYYYY TTTTIIIIMMMMEEEEEEEE\n\n\n")
# should kill the connections/pools
Process.sleep(to_timeout(second: 5))
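# After the idle timeout, the metrics below still report 3 pools per origin,
# while the registry only holds 2 per origin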
urls
|> Enum.map(&Finch.get_pool_status(finch_name, &1))
|> IO.inspect()
urls
|> Enum.map(&Finch.Request.parse_url/1)
|> Enum.map(fn {s, h, p, _, _} -> {{s, h, p}, length(Registry.lookup(finch_name, {s, h, p}))} end)
|> IO.inspect()
The important output here is:
# ...
SLEEEEEPPPYYYY TTTTIIIIMMMMEEEEEEEE
[
ok: [
%Finch.HTTP1.PoolMetrics{
pool_index: 1,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
},
%Finch.HTTP1.PoolMetrics{
pool_index: 2,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
},
%Finch.HTTP1.PoolMetrics{
pool_index: 3,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
}
],
ok: [
%Finch.HTTP1.PoolMetrics{
pool_index: 1,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
},
%Finch.HTTP1.PoolMetrics{
pool_index: 2,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
},
%Finch.HTTP1.PoolMetrics{
pool_index: 3,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
}
],
ok: [
%Finch.HTTP1.PoolMetrics{
pool_index: 1,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
},
%Finch.HTTP1.PoolMetrics{
pool_index: 2,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
},
%Finch.HTTP1.PoolMetrics{
pool_index: 3,
pool_size: 50,
available_connections: 50,
in_use_connections: 0
}
]
]
[
  {{:http, "www.google.com", 80}, 2},
  {{:https, "www.screenversemedia.com", 443}, 2},
  {{:http, "hex.pm", 80}, 2}
]
Solution
I suppose the pool metrics of the terminating pool should be removed in either the terminate/2 GenServer callback (preferably, I suppose) or NimblePool's terminate_pool/2 callback. Alternatively, some kind of "life check" process could do it. A rough sketch of the idea is below.
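For illustration, a minimal sketch of the terminate_pool/2 variant. This is not Finch's actual internals: the state fields (finch_name, shp, pool_idx) and the PoolMetrics.delete/3 function are made up for the example.

@impl NimblePool
def terminate_pool(_reason, pool_state) do
  # Hypothetical: assumes the pool state carries enough information to
  # identify this pool's metrics entry, and that a PoolMetrics.delete/3
  # cleanup function exists; neither is part of Finch today.
  %{finch_name: finch_name, shp: shp, pool_idx: pool_idx} = pool_state
  Finch.HTTP1.PoolMetrics.delete(finch_name, shp, pool_idx)
  :ok
end

Tying the cleanup to the pool's own shutdown path like this would avoid having to run a separate "life check" process.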
I'd probably be happy to implement this, although I shouldn't make too many promises these days 😅
Thanks a ton for finch!
Great catch! The proposed solution makes sense to me. Thank you for the report and suggestion, @PragTob!