lorawan-stack
Gateway status API for multiple gateways
Summary
The current GS API allows getting the status for a single gateway via: GET /api/v3/gs/gateways/{gateway_id}/connection/stats
If you need to obtain the statuses of all your gateways, you have to call this endpoint many (hundreds of) times. The per-request overhead of an HTTP GET makes this process very slow.
It would be nice if this API worked just like the Gateway Registry List API (/api/v3/gateways) which returns a JSON array of all gateways the user has access to.
Why do we need this?
Lower overhead.
What is already there? What do you see now?
An API to obtain the status of a single gateway.
What is missing? What do you want to see?
An API to obtain the statuses of all gateways.
Environment
TTS v3.16.2
How do you propose to implement this?
Add extra API endpoint to GS.
How do you propose to test this?
Code test cases, curl command.
Can you do this yourself and submit a Pull Request?
I do not think so.
My current workflow is as follows:
- An admin user creates a personal API Key that has the permissions: View gateway info, location, status, list gateways
- Use the API key to get a list of all gateways from /api/v3/gateways
- For each gateway obtained, call /api/v3/gs/gateways/{gateway_id}/connection/stats to get the statuses.
As can be seen, this is an iterative process that can take a long time, as N+1 GET requests are made to TTS. It would be better if the same info could be obtained with one or two GET requests (one to the Gateway Registry for metadata, one to GS for status).
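A minimal Python sketch of the workflow above, to make the N+1 pattern concrete. The two paths are the ones named in this issue. The host `tts.example.com`, the `Bearer` authorization header, and the assumed response shape (a `gateways` array whose items carry `ids.gateway_id`) are assumptions for illustration, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# Hypothetical host; replace with your own TTS deployment.
BASE_URL = "https://tts.example.com/api/v3"


def make_get_json(api_key: str) -> Callable[[str], Dict[str, Any]]:
    """Return a function that GETs a URL and decodes the JSON body.

    Assumes the API key is sent as a Bearer token.
    """
    def get_json(url: str) -> Dict[str, Any]:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return get_json


def collect_statuses(
    get_json: Callable[[str], Dict[str, Any]]
) -> Dict[str, Dict[str, Any]]:
    """Fetch the gateway list, then one stats document per gateway (N+1 GETs)."""
    # One request to the Gateway Registry (pagination is not handled here).
    listing = get_json(f"{BASE_URL}/gateways")
    stats: Dict[str, Dict[str, Any]] = {}
    for gateway in listing.get("gateways", []):
        # Assumed response shape: {"gateways": [{"ids": {"gateway_id": ...}}]}
        gateway_id = gateway["ids"]["gateway_id"]
        # One further request to the Gateway Server for every gateway.
        stats[gateway_id] = get_json(
            f"{BASE_URL}/gs/gateways/{gateway_id}/connection/stats"
        )
    return stats
```

Calling `collect_statuses(make_get_json(api_key))` against a deployment with N gateways issues N+1 sequential requests, which is the overhead this issue asks to avoid.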
Third party systems might want to know the near-realtime statuses of gateways, which will mean this API would be called often.
Thanks @jpmeijers for filing this issue. We discussed this internally, and here is our reasoning for not adding an API to fetch multiple statuses:
- This won't work well with our current rate-limiting and would need some additional things to be implemented there.
- Since stats are localized to a cluster, calls for multiple gateways need to be collected from various clusters which is not easy to implement.
- With our planned events system, we will push notifications for gateway actions instead of users needing to poll statuses.
So, as of now we don't plan to implement this.
For the record, https://github.com/TheThingsNetwork/lorawan-stack/pull/5536 probably handles most of this. It still requires a list of gateway identifiers, but it does allow batch retrieval in order to lower the number of parallel requests.
I saw a mention of this in the changelog and thought it might solve this issue. When I get a chance I'll test and confirm.
According to the documentation I should be able to fetch up to 100 statuses per request. However, I'm getting a timeout from the gRPC gateway when I do this. Lowering the number of statuses to 50 works. At least that decreases the number of API calls from 200 to 4.
Edit: It seems like I'm sometimes getting a timeout when requesting 50 statuses too. I have lowered the number of gateways per request to 10, which helps, but not as much as I had hoped.
{
  "details": [
    {
      "message_format": "upstream request timeout",
      "attributes": {
        "flags": "UT",
        "cluster": "au1.cloud.thethings.industries"
      },
      "@type": "type.googleapis.com/ttn.lorawan.v3.ErrorDetails",
      "name": "504_upstream_response_timeout",
      "namespace": "proxy",
      "correlation_id": "8396e01b-74fe-42bc-8a47-2173aa1e4a1b"
    }
  ],
  "message": "upstream request timeout"
}
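A client-side sketch of the batching described above, in Python: split the gateway identifiers into fixed-size batches and, if a batch times out, halve it and retry. `fetch_batch` is a hypothetical callable standing in for whatever performs the actual batch request (its endpoint and payload are not shown in this thread). It is assumed to return a dict keyed by gateway ID and to raise `TimeoutError` on a 504 like the one above.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical signature: takes a list of gateway IDs, returns stats per ID.
FetchBatch = Callable[[List[str]], Dict[str, Any]]


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_with_split(batch: List[str], fetch_batch: FetchBatch) -> Dict[str, Any]:
    """Fetch one batch; on timeout, split it in half and retry each half."""
    try:
        return dict(fetch_batch(batch))
    except TimeoutError:
        if len(batch) == 1:
            raise  # cannot split further; let the caller decide what to do
        middle = len(batch) // 2
        result = fetch_with_split(batch[:middle], fetch_batch)
        result.update(fetch_with_split(batch[middle:], fetch_batch))
        return result


def fetch_all(
    gateway_ids: List[str], fetch_batch: FetchBatch, batch_size: int = 10
) -> Dict[str, Any]:
    """Collect stats for all gateways using batches of `batch_size` IDs."""
    stats: Dict[str, Any] = {}
    for batch in chunked(gateway_ids, batch_size):
        stats.update(fetch_with_split(batch, fetch_batch))
    return stats
```

With 200 gateways and `batch_size=50` this starts with the 4 requests mentioned above and only falls back to smaller batches when a request times out, at the cost of repeating the timed-out work.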
I think the issue is now fixed and you can use the batch connection statistics endpoint. Is there anything else here?