watchmen icon indicating copy to clipboard operation
watchmen copied to clipboard

feat: add API to get latest outages (WIP)

Open thomas-barthelemy opened this issue 9 years ago • 3 comments

Still a WIP but a fair taste of what I have in mind regarding #36. Little considerations to discuss out of that issue:

  • LIMIT parameter defaults to 10
  • SINCE parameter defaults to 1 hour ago
  • API added to /api/services/outages
  • Currently outages definition do not contain any reference to the service so right now I'm adding an extra service field to each representing the service id. I also considered grouping result by service id instead.
  • Currently only available for admin user (Might wanna extend that to any logged in user?)
  • I did not add the build files for now (/public/build/), will add when/if need be. What's their use right now in the repository?
  • Missing some test on result, there is quite a few private test helper related to Outages in the test-report.js, might wanna avoid duplicating those as it will be needed around
  • On the performances, so far we get max-items outages for each requested service with a Multi, then sorting and slicing the result.

thomas-barthelemy avatar Aug 28 '16 03:08 thomas-barthelemy

Thanks a lot for the contribution Thomas! I answered some of your points/questions on the code while I was doing the review.

About point number 4, having the service name and ID would be great. Grouping by service could be a great idea... maybe that could be determined by a query parameter (default grouping=off).

Number 5: yes, that would be the whole point.

Number 6: this repo ships with the built assets so you can run it out of the box if you want.

Number 7: assertions to prove the main structure for outages should be fine.

Number 8: yes, performance may be a concern if there are hundreds of services and hundreds of outages :) Some rough numbers on the current performance will the proposed solution would be great.

iloire avatar Aug 29 '16 09:08 iloire

Thanks for the answers, that helped clarify a bit more what we were going for with this. Still have some more to do and I need to review the overall change, probably somewhere this week. For the WIP:

  • Tests (Add outages and actually validate the result
  • grouping parameter handling (probably gonna have 2 parsing/structuring methods there)

for 7, my point was more that test-report.js has a private helper to add dummy outages that I will need as well in test-api-report-route.js. but it's minimal duplication (10~ lines) in tests files so I believe it to be acceptable.

for 8 I will run some numbers after remaining todos, definitely curious to see how that multi is doing.

thomas-barthelemy avatar Aug 29 '16 10:08 thomas-barthelemy

Add tests and impl grouping. The remaining question would be the meaning of the max-items when grouping, as there is no nice way to say "those should be the X first services". so in case of grouping so far the max-items is ignored, and the max-items-per-service is working as expected.

Some numbers for 100 services, 10 outages/service on average:

  • api/report/services: ~120ms
  • api/report/services/outages?since=0: ~12ms
  • api/report/services/outages?since=0&max-items=100: ~12ms (this is expected as the different is just a slice)
  • api/report/services/outages?since=0&max-items=100&max-items-per-service=100: ~110ms

overall it's pretty consistent in performance with other APIs.

thomas-barthelemy avatar Sep 06 '16 12:09 thomas-barthelemy