broker & server api for realtime table freshness

Open priyen opened this issue 1 year ago • 2 comments

One of the potential solutions to https://github.com/apache/pinot/issues/12477 was a broker api that returns table freshness info.

This PR:

  • adds a new broker API, /debug/tableFreshness/{table}?timeoutMs=1000, that returns the minimum ingestion lag timestamp reported by servers across all consuming segments. Underneath, the broker calls a new server API, /tables/{table}/consumingSegmentsFreshnessInfo, which takes a list of segments and returns the ingestion lag timestamp for each consuming segment among them
  • the broker was chosen because it is designed to be highly available and fast, and the information needed is already there: the routing map
  • the API routes to one of the replicas based on whatever config is set, just as it does for any normal query
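
The broker-side aggregation described above can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's Java code: the function name `min_freshness_ms` is hypothetical, and each server response is assumed to be a mapping of segment name to an epoch-millisecond timestamp.

```python
from typing import Iterable, Mapping, Optional


def min_freshness_ms(server_responses: Iterable[Mapping[str, int]]) -> Optional[int]:
    """Return the smallest timestamp reported across all server responses.

    Each response is assumed to map segment name -> timestamp in epoch millis.
    Returns None when no consuming segment was reported at all.
    """
    timestamps = [ts for response in server_responses for ts in response.values()]
    return min(timestamps) if timestamps else None
```

Taking the minimum means the table is reported as only as fresh as its most lagging consuming segment.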

Example usage from broker:

❯ curl -i 'http://localhost:8000/debug/tableFreshness/airlineStats_REALTIME?timeoutMs=1000'
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 30

{"timestamp-ms":1716915412399}
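
A client could turn this response into a lag figure along these lines. This is a minimal sketch that assumes `timestamp-ms` is an epoch-millisecond timestamp (the PR text does not define it more precisely); the helper name `ingestion_lag_ms` is hypothetical.

```python
import json
import time
from typing import Optional


def ingestion_lag_ms(body: str, now_ms: Optional[int] = None) -> int:
    """Compute how far behind 'now' the reported freshness timestamp is."""
    payload = json.loads(body)
    if now_ms is None:
        # Default to the current wall-clock time in epoch millis.
        now_ms = int(time.time() * 1000)
    return now_ms - payload["timestamp-ms"]
```

Passing `now_ms` explicitly keeps the calculation deterministic, which is convenient for tests or when comparing against a fixed reference time.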

Example server usage:

❯ curl -X POST -H "Content-Type: application/json" \
-d '["airlineStats__0__0__20240528T1641Z", "airlineStats__4__0__20240528T1641Z"]' \
http://localhost:7500/tables/airlineStats/consumingSegmentsFreshnessInfo

{"airlineStats__4__0__20240528T1641Z":1716915549395,"airlineStats__0__0__20240528T1641Z":1716915552629}
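
The per-segment response can also be grouped by partition. The sketch below assumes the `table__partition__sequence__creationTime` segment naming visible in the example; the helper name `freshness_by_partition` is hypothetical and not part of the PR.

```python
from typing import Dict, Mapping


def freshness_by_partition(segment_freshness: Mapping[str, int]) -> Dict[int, int]:
    """Map partition id -> smallest timestamp among that partition's segments."""
    result: Dict[int, int] = {}
    for segment_name, timestamp_ms in segment_freshness.items():
        # Split from the right so a table name containing "__" still parses.
        _table, partition, _sequence, _created = segment_name.rsplit("__", 3)
        partition_id = int(partition)
        result[partition_id] = min(timestamp_ms, result.get(partition_id, timestamp_ms))
    return result
```

Applied to the response above, this yields one entry for partition 4 and one for partition 0.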

Possible improvements in future PRs:

  • collect this data periodically via a "FreshnessManager" of sorts cc @Jackie-Jiang

priyen avatar May 28 '24 17:05 priyen

Codecov Report

Attention: Patch coverage is 3.37079%, with 86 lines in your changes missing coverage. Please review.

Project coverage is 62.12%. Comparing base (59551e4) to head (dbf1aef). Report is 500 commits behind head on master.

Files                                                    Patch %   Lines
...che/pinot/broker/routing/BrokerRoutingManager.java    6.81%     41 Missing :warning:
...che/pinot/server/api/resources/TablesResource.java    0.00%     26 Missing :warning:
...e/pinot/broker/api/resources/PinotBrokerDebug.java    0.00%     19 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13249      +/-   ##
============================================
+ Coverage     61.75%   62.12%   +0.37%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2534      +98     
  Lines        133233   139101    +5868     
  Branches      20636    21549     +913     
============================================
+ Hits          82274    86417    +4143     
- Misses        44911    46227    +1316     
- Partials       6048     6457     +409     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration <0.01% <0.00%> (-0.01%) :arrow_down:
integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration2 0.00% <0.00%> (ø)
java-11 62.09% <3.37%> (+0.38%) :arrow_up:
java-21 61.99% <3.37%> (+0.36%) :arrow_up:
skip-bytebuffers-false 62.11% <3.37%> (+0.36%) :arrow_up:
skip-bytebuffers-true 61.97% <3.37%> (+34.24%) :arrow_up:
temurin 62.12% <3.37%> (+0.37%) :arrow_up:
unittests 62.12% <3.37%> (+0.37%) :arrow_up:
unittests1 46.68% <ø> (-0.21%) :arrow_down:
unittests2 27.79% <3.37%> (+0.06%) :arrow_up:

codecov-commenter avatar May 28 '24 17:05 codecov-commenter

@Jackie-Jiang regarding broker vs. controller: I was relying on the broker being aware of which segments can be queried and which cannot (i.e., we know some servers hosting segments are not ready because they have not yet caught up). Additionally, the broker routing map means we can cycle through the replicas based on the table's query strategy to determine freshness.

We have some reservations about adding this to the controller for the above reasons. Also, the controller is generally not a high-QPS or highly reliable component, whereas the broker is expected to be.

priyen avatar May 29 '24 20:05 priyen