pinot icon indicating copy to clipboard operation
pinot copied to clipboard

Database query quota

Open shounakmk219 opened this issue 1 year ago • 1 comments

Goal

  • Ability to impose query rate limits at database level. This will consider all the queries that are made to the tables present under the database.
  • Way to specify query quota at database level

Introduce database rate limiter

Pinot decides if the query quota is reached or not with the help of QueryQuotaManager. QueryQuotaManager interface right now has an acquire(tableName) method which decides if the quota for the provided table is reached or not by returning a boolean. To allow similar mechanism for database query quota a new interface method acquireDatabase(databaseName) is introduced on QueryQuotaManager interface

Query quota provisioning

Right now as the query and storage quotas on tables are maintained at TableConfig itself, we follow the same approach for database quota configs as well.

We don't have a place to provide database specific configs as it not being a first class entity. But for this usecase and similar future usecases we need to maintain a separate DatabaseConfig znode. Assumptions on the new DatabaseConfig znode:

  • This config does not represent the logical database entity.
  • Absence of this config does not prevent users from creating tables under a database.
  • Deletion of thus config does not delete the tables under the database.

Provide default database query quota in ClusterConfig as below

"databaseMaxQueriesPerSecond" : "1000"

If we wish to provide an override for a particular database we can do so by creating a znode under PROPERTYSTORE/CONFIGS/DATABASE/<databaseName>

{
  "databaseName" : "db1",
  "quota" : {
    "maxQueriesPerSecond" : "200"
  }
}

Exposed APIs on controller to manage this:

  • POST /databases/{databaseName}/quotas?maxQueriesPerSecond=<query quota>
  • GET /databases/{databaseName}/quotas

Query quota update handling

There are 2 source based on which the database query quota is decided

  1. Default query quota provided at cluster config
  2. Override query quota provided at database config

On top of the overall quota provided by above configs, the per broker quota is also affected by the number of live brokers in the cluster. Considering all the above factors, rate limiter updates are handled as below.

  • Introduced new user defined message DatabaseConfigRefreshMessage which is sent to brokers upon the database config updates.
  • Added ClusterConfigChangeListener to ClusterChangeMediator to handle the changes made to database query quota in cluster configs.
  • Update per broker rate quota upon broker resource EV change
  • Create database rate limiter upon OFFLINE -> ONLINE transition of tables in BrokerResourceOnlineOfflineStateModel

shounakmk219 avatar Jul 04 '24 14:07 shounakmk219

We may eventually end up using Database feature. I have a couple of related questions

  • Doe the DB level quota get even distributed across all tables in that DB ?
  • On a related note, how does the math work for updating / maintaining and enforcing quotas when there is both table level and DB level quota configured ? Is there a precedent ?
  • Can we add a new table to an existing DB where that table level quota is exceeding the DB level quota ?

siddharthteotia avatar Jul 05 '24 16:07 siddharthteotia

@siddharthteotia

Does the DB level quota get even distributed across all tables in that DB ?

The db level quota is separate from table level quota. Each table has it's separate quota and individually or cumulatively it may exceed the db level quota. But queries will be blocked once the db level quota is reached. Its upto the db admin to ensure proper table level quotas are provided so that tables don’t end up contesting for the db quota share.

On a related note, how does the math work for updating / maintaining and enforcing quotas when there is both table level and DB level quota configured ? Is there a precedent ?

As mentioned above these 2 quotas are separate. For any incoming query both the table quota and db quota is checked and the query will fail if anyone of them is exceeding.

Can we add a new table to an existing DB where that table level quota is exceeding the DB level quota ?

Yes. We don't enforce the rule that sum of all table quotas under a db must not exceed db quota. Cluster/DB maintainer should take the call on what table quota should be and what db quota should be. Not necessary all tables may reach their quota at the same time. DB quota just ensures that at any point the particular db is not running more than x queries and does not affect other DBs in the cluster.

shounakmk219 avatar Jul 08 '24 04:07 shounakmk219

Codecov Report

Attention: Patch coverage is 41.15226% with 143 lines in your changes missing coverage. Please review.

Project coverage is 61.92%. Comparing base (59551e4) to head (912b4ae). Report is 819 commits behind head on master.

Files Patch % Lines
...ache/pinot/common/metadata/ZKMetadataProvider.java 0.00% 31 Missing :warning:
...er/api/resources/PinotDatabaseRestletResource.java 0.00% 24 Missing :warning:
...ntroller/helix/core/PinotHelixResourceManager.java 0.00% 22 Missing :warning:
.../helix/BrokerUserDefinedMessageHandlerFactory.java 0.00% 13 Missing :warning:
.../common/messages/DatabaseConfigRefreshMessage.java 0.00% 11 Missing :warning:
...va/org/apache/pinot/spi/config/DatabaseConfig.java 0.00% 10 Missing :warning:
...quota/HelixExternalViewBasedQueryQuotaManager.java 91.00% 4 Missing and 5 partials :warning:
...sthandler/BaseSingleStageBrokerRequestHandler.java 12.50% 6 Missing and 1 partial :warning:
...requesthandler/MultiStageBrokerRequestHandler.java 0.00% 6 Missing :warning:
...e/pinot/broker/broker/helix/BaseBrokerStarter.java 44.44% 4 Missing and 1 partial :warning:
... and 2 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13544      +/-   ##
============================================
+ Coverage     61.75%   61.92%   +0.17%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2557     +121     
  Lines        133233   141046    +7813     
  Branches      20636    21935    +1299     
============================================
+ Hits          82274    87349    +5075     
- Misses        44911    47042    +2131     
- Partials       6048     6655     +607     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration <0.01% <0.00%> (-0.01%) :arrow_down:
integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration2 0.00% <0.00%> (ø)
java-11 61.88% <41.15%> (+0.17%) :arrow_up:
java-21 61.82% <41.15%> (+0.19%) :arrow_up:
skip-bytebuffers-false 61.91% <41.15%> (+0.16%) :arrow_up:
skip-bytebuffers-true 61.81% <41.15%> (+34.08%) :arrow_up:
temurin 61.92% <41.15%> (+0.17%) :arrow_up:
unittests 61.92% <41.15%> (+0.17%) :arrow_up:
unittests1 46.36% <0.00%> (-0.53%) :arrow_down:
unittests2 27.79% <41.15%> (+0.06%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jul 08 '24 07:07 codecov-commenter

Are there tests for setting/updating qps due to helix messages for table qps ? Do you plan to tests those out as well ?

@vrajat I am not sure how to specifically test that, will try to add a new integration test for checking query quota. I have done manual testing to ensure all the different flows work. Maybe I can add a section on what all manual tests were performed.

shounakmk219 avatar Jul 25 '24 05:07 shounakmk219

The tests LGTM

vrajat avatar Jul 29 '24 16:07 vrajat