mongodb-d4

Performance decreases with denormalization

Open fangjian601 opened this issue 10 years ago • 0 comments

Here are two designs for the TATP benchmark:

Design 1

[00] ACCESS_INFO
  denorm:    None
  shardKeys: [u's_id']
[01] CALL_FORWARDING
  denorm:    None
  shardKeys: [u's_id']
[02] SPECIAL_FACILITY
  denorm:    None
  shardKeys: [u's_id']
[03] SUBSCRIBER
  denorm:    None
  shardKeys: [u's_id']

Design 2

[00] ACCESS_INFO
  denorm:    None
  shardKeys: [u's_id', u'ai_type']
[01] CALL_FORWARDING
  denorm:    SPECIAL_FACILITY
  shardKeys: []
[02] SPECIAL_FACILITY
  denorm:    SUBSCRIBER
  shardKeys: []
[03] SUBSCRIBER
  denorm:    None
  shardKeys: [u's_id']
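
To make the difference between the two designs concrete, here is a minimal sketch of how the documents might look. Field names follow the TATP schema, but the embedded key names (`SPECIAL_FACILITY`, `CALL_FORWARDING` as nested arrays) and both helper functions are assumptions for illustration, not the layout or code that mongodb-d4 actually produces. Plain Python dicts stand in for collections.

```python
# Hypothetical document shapes; field names follow the TATP schema, while the
# embedded key names below are assumptions, not mongodb-d4's actual output.

# Design 1: each table is its own collection, joined by s_id (and sf_type).
design1 = {
    "SUBSCRIBER": [{"s_id": 1, "sub_nbr": "000000000000001"}],
    "SPECIAL_FACILITY": [{"s_id": 1, "sf_type": 2, "is_active": 1}],
    "CALL_FORWARDING": [
        {"s_id": 1, "sf_type": 2, "start_time": 0, "end_time": 8},
        {"s_id": 1, "sf_type": 2, "start_time": 8, "end_time": 16},
    ],
}

# Design 2: CALL_FORWARDING nested in SPECIAL_FACILITY, nested in SUBSCRIBER.
design2 = {
    "SUBSCRIBER": [
        {
            "s_id": 1,
            "sub_nbr": "000000000000001",
            "SPECIAL_FACILITY": [
                {
                    "sf_type": 2,
                    "is_active": 1,
                    "CALL_FORWARDING": [
                        {"start_time": 0, "end_time": 8},
                        {"start_time": 8, "end_time": 16},
                    ],
                }
            ],
        }
    ],
}


def call_forwarding_design1(db, s_id, sf_type):
    """Read matching rows straight from the CALL_FORWARDING collection."""
    return [
        (row["start_time"], row["end_time"])
        for row in db["CALL_FORWARDING"]
        if row["s_id"] == s_id and row["sf_type"] == sf_type
    ]


def call_forwarding_design2(db, s_id, sf_type):
    """Fetch the whole SUBSCRIBER document, then walk two levels of nesting."""
    return [
        (cf["start_time"], cf["end_time"])
        for sub in db["SUBSCRIBER"]
        if sub["s_id"] == s_id
        for sf in sub["SPECIAL_FACILITY"]
        if sf["sf_type"] == sf_type
        for cf in sf["CALL_FORWARDING"]
    ]
```

Under this assumed shape, every read of SPECIAL_FACILITY or CALL_FORWARDING data in Design 2 goes through a SUBSCRIBER document, so those two collections would no longer be queried on their own.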

Our current cost model assigns Design 2 a lower cost than Design 1. However, the replay framework reports much higher throughput for Design 1 than for Design 2 (8286.57 op/s versus 950.73 op/s in total). The replay framework's results are below:

Design 1

--------------------------------------------------------------------------
                    Executed          Total Time (ms)   Rate              
  Replay Queries    993561 - 100.0%   1379015.66553     11288.49 op/s     
--------------------------------------------------------------------------
  TOTAL             993561            119900.106192     8286.57 op/s      
==========================================================================
Latency Report
--------------------------------------------------------------------------
Queries(%)    Latency(ms)
10.0%         0.1020     
20.0%         0.5219     
50.0%         0.7350     
80.0%         1.1170     
90.0%         1.5299     
99.9%         22.9671    
--------------------------------------------------------------------------
=============================================================================
Top 20 Slowest Operations
-----------------------------------------------------------------------------
#     Latency(ms)    Session Id    Operation Id    Type      Collection      
0     855.7951       123           0               $query    SUBSCRIBER      
1     855.0751       530           1               $query    SUBSCRIBER      
2     854.6751       1715          1               $query    SUBSCRIBER      
3     854.6152       1853          0               $query    SUBSCRIBER      
4     854.6140       1216          0               $query    SPECIAL_FACILITY
5     854.5260       1601          0               $query    ACCESS_INFO     
6     854.4500       1148          1               $query    SUBSCRIBER      
7     854.3968       1548          0               $query    SUBSCRIBER      
8     854.3952       364           0               $query    ACCESS_INFO     
9     854.2671       116           0               $query    SUBSCRIBER      
10    854.2390       1139          0               $query    ACCESS_INFO     
11    854.2109       1651          0               $query    SUBSCRIBER      
12    854.1081       1891          1               $query    CALL_FORWARDING 
13    853.9870       1890          0               $query    SPECIAL_FACILITY
14    853.9579       1262          0               $query    SPECIAL_FACILITY
15    853.9290       1196          0               $query    ACCESS_INFO     
16    853.8289       1536          0               $query    ACCESS_INFO     
17    853.8220       781           0               $query    SUBSCRIBER      
18    853.8101       882           0               $query    ACCESS_INFO     
19    853.7490       71            0               $query    SUBSCRIBER      
-----------------------------------------------------------------------------

Design 2

--------------------------------------------------------------------------
                    Executed          Total Time (ms)   Rate              
  Replay Queries    114321 - 100.0%   1436510.32448     1299.82 op/s      
--------------------------------------------------------------------------
  TOTAL             114321            120245.948076     950.73 op/s       
==========================================================================
Latency Report
--------------------------------------------------------------------------
Queries(%)    Latency(ms)
10.0%         0.0801     
20.0%         0.4001     
50.0%         0.5720     
80.0%         0.6790     
90.0%         0.7560     
99.9%         515.1670   
--------------------------------------------------------------------------
==========================================================================
Top 20 Slowest Operations
--------------------------------------------------------------------------
#     Latency(ms)    Session Id    Operation Id    Type      Collection 
0     556.4971       157           0               $query    SUBSCRIBER 
1     556.2160       157           0               $query    SUBSCRIBER 
2     551.5430       270           0               $query    ACCESS_INFO
3     545.9940       725           0               $query    SUBSCRIBER 
4     545.8341       1170          0               $query    ACCESS_INFO
5     545.4969       117           0               $query    SUBSCRIBER 
6     545.3651       1170          0               $query    ACCESS_INFO
7     544.0409       270           0               $query    ACCESS_INFO
8     543.6971       997           0               $query    SUBSCRIBER 
9     543.6630       16            0               $query    SUBSCRIBER 
10    540.9229       40            0               $query    ACCESS_INFO
11    540.8962       157           0               $query    SUBSCRIBER 
12    540.5340       40            0               $query    ACCESS_INFO
13    540.5271       40            0               $query    ACCESS_INFO
14    540.2842       187           0               $query    SUBSCRIBER 
15    540.1580       187           0               $query    SUBSCRIBER 
16    540.0901       187           0               $query    SUBSCRIBER 
17    539.8049       40            0               $query    ACCESS_INFO
18    539.7899       40            0               $query    ACCESS_INFO
19    539.6461       40            0               $query    ACCESS_INFO
--------------------------------------------------------------------------
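
The TOTAL rows of the two reports can be cross-checked with a few lines of arithmetic. This is only a sketch: the figures are copied from the reports above, and the `runs` dictionary and `ops_per_second` helper are names introduced here for illustration.

```python
# Figures copied from the TOTAL rows of the two replay reports.
runs = {
    "Design 1": {"executed": 993561, "total_ms": 119900.106192, "rate": 8286.57},
    "Design 2": {"executed": 114321, "total_ms": 120245.948076, "rate": 950.73},
}


def ops_per_second(executed, total_ms):
    """Throughput as operations divided by elapsed time in seconds."""
    return executed / (total_ms / 1000.0)


# Recompute each rate from the executed count and the elapsed time.
derived = {
    name: ops_per_second(r["executed"], r["total_ms"]) for name, r in runs.items()
}

# How many times faster Design 1 ran, using the reported rates.
ratio = runs["Design 1"]["rate"] / runs["Design 2"]["rate"]

for name, r in runs.items():
    print(f"{name}: reported {r['rate']:.2f} op/s, derived {derived[name]:.2f} op/s")
print(f"Design 1 / Design 2 throughput ratio: {ratio:.2f}")
```

Both runs lasted about 120 seconds, so the gap in throughput comes entirely from the number of operations completed in that window: Design 1 finished roughly 8.7 times as many as Design 2.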

Therefore, we need to figure out why denormalization decreases throughput so much, and then adjust our cost models accordingly.
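
One possible starting point is to compare the mean latency per query with the median from the latency reports. The sketch below assumes that the "Total Time (ms)" value on the Replay Queries row is the sum of individual query latencies across all workers; that interpretation is an assumption and is not confirmed by the report itself. The `reports` dictionary and `mean_latency_ms` helper are hypothetical names.

```python
# Figures copied from the "Replay Queries" rows and the 50.0% latency rows.
# Assumption: total_ms is the sum of per-query latencies over all workers.
reports = {
    "Design 1": {"executed": 993561, "total_ms": 1379015.66553, "median_ms": 0.7350},
    "Design 2": {"executed": 114321, "total_ms": 1436510.32448, "median_ms": 0.5720},
}


def mean_latency_ms(executed, total_ms):
    """Average time per query under the summed-latency assumption."""
    return total_ms / executed


means = {
    name: mean_latency_ms(r["executed"], r["total_ms"]) for name, r in reports.items()
}

for name, r in reports.items():
    skew = means[name] / r["median_ms"]
    print(
        f"{name}: mean {means[name]:.3f} ms, "
        f"median {r['median_ms']:.3f} ms, mean/median {skew:.1f}"
    )
```

Under that assumption, the mean comes out at roughly 1.4 ms for Design 1 and roughly 12.6 ms for Design 2, even though Design 2 has the lower median (0.5720 ms versus 0.7350 ms). That would point at a small fraction of very slow queries, consistent with the 99.9% latency of 515.1670 ms for Design 2 versus 22.9671 ms for Design 1, rather than a uniform slowdown.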

fangjian601 · Dec 16 '14 01:12