matrixone icon indicating copy to clipboard operation
matrixone copied to clipboard

[Feature Request]: "master" index for sparse/many column indices

Open fengttt opened this issue 1 year ago • 4 comments

Is there an existing issue for the same feature request?

  • [X] I have checked the existing issues.

Is your feature request related to a problem?

create index on t (col1, col2, col3, ..., col1000) using master index;

Describe the feature you'd like

create an index for many, sparse columns.

Describe implementation you've considered

Build the index, make sure it works with all DML.

Optimize query, using index when possible.

Documentation, Adoption, Use Case, Migration Strategy

No response

Additional information

No response

fengttt avatar Jan 25 '24 09:01 fengttt

The PR is merged and is ready for testing.

The Doc: 【企微文档】Master Index https://doc.weixin.qq.com/doc/w3_AawAhAZmAIkF47477sETGag09XnLO?scode=AJsA6gc3AA8sb1w99TAawAhAZmAIk

Design Doc: https://github.com/matrixorigin/docs/pull/271

CC: @yangj1211

arjunsk avatar Feb 02 '24 19:02 arjunsk

Updated the document to include SELECT query usage:

https://doc.weixin.qq.com/doc/w3_AawAhAZmAIkF47477sETGag09XnLO?scode=AJsA6gc3AA8sb1w99TAawAhAZmAIk

CC: @yangj1211

arjunsk avatar Mar 18 '24 18:03 arjunsk

The master index is now ready. The performance is comparable to existing secondary index.

1 Filter Query QPS

  • No index: 500
  • 1 Master: 2820
  • 100 Secondary: 2958
Query Plan
-- Master Index

mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a48 = 'b92k7dWP5t';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                   |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                             |
|   Analyze: timeConsumed=0ms waitTime=4ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                            |
|   ->  Join                                                                                                                                          |
|         Analyze: timeConsumed=1ms waitTime=29ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                     |
|         Join Type: INDEX                                                                                                                            |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col)                                        |
|         Runtime Filter Build: #[-1,0]                                                                                                               |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                        |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|               Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                                 |
|               Block Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                           |
|               Runtime Filter Probe: tbl.a100                                                                                                        |
|         ->  Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN]                                                  |
|               Analyze: timeConsumed=1ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F47 Fb92k7dWP5t ')            |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F47 Fb92k7dWP5t ')      |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)


-- Secondary Index

mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a48 = 'b92k7dWP5t';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| TP QURERY PLAN                                                                                                                                      |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                             |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                            |
|   ->  Join                                                                                                                                          |
|         Analyze: timeConsumed=0ms waitTime=1ms inputRows=1 outputRows=1 InputSize=24bytes OutputSize=24bytes MemorySize=0bytes                      |
|         Join Type: INDEX                                                                                                                            |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_pri_col)                                        |
|         Runtime Filter Build: #[-1,0]                                                                                                               |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                        |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|               Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                                 |
|               Block Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                           |
|               Runtime Filter Probe: tbl.a100                                                                                                        |
|         ->  Table Scan on a.__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067 [ForceOneCN]                                                  |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=74bytes OutputSize=24bytes MemorySize=75bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_idx_col, 'Fb92k7dWP5t ')                 |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_idx_col, 'Fb92k7dWP5t ')           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)



2 Filter Query QPS

  • No Index:
  • 1 Master: 1335 (right now we use both the filters using inner join)
  • 100 Secondary: 1725 (we only make use of one secondary index table)
Query Plan
-- master index
mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a89 = '40u4JSeGvz' AND tbl.a31 = '3X5ZOcJbol';
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                                                                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                                                                      |
|   Analyze: timeConsumed=0ms waitTime=36ms inputRows=3 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=0bytes                                                                                    |
|   ->  Join                                                                                                                                                                                                   |
|         Analyze: timeConsumed=11ms waitTime=167ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                                                                            |
|         Join Type: INDEX                                                                                                                                                                                     |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col)                                                                                                 |
|         Runtime Filter Build: #[-1,0]                                                                                                                                                                        |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                                                                                 |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes                                                          |
|               Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                                                                                |
|               Block Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                                                                          |
|               Runtime Filter Probe: tbl.a100                                                                                                                                                                 |
|         ->  Join                                                                                                                                                                                             |
|               Analyze: timeConsumed=11ms probe_time=[total=0ms,min=0ms,max=0ms,dop=10] build_time=[11ms] waitTime=145ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=361187bytes |
|               Join Type: INNER                                                                                                                                                                               |
|               Join Cond: (__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col)                       |
|               ->  Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN]                                                                                                     |
|                     Analyze: timeConsumed=5ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes                                                    |
|                     Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F30 F3X5ZOcJbol ')                                                               |
|                     Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F30 F3X5ZOcJbol ')                                                         |
|               ->  Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN]                                                                                                     |
|                     Analyze: timeConsumed=11ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes                                                   |
|                     Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F88 F40u4JSeGvz ')                                                               |
|                     Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F88 F40u4JSeGvz ')                                                         |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
24 rows in set (0.03 sec)



-- secondary index

mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a89 = '40u4JSeGvz' AND tbl.a31 = '3X5ZOcJbol';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| TP QURERY PLAN                                                                                                                                      |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                             |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                            |
|   ->  Join                                                                                                                                          |
|         Analyze: timeConsumed=0ms waitTime=1ms inputRows=1 outputRows=1 InputSize=24bytes OutputSize=24bytes MemorySize=0bytes                      |
|         Join Type: INDEX                                                                                                                            |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_pri_col)                                        |
|         Runtime Filter Build: #[-1,0]                                                                                                               |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                        |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes |
|               Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                       |
|               Block Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                 |
|               Runtime Filter Probe: tbl.a100                                                                                                        |
|         ->  Table Scan on a.__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6 [ForceOneCN]                                                  |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=74bytes OutputSize=24bytes MemorySize=75bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_idx_col, 'F3X5ZOcJbol ')                 |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_idx_col, 'F3X5ZOcJbol ')           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)

arjunsk avatar Jun 11 '24 17:06 arjunsk