ossinsight icon indicating copy to clipboard operation
ossinsight copied to clipboard

Improve the repo star history query

Open Mini256 opened this issue 1 year ago • 0 comments

The query will only return the repo id, get the latest repo name through API or other way. In this way, we can obtain faster query speed by covering index.

From:

WITH acc AS (
    SELECT
        event_month,
        repo_name,
        COUNT(actor_login) OVER(PARTITION BY repo_name ORDER BY event_month ASC) AS total
    FROM (
        SELECT
            event_month,
            actor_login,
            FIRST_VALUE(repo_name) OVER (PARTITION BY repo_id ORDER BY created_at DESC) AS repo_name,
            ROW_NUMBER() OVER(PARTITION BY actor_login) AS row_num
        FROM github_events
        WHERE
            type = 'WatchEvent' AND repo_id = 60246359
    ) prs_with_latest_repo_name
    WHERE row_num = 1
    ORDER BY 1
)
SELECT event_month, ANY_VALUE(repo_name) AS repo_name, ANY_VALUE(total) AS total
FROM acc
GROUP BY 1
ORDER BY 1
;

To:

SELECT
    event_month, repo_id, total
FROM (
    SELECT
        event_month,
        repo_id,
        COUNT(actor_login) OVER(PARTITION BY repo_id ORDER BY event_month ASC) AS total,
        ROW_NUMBER() OVER(PARTITION BY repo_id) AS row_num
    FROM github_events
    WHERE
        type = 'WatchEvent' AND repo_id = 60246359
    ORDER BY event_month
) acc
WHERE row_num = 1
GROUP BY event_month
ORDER BY event_month
;
+---------------------------------+----------+-----------+----------------------------------------------------------------------------------------------------------------------
| id                              | estRows  | task      | access object                                                                                                        
+---------------------------------+----------+-----------+----------------------------------------------------------------------------------------------------------------------
| Projection_29                   | 0.90     | root      |                                                                                                                      
| └─Selection_30                  | 0.90     | root      |                                                                                                                      
|   └─Window_31                   | 1.12     | root      |                                                                                                                      
|     └─Sort_28                   | 1.12     | root      |                                                                                                                      
|       └─Window_19               | 1.12     | root      |                                                                                                                      
|         └─Sort_27               | 48989.63 | root      |                                                                                                                      
|           └─IndexReader_26      | 48989.63 | root      | partition:watch_event                                                                                                
|             └─IndexRangeScan_25 | 48989.63 | cop[tikv] | table:github_events, index:index_github_events_on_repo_id_type_action_month_actor_login(repo_id, type, action, event_
+---------------------------------+----------+-----------+----------------------------------------------------------------------------------------------------------------------

Mini256 avatar Aug 15 '22 08:08 Mini256