
ref(tasks): Fix collect project platforms for snowflake ids

Open maxiuyuan opened this issue 3 years ago • 0 comments

Edit: Abandoned the approach of manually slicing the Project table up by ID ranges. We now paginate queries over the Project table and fetch 1000 IDs at a time.
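A minimal sketch of what fetching 1000 IDs at a time could look like, using keyset pagination (always ask for the next batch of IDs greater than the last one seen). The names `iter_id_batches` and `make_in_memory_fetcher` are hypothetical and the fetcher is an in-memory stand-in; in Django the fetch would presumably be something like `Project.objects.filter(id__gt=last_id).order_by("id").values_list("id", flat=True)[:batch_size]`, but that is an assumption and not necessarily what the actual change does.

```python
from typing import Callable, Iterable, Iterator, List


def make_in_memory_fetcher(all_ids: Iterable[int]) -> Callable[[int, int], List[int]]:
    """Hypothetical stand-in for a database query over the Project table."""
    ordered = sorted(all_ids)

    def fetch(after_id: int, limit: int) -> List[int]:
        # Return up to `limit` IDs strictly greater than `after_id`, ascending.
        return [i for i in ordered if i > after_id][:limit]

    return fetch


def iter_id_batches(
    fetch_ids_after: Callable[[int, int], List[int]], batch_size: int = 1000
) -> Iterator[List[int]]:
    """Yield batches of IDs in ascending order until none are left."""
    last_id = -1  # assumes all IDs are non-negative
    while True:
        batch = fetch_ids_after(last_id, batch_size)
        if not batch:
            return
        yield batch
        # Resume after the largest ID seen so far, however large the gap is.
        last_id = batch[-1]


# Works the same for small sequential IDs and huge snowflake-sized IDs.
fetcher = make_in_memory_fetcher([1, 2, 3, 10**15, 10**15 + 5])
batches = list(iter_id_batches(fetcher, batch_size=2))
```

The number of queries here depends only on how many projects exist, not on how large the IDs are, which is why gaps between IDs stop mattering.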

Snowflake IDs currently break the loop in collect_project_platforms in src/sentry/tasks/collect_project_platforms.py. The trouble comes from while min_project_id <= max_project_id, because with snowflake IDs our max_project_id becomes massive. We can divide the projects into two groups: those created before snowflake IDs and those created after. For the ones created before, filtering like the following makes sense, since 1 step is equivalent to 1 project:

project__gte=min_project_id,
project__lt=min_project_id + step,

With snowflake IDs, however, the current filtering method no longer makes sense: a step of 1000 would now represent just 1000 seconds, and there could be 0 projects in that timeframe.
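To make the cost concrete, here is a rough sketch that counts how many fixed-size windows a loop of the form while min_project_id <= max_project_id would visit, and how many of those windows actually contain a project. The function names, the starting ID of 0, and the sample IDs are assumptions chosen for illustration, not values from the Sentry codebase.

```python
from typing import Iterable


def count_windows(min_id: int, max_id: int, step: int = 1000) -> int:
    """Number of iterations of: while min_id <= max_id: ...; min_id += step."""
    if max_id < min_id:
        return 0
    return (max_id - min_id) // step + 1


def count_nonempty_windows(ids: Iterable[int], min_id: int, step: int = 1000) -> int:
    """Number of windows [min_id + k*step, min_id + (k+1)*step) holding at least one ID."""
    return len({(i - min_id) // step for i in ids})


# Three sequential IDs plus two much larger IDs standing in for snowflake IDs.
sample_ids = [1, 2, 3, 5_000_000, 5_000_001]
total = count_windows(0, max(sample_ids))
nonempty = count_nonempty_windows(sample_ids, 0)
```

In this toy case the loop visits 5001 windows but only 2 of them contain any project; with real snowflake-sized IDs the gap would be far larger.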

The changes will keep the current method for pre-snowflake-ID projects. For projects created after snowflake IDs were introduced, we will change the step to 1/1000th of the time span between the first and last snowflake ID.
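As a sketch of the range-slicing idea described here, the following splits the span between the first and last snowflake ID into roughly 1000 half-open windows, matching the project__gte / project__lt filter shape. The function name `snowflake_windows` and the example IDs are hypothetical.

```python
from typing import Iterator, Tuple


def snowflake_windows(
    first_id: int, last_id: int, parts: int = 1000
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) windows, each meaning start <= id < end."""
    # Each step is 1/parts of the whole span; never let it drop to zero.
    step = max(1, (last_id - first_id) // parts)
    start = first_id
    while start <= last_id:
        yield (start, start + step)
        start += step


# Hypothetical first and last snowflake IDs.
windows = list(snowflake_windows(10**15, 10**15 + 1_000_000))
```

The window count stays near 1000 no matter how far apart the first and last IDs are, though individual windows can still be empty or very crowded.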

maxiuyuan, Jul 29 '22 21:07