readthedocs.org icon indicating copy to clipboard operation
readthedocs.org copied to clipboard

DB: squash migrations

Open humitos opened this issue 1 year ago • 8 comments

When running tests locally we are executing 252 migrations that consume 33.43 seconds. This is tedious when running just a single test. I think we can reduce this time by squashing all the migrations so we have only one migration per application. What do you think?

Documentation: https://docs.djangoproject.com/en/4.1/topics/migrations/#migration-squashing

  Applying contenttypes.0001_initial... OK (0.005s)
  Applying auth.0001_initial... OK (0.013s)
  Applying account.0001_initial... OK (0.013s)
  Applying account.0002_email_max_length... OK (0.008s)
  Applying admin.0001_initial... OK (0.008s)
  Applying admin.0002_logentry_remove_auto_add... OK (0.010s)
  Applying admin.0003_logentry_add_action_flag_choices... OK (0.009s)
  Applying taggit.0001_initial... OK (0.011s)
  Applying projects.0001_initial... OK (0.064s)
  Applying builds.0001_initial... OK (0.144s)
  Applying projects.0002_add_importedfile_model... OK (0.016s)
  Applying projects.0003_project_cdn_enabled... OK (0.019s)
  Applying projects.0004_add_project_container_image... OK (0.020s)
  Applying projects.0005_sync_project_model... OK (0.051s)
  Applying projects.0006_add_domain_models... OK (0.017s)
  Applying projects.0007_migrate_canonical_data... OK (0.017s)
  Applying projects.0008_add_subproject_alias_prefix... OK (0.015s)
  Applying projects.0009_add_domain_field... OK (0.074s)
  Applying projects.0010_migrate_domain_data... OK (0.036s)
  Applying projects.0011_delete-url... OK (0.015s)
  Applying projects.0012_proper-name-for-install-project... OK (0.053s)
  Applying projects.0013_add-container-limits... OK (0.060s)
  Applying projects.0014_add-state-tracking... OK (0.040s)
  Applying projects.0015_add_project_allow_promos... OK (0.021s)
  Applying projects.0016_build-queue-name... OK (0.018s)
  Applying projects.0017_add_domain_https... OK (0.015s)
  Applying projects.0018_fix-translation-model... OK (0.018s)
  Applying projects.0019_add-features... OK (0.020s)
  Applying projects.0020_add-api-project-proxy... OK (0.002s)
  Applying projects.0021_add-webhook-deprecation-feature... OK (0.018s)
  Applying projects.0022_add-alias-slug... OK (0.021s)
  Applying projects.0023_migrate-alias-slug... OK (0.107s)
  Applying projects.0024_add-show-version-warning... OK (0.344s)
  Applying projects.0025_show-version-warning-existing-projects... OK (0.017s)
  Applying projects.0026_ad-free-option... OK (0.020s)
  Applying projects.0027_remove_json_with_html_feature... OK (0.017s)
  Applying projects.0028_remove_comments_and_update_old_migration... OK (0.145s)
  Applying projects.0029_add_additional_languages... OK (0.023s)
  Applying projects.0030_change-max-length-project-slug... OK (0.161s)
  Applying projects.0031_add_modified_date_importedfile... OK (0.017s)
  Applying projects.0032_increase_webhook_maxsize... OK (0.018s)
  Applying projects.0033_add_environment_variables... OK (0.024s)
  Applying projects.0034_remove_unused_project_model_fields... OK (0.195s)
  Applying projects.0035_container_time_limit_as_integer... OK (0.023s)
  Applying projects.0036_remove-auto-doctype... OK (0.045s)
  Applying projects.0037_add_htmlfile... OK (0.002s)
  Applying projects.0038_change-default-python-interpreter... OK (0.025s)
  Applying projects.0039_update-doctype-helptext... OK (0.019s)
  Applying projects.0040_increase_path_max_length... OK (0.018s)
  Applying projects.0041_index-repo-field... OK (0.023s)
  Applying projects.0042_increase_env_variable_value_max_length... OK (0.017s)
  Applying projects.0043_add-build-field... OK (0.017s)
  Applying projects.0044_auto_20190703_1300... OK (0.019s)
  Applying projects.0045_project_max_concurrent_builds... OK (0.021s)
  Applying projects.0046_domain_hsts_fields... OK (0.049s)
  Applying projects.0047_webhook_url_set_blank_default... OK (0.014s)
  Applying projects.0048_remove_version_privacy_field... OK (0.021s)
  Applying projects.0049_add_external_build_enabled... OK (0.112s)
  Applying projects.0050_migrate_external_builds... OK (0.018s)
  Applying projects.0051_project_urlconf_feature... OK (0.022s)
  Applying projects.0052_migrate_null_external_builds_field... OK (0.018s)
  Applying projects.0053_make_external_builds_field_not_null... OK (0.023s)
  Applying projects.0054_urlconf_blank... OK (0.018s)
  Applying projects.0055_change_help_text_description... OK (0.019s)
  Applying projects.0056_add_disable_analytics... OK (0.022s)
  Applying projects.0057_add_page_rank... OK (0.017s)
  Applying projects.0058_update_timestamp_fields... OK (0.014s)
  Applying projects.0059_migrate_null_rank... OK (0.018s)
  Applying projects.0060_make_rank_not_null... OK (0.018s)
  Applying projects.0061_add_imported_file_ignore... OK (0.018s)
  Applying projects.0062_add_ssl_status... OK (0.018s)
  Applying projects.0063_extend_domain_from_timestamp_model... OK (0.035s)
  Applying projects.0064_add_feature_future_default_true... OK (0.031s)
  Applying projects.0065_add_feature_future_default_true... OK (0.019s)
  Applying projects.0066_make_imported_file_slug_nullable... OK (0.019s)
  Applying projects.0067_change_max_length_feature_id... OK (0.017s)
  Applying projects.0068_remove_slug_field... OK (0.018s)
  Applying projects.0069_migrate_protected_projects... OK (0.019s)
  Applying projects.0070_make_md5_field_nullable... OK (0.018s)
  Applying projects.0071_add_env_var_privacy... OK (0.016s)
  Applying projects.0072_remove_md5_field... OK (0.017s)
  Applying projects.0073_remove_protected_privacy_level... OK (0.024s)
  Applying projects.0074_backport_indexes... OK (0.047s)
  Applying projects.0075_change_mkdocs_name... OK (0.023s)
  Applying sites.0001_initial... OK (0.003s)
  Applying socialaccount.0001_initial... OK (0.062s)
  Applying socialaccount.0002_token_max_lengths... OK (0.030s)
  Applying socialaccount.0003_extra_data_default_dict... OK (0.013s)
  Applying oauth.0001_initial... OK (0.219s)
  Applying oauth.0002_combine_services... OK (0.057s)
  Applying oauth.0003_move_github... OK (0.030s)
  Applying oauth.0004_drop_github_and_bitbucket_models... OK (0.186s)
  Applying oauth.0005_add_account_relation... OK (0.055s)
  Applying oauth.0006_move_oauth_source... OK (0.048s)
  Applying oauth.0007_org_slug_nonunique... OK (0.052s)
  Applying oauth.0008_add-project-relation... OK (0.030s)
  Applying oauth.0009_add_missing_model_change_migrations... OK (0.164s)
  Applying oauth.0010_index_full_name... OK (0.026s)
  Applying oauth.0011_add_default_branch... OK (0.025s)
  Applying oauth.0012_create_new_table_for_remote_organization_normalization... OK (0.088s)
  Applying oauth.0013_create_new_table_for_remote_repository_normalization... OK (0.098s)
  Applying projects.0076_project_remote_repository... OK (0.030s)
  Applying projects.0077_remote_repository_data_migration... OK (0.027s)
  Applying projects.0078_add_external_builds_privacy_level_field... OK (0.051s)
  Applying projects.0079_httpheader... OK (0.026s)
  Applying oauth.0014_remove_remoterepository_project... OK (0.030s)
  Applying projects.0080_historicalproject... OK (0.030s)
  Applying projects.0081_add_another_header... OK (0.011s)
  Applying projects.0082_add_extra_history_fields... OK (0.059s)
  Applying projects.0083_init_generic_webhooks... OK (0.274s)
  Applying projects.0084_create_webhook_events... OK (0.030s)
  Applying projects.0085_subscribe_old_webhooks_to_events... OK (0.032s)
  Applying projects.0086_is_spam... OK (0.065s)
  Applying projects.0087_use_booleanfield_null... OK (0.068s)
  Applying contenttypes.0002_remove_content_type_name... OK (0.050s)
  Applying builds.0002_build_command_initial... OK (0.031s)
  Applying builds.0003_add-cold-storage... OK (0.025s)
  Applying builds.0004_add-apiversion-proxy-model... OK (0.003s)
  Applying builds.0005_remove-version-alias... OK (0.195s)
  Applying builds.0006_add_config_field... OK (0.024s)
  Applying builds.0007_add-automation-rules... OK (0.154s)
  Applying builds.0008_remove-version-tags... OK (0.030s)
  Applying builds.0009_added_external_version_type... OK (0.049s)
  Applying builds.0010_add-description-field-to-automation-rule... OK (0.026s)
  Applying builds.0011_version-media-availability... OK (0.067s)
  Applying builds.0012_add-predefined-match-arg-field... OK (0.073s)
  Applying builds.0013_version_documentation_type... OK (0.023s)
  Applying builds.0014_migrate-doctype-from-project-to-version... OK (0.032s)
  Applying builds.0015_uploading_build_state... OK (0.026s)
  Applying builds.0016_add_mkdocs_html_doctype... OK (0.023s)
  Applying builds.0017_builds_deterministic_order_index... OK (0.020s)
  Applying builds.0018_add_hidden_field_to_version... OK (0.023s)
  Applying builds.0019_migrate_protected_versions_to_hidden... OK (0.031s)
  Applying builds.0020_migrate_null_hidden_field... OK (0.031s)
  Applying builds.0021_make_hidden_field_not_null... OK (0.025s)
  Applying builds.0022_migrate_protected_versions... OK (0.032s)
  Applying builds.0023_add_status_code... OK (0.134s)
  Applying builds.0024_status_code_choices... OK (0.050s)
  Applying builds.0025_migrate_private_versions... OK (0.031s)
  Applying builds.0026_add_hide_version_automation_rule... OK (0.027s)
  Applying builds.0027_add_privacy_level_automation_rules... OK (0.028s)
  Applying builds.0028_add_delete_version_action... OK (0.027s)
  Applying builds.0029_add_time_fields... OK (0.049s)
  Applying builds.0030_add_automation_rule_matches... OK (0.033s)
  Applying builds.0031_add_version_fields_to_build... OK (0.076s)
  Applying builds.0032_migrate_version_data_to_build... OK (0.033s)
  Applying builds.0033_dont_cascade_delete_builds... OK (0.032s)
  Applying builds.0034_remove_protected_privacy_level... OK (0.025s)
  Applying builds.0035_backport_indexes... OK (0.070s)
  Applying builds.0036_change_mkdocs_name... OK (0.024s)
  Applying builds.0037_alter_build_cold_storage... OK (0.026s)
  Applying builds.0038_add_new_jsonfields... OK (0.026s)
  Applying builds.0039_migrate_config_data... OK (0.031s)
  Applying builds.0040_remove_old_jsonfields... OK (0.156s)
  Applying builds.0041_track_task_id... OK (0.027s)
  Applying analytics.0001_initial... OK (0.034s)
  Applying analytics.0002_track_status_code... OK (0.149s)
  Applying analytics.0003_remove_index... OK (0.026s)
  Applying analytics.0004_merge_duplicate_records... OK (0.034s)
  Applying analytics.0005_add_unique_constraint... OK (0.025s)
  Applying organizations.0001_squashed... OK (0.801s)
  Applying organizations.0002_update_meta_options... OK (0.060s)
  Applying organizations.0003_team_auto_join_email_users... OK (0.037s)
  Applying organizations.0004_organization_max_concurrent_builds... OK (0.036s)
  Applying organizations.0005_historicalorganization_historicalteam... OK (0.193s)
  Applying organizations.0006_add_assets_cleaned... OK (0.066s)
  Applying audit.0001_initial... OK (0.040s)
  Applying audit.0002_add_organization... OK (0.121s)
  Applying audit.0003_update_ordering... OK (0.033s)
  Applying audit.0004_change_ip_field_type... OK (0.040s)
  Applying audit.0005_migrate_ip_field_values... OK (0.040s)
  Applying audit.0006_add_download_action... OK (0.039s)
  Applying auth.0002_alter_permission_name_max_length... OK (0.043s)
  Applying auth.0003_alter_user_email_max_length... OK (0.043s)
  Applying auth.0004_alter_user_username_opts... OK (0.152s)
  Applying auth.0005_alter_user_last_login_null... OK (0.044s)
  Applying auth.0006_require_contenttypes_0002... OK (0.001s)
  Applying auth.0007_alter_validators_add_error_messages... OK (0.040s)
  Applying auth.0008_alter_user_username_max_length... OK (0.045s)
  Applying auth.0009_alter_user_last_name_max_length... OK (0.044s)
  Applying auth.0010_alter_group_name_max_length... OK (0.045s)
  Applying auth.0011_update_proxy_permissions... OK (0.051s)
  Applying auth.0012_alter_user_first_name_max_length... OK (0.044s)
  Applying authtoken.0001_initial... OK (0.040s)
  Applying authtoken.0002_auto_20160226_1747... OK (0.110s)
  Applying authtoken.0003_tokenproxy... OK (0.003s)
  Applying builds.0042_version_state... OK (0.069s)
  Applying builds.0043_add_cancelled_state... OK (0.141s)
  Applying builds.0044_alter_version_documentation_type... OK (0.030s)
  Applying builds.0045_identifier_null... OK (0.001s)
  Applying core.0001_initial... OK (0.040s)
  Applying core.0002_make_userprofile_user_a_onetoonefield... OK (0.044s)
  Applying core.0003_add_banned_status... OK (0.028s)
  Applying core.0004_ad-opt-out... OK (0.071s)
  Applying core.0005_migrate-old-passwords... OK (0.041s)
  Applying core.0006_remove_userprofile_allow_email... OK (0.027s)
  Applying core.0007_historicaluser... OK (0.042s)
  Applying core.0008_add_extra_history_fields... OK (0.061s)
  Applying core.0009_historicaluserprofile... OK (0.043s)
  Applying core.0010_add_time_fields... OK (0.257s)
  Applying core.0011_alter_historicaluser_first_name... OK (0.033s)
  Applying corsheaders.0001_initial... OK (0.003s)
  Applying djstripe.0001_initial... OK (4.410s)
  Applying djstripe.0007_2_4... OK (9.757s)
  Applying djstripe.0008_2_5... OK (1.367s)
  Applying djstripe.0009_2_6... OK (2.493s)
  Applying djstripe.0010_alter_customer_balance... OK (0.093s)
  Applying gold.0001_initial... OK (0.102s)
  Applying gold.0002_rename_last_4_digits... OK (0.057s)
  Applying gold.0003_add_missing_model_change_migrations... OK (0.053s)
  Applying gold.0004_add_vat_id... OK (0.051s)
  Applying gold.0005_last_4_digits_null... OK (0.052s)
  Applying gold.0006_remove_old_payments_field... OK (0.098s)
  Applying integrations.0001_add_http_exchange... OK (0.101s)
  Applying integrations.0002_add-webhook... OK (0.227s)
  Applying integrations.0003_add_missing_model_change_migrations... OK (0.054s)
  Applying integrations.0004_add_integration_secret... OK (0.039s)
  Applying integrations.0005_change_default_integration_secret... OK (0.039s)
  Applying integrations.0006_set-default-value-provider-data... OK (0.040s)
  Applying integrations.0007_update-provider-data... OK (0.102s)
  Applying integrations.0008_add_new_jsonfields... OK (0.075s)
  Applying integrations.0009_migrate_headers_data... OK (0.106s)
  Applying integrations.0010_remove_old_jsonfields... OK (0.168s)
  Applying messages_extends.0001_initial... OK (0.223s)
  Applying oauth.0015_increase_avatar_url_length... OK (0.086s)
  Applying organizations.0007_add_extra_history_fields... OK (0.176s)
  Applying projects.0088_domain_field_edits... OK (0.088s)
  Applying projects.0089_update_help_text... OK (0.147s)
  Applying projects.0090_default_branch_helptext... OK (0.279s)
  Applying redirects.0001_initial... OK (0.107s)
  Applying redirects.0002_add_missing_model_change_migrations... OK (0.071s)
  Applying redirects.0003_add_default_redirect_http_status_to_302... OK (0.142s)
  Applying redirects.0004_denormalize-from-url... OK (0.143s)
  Applying redirects.0005_allow_to_force_redirects... OK (0.041s)
  Applying search.0001_initial... OK (0.243s)
  Applying search.0002_add_total_results_field... OK (0.040s)
  Applying search.0003_migrate_total_results_null_values... OK (0.104s)
  Applying search.0004_make_total_results_not_null... OK (0.041s)
  Applying sessions.0001_initial... OK (0.005s)
  Applying sites.0002_alter_domain_unique... OK (0.104s)
  Applying sphinx_domains.0001_initial... OK (0.107s)
  Applying sphinx_domains.0002_increase_max_length... OK (0.166s)
  Applying sphinx_domains.0003_additional_indexed_fields... OK (0.085s)
  Applying sphinx_domains.0004_add-build-field... OK (0.042s)
  Applying sphinx_domains.0005_htmlfile-sphinx-domain-integration... OK (0.265s)
  Applying sphinx_domains.0006_update_timestamp_fields... OK (0.032s)
  Applying sso.0001_squashed... OK (0.113s)
  Applying subscriptions.0001_squashed... OK (0.587s)
  Applying taggit.0002_auto_20150616_2121... OK (0.013s)
  Applying taggit.0003_taggeditem_add_unique_index... OK (0.013s)
  Applying taggit.0004_alter_taggeditem_content_type_alter_taggeditem_tag... OK (0.225s)
  Applying taggit.0005_auto_20220424_2025... OK (0.120s)
  Applying telemetry.0001_initial... OK (0.005s)

humitos avatar Aug 17 '22 10:08 humitos

In fact, pytest shows that we are running these migrations twice. I'm not sure yet why this is happening. Currently, running 1 single test consumes 1m8.35s.

humitos avatar Aug 17 '22 10:08 humitos

Commenting the telemetry database, these migrations are run just once and the same test consumes 0m41s. 27s less than using the telemetry database: https://github.com/readthedocs/readthedocs.org/blob/bdf154a35ab40343a3897b9db99cd09a2e1ac67f/readthedocs/settings/dev.py#L22-L25

Why these migrations are run in both databases? In telemetry database we only need migrations from readthedocs.telemetry application. cc @stsewd

humitos avatar Aug 17 '22 11:08 humitos

Why these migrations are run in both databases? In telemetry database we only need migrations from readthedocs.telemetry

Only the migrations from the telemetry app are run, the others are skipped, I assume they are faked, since you'll always see those migrations as applied, but the DB has only the telemetry tables.

stsewd avatar Aug 22 '22 22:08 stsewd

This could potentially be a low-hanging fruit. Have your tried quickly to squash @humitos? The typical issue would be that data migrations (stuff with RunPython) needs to be manually fixed in the auto-created squash migration.

benjaoming avatar Aug 23 '22 14:08 benjaoming

@benjaoming yeah, I did a quick try and I found that we have some RunPython code on the migrations. However, I think most of them could be omitted in the resulting squashed migration since this code is not run in production.

The immediate main issue that I found was the inter-dependency order the migrations need to be run, meaning that we can't squash them all into just one migration per application. I'm not sure how to easily find where they should be split to make this dependency order work fine.

humitos avatar Aug 23 '22 15:08 humitos

@stsewd

Only the migrations from the telemetry app are run, the others are skipped, I assume they are faked, since you'll always see those migrations as applied, but the DB has only the telemetry tables.

There is something else happening behind the scenes, because commenting the telemetry database the execution time for the same single test is reduced in 27 seconds.

The verbose output of tox/pytest (tox -e py310 --verbose -- -vvv -s -k test_revoke_expired_invitation) shows the migrations are effectively run twice with the current configuration and you can see the time each one of them takes next to the name of the migration.

Another thing that I found is that djstripe migrations take about 20s

So, there are some work required here that are related but they don't depend each other:

  1. [ ] do not run readthedocs application migrations on telemetry database
  2. [ ] avoid running djstripe migrations on tests
  3. [ ] squash migrations as much as possible to reduce execution time

humitos avatar Aug 23 '22 16:08 humitos

The verbose output of tox/pytest (tox -e py310 --verbose -- -vvv -s -k test_revoke_expired_invitation) shows the migrations are effectively run twice with the current configuration and you can see the time each one of them takes next to the name of the migration.

Yes, they appear as "run", but only the migrations from the telemetry app are in the telemetry db. This same behavior happens when you first create the docker container for development, all migrations appear in the output as applied, but they aren't really applied in the telemetry db. And I just checked the telemetry db inside a test, and only the telemetry table is there. So I guess django still takes some time to fake those migrations.

stsewd avatar Aug 23 '22 18:08 stsewd

Yeah, I'm not saying the tables are created on the telemetry database. I'm saying that the migrations are run twice and consuming the double of time they should

humitos avatar Aug 23 '22 18:08 humitos

I usually run tox without the migrations, which should solve this for the most part: https://pytest-django.readthedocs.io/en/latest/database.html#no-migrations-disable-django-migrations. Not necessarily a solution, but at least a workaround.

ericholscher avatar Nov 21 '22 23:11 ericholscher

@ericholscher I've been testing the --nomigrations suggestion here. It works super fast. However, there are cases where it leads to different results (i.e. all my tests were passing locally, but they failed on CI).

humitos avatar Jan 25 '23 08:01 humitos

Definitely all the migrations are run one time per database configured. This is how Django works with multiple databases. Then, while running Django decide if the migrations has to be applied to the database or not; but it's run. That's why we are seeing the migrations ran twice but not all tables are not created in telemetry db.

Applying djstripe.0007_2_4... OK (11.026s)
...
Applying djstripe.0007_2_4... OK (11.014s)

Just by running those 2 djstripe migrations, we are wasting 22s.

humitos avatar Jan 25 '23 08:01 humitos