commcare-hq icon indicating copy to clipboard operation
commcare-hq copied to clipboard

Command to refresh data dictionary

Open jingcheng16 opened this issue 10 months ago • 2 comments

Product Description

Technical Summary

Ticket: https://dimagi.atlassian.net/browse/SAAS-17442

This PR will run refresh_data_dictionary_from_app on all domains and all apps, so any case type that is configured in the form builder will be added to Data Dictionary. Thus we can treat Data Dictionary as a source of truth in the future. 🎉

refresh_data_dictionary_from_app is a function we just wrote recently, thus there are some case type create before this function is not synced to data dictionary.

This can be observed by the discrepancy between different functions to get all case types from a domain.

We have at least 3 different functions to get all existing case types. get_case_types_from_apps This use AppES to find all modules's case types. So this should return all case types defined in apps, but won't include any case types that are pre-defined in DD and not referenced in an app yet.

get_case_types_for_domain This use CaseES to find all case types that have at least one existing case, then union with DD case types that are not deprecated(by default). CaseES can return a deprecated DD case types but has at least one existing case. After the command running, the union should be unnecessary if include_deprecated = True, because DD will have all case types

data_dictionary_json_case_types This query CaseType

If I run all three functions on the same domain, I would expect the result from 2nd function is just the same as the result from 3rd function, if there are no case types being deprecated. Both 2nd and 3rd function should contains the result from 1st function

Feature Flag

Safety Assurance

Safety story

There should be no harm to run refresh_data_dictionary_from_app on all apps, because we already run it asynchronously after every app save.

Automated test coverage

QA Plan

Migrations

  • [x] The migrations in this code can be safely applied first independently of the code

Rollback instructions

  • [x] This PR can be reverted after deploy with no further considerations

Labels & Review

  • [x] Risk label is set correctly
  • [x] The set of people pinged as reviewers is appropriate for the level of risk of the change

jingcheng16 avatar May 29 '25 03:05 jingcheng16

ah, i see from the ticket this looks like it's going to be a temporary command as it says "Migration"...is this correct @jingcheng16? how do we expect third parties to update their apps? could we actually create a migration so that it is actually run for them on the next deploy?

biyeun avatar May 29 '25 17:05 biyeun

@biyeun Thanks, just realized maybe I should write a migration, so our self-hoster can also treat DD as a source of truth. Especially I plan to clean up some case type function to treat DD as source of truth. Migration added in 0e5e234f20d8aea8083bff76e3508c9f1896a6b1

jingcheng16 avatar May 29 '25 18:05 jingcheng16

Ah also do you plan to apply this migration manually to avoid blocking the deploy?

gherceg avatar Jun 09 '25 17:06 gherceg

@gherceg Yes, plan to apply manually

jingcheng16 avatar Jun 11 '25 03:06 jingcheng16

@mjriley Hi Matt, logics added to avoid iterate through all domains checking their migration status. Can you review again?

jingcheng16 avatar Jun 11 '25 03:06 jingcheng16

Matt raised a valid point that this refresh_data_dictionary_from_app might never succeed for some old domain or old app. Then we will stuck in. Thus commit d5e06cd9b815ce5959 is definitely a bad idea that might cause infinite loop. Though my intention is, if some connection error happened, we can always retry until it is successful.

As the next step, I think at least I first solve two existing sentry error with the refresh_data_dictionary_from_app this and this.

Though solve this two doesn't guarantee there won't be any weird issue when running migration. So I better set up some max_retry number for the migration.

jingcheng16 avatar Jun 12 '25 17:06 jingcheng16

Close it for now until we fix the two issues I mentioned in the above comment

jingcheng16 avatar Jun 12 '25 19:06 jingcheng16

Once the migration of a domain is skipped or finished, it doesn't need to be add in the domain-to-skip list when run the command again, but if there are too many malformed apps in one domain, it will be a bad user experience to keep retry the command with a long app-id-to-skip list, until the domain's migration finished.

How bad it is depends on how many malformed apps it can be in one domain.

So I will run the command through a private release on prod first, to get an idea of how many malformed apps we have. Then based on the result, we look into how to make the user experience better.

@mjriley Can you sign off on running it on prod?

jingcheng16 avatar Jul 24 '25 18:07 jingcheng16

@mjriley Can you sign off on running it on prod?

Approved

mjriley avatar Jul 24 '25 19:07 mjriley

@mjriley I run the command on production, this is what I found:

  • As long as calling get_apps_in_domain succeeds, Application are wrapped correctly, refresh DD on each app will go smoothly without any error. So we either skip a domain, or we process all apps in a domain. No need to skip any particular apps. But I'll still keep the logic just in case self hoster need to.
  • The only reason get_apps_in_domain failed is due to an app referencing an unsupported CommCare Build version.
  • Most domains that has to be skipped cannot even be accessed anymore. I no longer feel uncomfortable skipping them just due to one malformed app.

I've updated the command to skip domains that has app referencing an unsupported CommCare Build version so user don't have to skip them manually in e9c3b5d8b8025d93b0aa83349cb00786248165ba

Can you review and sign off on me running it on india and eu? Besides, given what I've found from running it on prod, I think this PR is at a good place to be merged. What do you think?

jingcheng16 avatar Jul 25 '25 18:07 jingcheng16

The migration in this PR has been run in all three environments.

jingcheng16 avatar Jul 26 '25 02:07 jingcheng16