Places Dry run Migration [v107]: Implements history migration and allows dry-run

Open tarikeshaq opened this issue 2 years ago • 3 comments

Draft to prepare us for an experiment I'm planning to run in 107 to check how long a history migration takes.

This shouldn't land until:

  • we have a release with https://github.com/mozilla/application-services/pull/5077
  • we add Glean metrics to record success and failure of the migration (including how long it takes)

cc @nbhasin2

tarikeshaq avatar Sep 12 '22 19:09 tarikeshaq

This pull request has conflicts when rebasing. Could you fix it @tarikeshaq?

mergify[bot] avatar Sep 12 '22 19:09 mergify[bot]

I want to add that the Nimbus feature will pretty much only activate migrations on the second run, but that is OK.

tarikeshaq avatar Sep 12 '22 19:09 tarikeshaq

This pull request has conflicts when rebasing. Could you fix it @tarikeshaq?

mergify[bot] avatar Sep 20 '22 17:09 mergify[bot]

This is getting close to being ready. I just need to do the data review, and then it'll be ready for code review. I'll fill out the form, hoping we can land this after the all-hands!

tarikeshaq avatar Sep 22 '22 21:09 tarikeshaq

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?
  • How long migrations take, as a timing distribution
  • The number of visits expected to migrate
  • The number of visits actually migrated (and thus a migration rate)
  • How often migrations fail, and the reason they do
  • How often the app dies before the migration is done
  2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:
  • We need to answer those to ensure a safe delivery of the migration to users
  • The migration is needed because the work to integrate the places component will shift us to a maintained implementation of history storage and sync
  • The migration is slightly risky since we are unsure how long it takes, and it might take too long. If it does, we want to know before we ship the actual feature.
  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?

Locally measuring how long it takes to migrate.

  • That is not a reasonable estimate as users' devices and their history data vary.
  4. Can current instrumentation answer these questions?

No. The migration is new and does not have any telemetry or data instrumentation.

  5. List all proposed measurements and indicate the category of data collection for each measurement, using the [Firefox data collection categories](https://wiki.mozilla.org/Data_Collection) found on the Mozilla wiki.

For the implementation, look at the pull request.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| Success and performance data of the migration to the new application-services places component | Technical Data | https://mozilla-hub.atlassian.net/browse/SYNC-3308 |

  6. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.

This collection is documented in the Glean Dictionary at https://dictionary.telemetry.mozilla.org/

  7. How long will this data be collected? Choose one of the following:
  • The data will be collected for a year (until at least October 17th, 2023) and I, Tarik Eshaq (@tarikeshaq: [email protected]), will monitor it.
  8. What populations will you measure?

All channels in all countries in all locales, for Firefox iOS users who haven't undergone the migration yet.

  9. If this data collection is default on, what is the opt-out mechanism for users?

Standard Firefox iOS telemetry controls; users can opt out in the settings menu.

  10. Please provide a general description of how you will analyze this data.

Observe the timing distribution of the migration times in a dry-run, along with success rates, to determine how reasonable it is to ship a proper migration.

  11. Where do you intend to share the results of your analysis?

The Sync and Firefox iOS teams

  12. Is there a third-party tool (i.e. not Glean or Telemetry) that you are proposing to use for this data collection? If so:

No.
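
To make the analysis plan in this form more concrete, here is a minimal sketch of how dry-run reports could be summarized into a timing distribution, a success rate, and a visit migration rate. It is not taken from the pull request: the `DryRunSample` fields, the `summarize` helper, the `"timeout"` failure reason, and the sample values are all hypothetical, and the real numbers would come from the Glean metrics rather than hand-built records.

```python
from dataclasses import dataclass
from statistics import median, quantiles
from typing import Optional


@dataclass
class DryRunSample:
    """One hypothetical dry-run report; field names are illustrative only."""
    duration_ms: float            # how long the dry-run migration took
    visits_expected: int          # visits the migration expected to move
    visits_migrated: int          # visits it actually moved
    error: Optional[str] = None   # failure reason, or None on success


def summarize(samples):
    """Summarize timing, success rate, and visit migration rate."""
    if not samples:
        raise ValueError("need at least one sample")
    durations = sorted(s.duration_ms for s in samples)
    succeeded = [s for s in samples if s.error is None]
    expected = sum(s.visits_expected for s in samples)
    migrated = sum(s.visits_migrated for s in samples)
    # quantiles() needs at least two data points; fall back to the single value.
    if len(durations) > 1:
        p95 = quantiles(durations, n=20, method="inclusive")[-1]
    else:
        p95 = durations[0]
    return {
        "median_ms": median(durations),
        "p95_ms": p95,
        "success_rate": len(succeeded) / len(samples),
        "visit_migration_rate": migrated / expected if expected else None,
    }


# Made-up sample data, purely for illustration.
samples = [
    DryRunSample(1200, 1000, 1000),
    DryRunSample(800, 500, 500),
    DryRunSample(4000, 2000, 1500, error="timeout"),
    DryRunSample(1000, 500, 500),
]
summary = summarize(samples)
print(summary)
```

A high 95th-percentile duration or a low visit migration rate in a summary like this would be the kind of signal that argues against shipping the real migration as-is.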

tarikeshaq avatar Oct 11 '22 14:10 tarikeshaq

Data Review

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, through the metrics.yaml file and the Glean Dictionary.

  2. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, through the "Send Usage Data" preference in the application settings.

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?

N/A, collection to end 2023-10-17

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical Data

  5. Is the data collection request for default-on or default-off?

Default-on

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No

  7. Is the data collection covered by the existing Firefox privacy notice?

Yes

  8. Does the data collection use a third-party collection tool?

No

Result

data-review+

travis79 avatar Oct 11 '22 14:10 travis79

@lougeniaC64 @mhammond @nbhasin2 this is ready for review

tarikeshaq avatar Oct 11 '22 17:10 tarikeshaq

hmm dunno why GitHub removed the review requests from @lougeniaC64 and @mhammond, but that was unintentional, I still would value your thoughts here if you have any!

tarikeshaq avatar Oct 12 '22 15:10 tarikeshaq

> hmm dunno why GitHub removed the review requests from @lougeniaC64 and @mhammond, but that was unintentional, I still would value your thoughts here if you have any!

@tarikeshaq I'll be reviewing this today. If I'm not listed as an official reviewer when that happens, I'll leave a comment.

lougeniaC64 avatar Oct 12 '22 15:10 lougeniaC64

Left one small comment but otherwise LGTM!

lougeniaC64 avatar Oct 12 '22 22:10 lougeniaC64

Build is green here. [Screenshot: Screen Shot 2022-10-14 at 12 54 24 PM]

lmarceau avatar Oct 14 '22 16:10 lmarceau