edx-analytics-pipeline

Tasks for loading GA data into Snowflake (PART 2)

Open • pwnage101 opened this issue 5 years ago • 1 comment

This is the second part of the work to add a pipeline to load GA360 data into Snowflake.

This part of the code change DOES depend on a Luigi upgrade, and is NOT ready for merging until we have done the following things:

  • Upgrade luigi/boto and test all the rest of the pipelines for regressions.
    • Rebase the edx-specific changes in our current luigi fork on top of luigi>=2.7.6
  • Upgrade the BigQuery loading tasks to take advantage of new API methods in google-cloud-bigquery==1.11.2
    • Instructions here: https://cloud.google.com/bigquery/docs/python-client-migration
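
As a rough illustration of the version prerequisites above, here is a minimal sketch of a pre-merge check that the environment meets the two pins named in this description (luigi>=2.7.6 and google-cloud-bigquery==1.11.2). It is not part of this PR: the helper names (`parse_version`, `satisfies`, `unmet_prerequisites`) are hypothetical, it requires Python 3.8+ for `importlib.metadata`, and it assumes purely numeric dotted version strings, which may not hold for a forked luigi build.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the prerequisites list; everything else here is illustrative.
REQUIRED = {
    "luigi": (">=", "2.7.6"),
    "google-cloud-bigquery": ("==", "1.11.2"),
}


def parse_version(text):
    """Turn '2.7.6' into (2, 7, 6). Assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, operator, required):
    """Check one installed version string against a '>=' or '==' pin."""
    have, want = parse_version(installed), parse_version(required)
    if operator == ">=":
        return have >= want
    if operator == "==":
        return have == want
    raise ValueError("unsupported operator: %s" % operator)


def unmet_prerequisites(installed_versions=None):
    """Return a list of problems; an empty list means every pin is met.

    installed_versions is an optional {package: version} mapping. When it is
    omitted, versions are read from the current Python environment.
    """
    problems = []
    for package, (operator, required) in REQUIRED.items():
        if installed_versions is not None:
            installed = installed_versions.get(package)
        else:
            try:
                installed = version(package)
            except PackageNotFoundError:
                installed = None
        if installed is None:
            problems.append("%s is not installed" % package)
        elif not satisfies(installed, operator, required):
            problems.append(
                "%s %s does not satisfy %s%s"
                % (package, installed, operator, required)
            )
    return problems


if __name__ == "__main__":
    for problem in unmet_prerequisites():
        print(problem)
```

Run as a script, it prints one line per unmet pin and nothing when both are satisfied. Passing an explicit mapping makes the check usable without installing either package.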

Other PRs:

  • Code for Part 1: https://github.com/edx/edx-analytics-pipeline/pull/721
  • Code for Part 2: https://github.com/edx/edx-analytics-pipeline/pull/722 (this PR)
  • Config for Part 1: https://github.com/edx-ops/analytics-secure/pull/237
  • Config for Part 2: https://github.com/edx-ops/analytics-secure/pull/238

Analytics Pipeline Pull Request

Make sure that the following steps are done before merging:

  • [ ] If you have a migration, please contact the data engineering team before merging.
  • [ ] Before merging, run the full acceptance test suite and provide the URL of the acceptance test run.
  • [ ] A member of the data engineering team has approved the pull request.

pwnage101 • Apr 19 '19 16:04

This PR is now really old. You have a few more old PRs in flight:
https://github.com/pulls?q=is%3Aopen+is%3Apr+archived%3Afalse+author%3Apwnage101+sort%3Aupdated-asc+org%3Aedx+org%3Aopenedx
Do you want to keep them all open?

nedbat • Jan 09 '24 17:01