edx-analytics-pipeline
Tasks for loading GA data into Snowflake (PART 2)
This is the second part of the work to add a pipeline to load GA360 data into Snowflake.
This part of the code change DOES depend on a Luigi upgrade, and is NOT ready for merging until we have done the following things:
- Upgrade luigi/boto and test the rest of the pipelines for regressions.
  - edx-specific changes in our current luigi fork are rebased on top of `luigi>=2.7.6`
- Upgrade the BigQuery loading tasks to take advantage of new API methods in `google-cloud-bigquery==1.11.2`
  - Instructions here: https://cloud.google.com/bigquery/docs/python-client-migration
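
For reviewers unfamiliar with the newer client, here is a minimal sketch of what a migrated load step might look like. It is not the code in this PR. The helper name `load_uri_into_table` and all of its parameters are hypothetical. The client and job config are passed in, so the helper itself imports nothing from `google-cloud-bigquery`. It assumes the client exposes `dataset()`, `load_table_from_uri()` and a blocking `result()` on the returned job, as the 1.x releases do.

```python
def load_uri_into_table(client, job_config, source_uri, dataset_id, table_id):
    """Start a load job from a Cloud Storage URI and wait for it to finish.

    Hypothetical helper for illustration only. `client` is expected to be a
    google.cloud.bigquery.Client and `job_config` a
    google.cloud.bigquery.LoadJobConfig (both from the 1.x client).
    """
    # Build a reference to the destination table.
    table_ref = client.dataset(dataset_id).table(table_id)

    # Submit the load job; the call returns as soon as the job is created.
    load_job = client.load_table_from_uri(
        source_uri, table_ref, job_config=job_config
    )

    # Block until the job completes; result() raises if the job failed.
    load_job.result()
    return load_job


# Example wiring (assumed names; requires google-cloud-bigquery and credentials):
#
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   config = bigquery.LoadJobConfig()
#   config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
#   config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE
#   load_uri_into_table(client, config, "gs://my-bucket/data.json", "my_dataset", "my_table")
```

Passing the client in rather than constructing it inside the helper keeps the step easy to exercise with a stub in unit tests. The bucket, dataset, and table names in the comment are placeholders.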
Other PRs:
- Code for Part 1: https://github.com/edx/edx-analytics-pipeline/pull/721
- Code for Part 2: https://github.com/edx/edx-analytics-pipeline/pull/722 (this PR)
- Config for Part 1: https://github.com/edx-ops/analytics-secure/pull/237
- Config for Part 2: https://github.com/edx-ops/analytics-secure/pull/238
Analytics Pipeline Pull Request
Make sure that the following steps are done before merging:
- [ ] If you have a migration, please contact the data engineering team before merging.
- [ ] Before merging, run the full acceptance test suite and provide the URL for the acceptance test run.
- [ ] A member of the data engineering team has approved the pull request.
This PR is now really old. You have a few more old PRs in flight: https://github.com/pulls?q=is%3Aopen+is%3Apr+archived%3Afalse+author%3Apwnage101+sort%3Aupdated-asc+org%3Aedx+org%3Aopenedx Do you want to keep them all open?