aws-serverless-data-lake-framework
#69
Correct and clean manifests and cloudfront examples
This pull request addresses some missing steps and corrections in two SDLF examples:
- Deequ and EMR steps using Step Functions (3)
- Manifest-based processing (6)
Here is an aggregated summary of changes:
- all changes are contained in the `./sdlf-utils/pipeline-examples/cloudfront` and `./sdlf-utils/pipeline-examples/manifests` folders
- included steps for deploying the new pipelines, or datasets from a local environment, if a pipeline does not exist for them yet
- added requirements for running deployments from a local development machine
- replaced handling of the transformation through an extra branch (EMR), since this would be harder to maintain in a multi-deployment environment (dev, test, prod); the transformation now lives in its own stageB repo
- changes to maintain the initial naming convention for transformations. EMR was breaking the convention (for example, the convention is `light_transform_<short-desc>.py` and CloudFront was using a different one); a minimal sketch of a conforming transform follows this list
- added all missing steps in the correct order
- segregated submitting data from deploying the Glue job or EMR scripts in the last step, so users can submit data multiple times without re-deploying anything (see the example script after this list)
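
As a point of reference for the naming-convention item above, a transform named after the `light_transform_<short-desc>.py` pattern would look roughly like the sketch below. The function name, signature, and return value here are illustrative assumptions only, not the exact SDLF stage contract; see the stageA/stageB repositories in the examples for the real interface.

```python
# light_transform_cloudfront.py
# Illustrative sketch only: the function name and signature below are
# assumptions, not the exact SDLF stage interface.

def transform_object(bucket: str, key: str, team: str, dataset: str) -> list:
    """Map an incoming raw-bucket object to the key(s) the next stage
    should process. A real transform would read, reshape, and re-write
    the object here."""
    processed_key = f"pre-stage/{team}/{dataset}/{key.split('/')[-1]}"
    return [processed_key]
```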
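Because data submission is now decoupled from deployment, re-submitting data can be as simple as copying sample files into the raw bucket. The snippet below is a hedged example of that idea; the bucket name and prefix are placeholders, not values taken from the examples.

```python
# submit_sample_data.py
# Example of re-submitting data without redeploying anything.
# RAW_BUCKET and PREFIX are placeholders for whatever your deployed
# stack uses in your environment.
import boto3

RAW_BUCKET = "my-sdlf-raw-bucket"      # placeholder raw bucket name
PREFIX = "engineering/cloudfront"      # placeholder team/dataset prefix

s3 = boto3.client("s3")

def submit(local_path: str, filename: str) -> None:
    """Upload one sample file to the raw bucket; safe to run repeatedly."""
    s3.upload_file(local_path, RAW_BUCKET, f"{PREFIX}/{filename}")

if __name__ == "__main__":
    submit("./data/sample_cloudfront_log.gz", "sample_cloudfront_log.gz")
```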
These changes were extensively tested by me and a team of 3 developers from PREDICTif Solutions.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.