[Python] Log dependencies installed in submission environment

Open riteshghorse opened this issue 2 years ago • 17 comments

Saves the submission environment dependencies and stages them. Logs them along with the runtime dependencies.
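
For readers unfamiliar with the idea, here is a minimal sketch of capturing the packages installed in the submission environment, writing them to a file in a staging directory, and logging them. It is not the code from this PR: the function names, the file name, and the use of `importlib.metadata` are all assumptions made for illustration.

```python
import importlib.metadata
import logging
import os
import tempfile

_LOGGER = logging.getLogger(__name__)

# Hypothetical file name, chosen for this sketch only.
SUBMISSION_DEPS_FILE = "submission_environment_dependencies.txt"


def collect_submission_dependencies():
    """Return sorted 'name==version' lines for packages visible to this interpreter."""
    lines = set()
    for dist in importlib.metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if name:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


def save_and_log_dependencies(staging_dir):
    """Write the dependency list into staging_dir, log it, and return the file path."""
    deps = collect_submission_dependencies()
    path = os.path.join(staging_dir, SUBMISSION_DEPS_FILE)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(deps))
    _LOGGER.info("Submission environment dependencies:\n%s", "\n".join(deps))
    return path


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # A temporary directory stands in for the real staging location.
    print(save_and_log_dependencies(tempfile.mkdtemp()))
```

Because a file written this way is always produced, it becomes one more staged artifact on every job, which is what the test failures discussed below trace back to.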

Fixes #28563


riteshghorse avatar Sep 20 '23 17:09 riteshghorse

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 38.47%. Comparing base (0bbf2c3) to head (80d6d18). Report is 487 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #28564      +/-   ##
==========================================
+ Coverage   38.23%   38.47%   +0.24%     
==========================================
  Files         696      698       +2     
  Lines      101878   102520     +642     
==========================================
+ Hits        38952    39449     +497     
- Misses      61309    61439     +130     
- Partials     1617     1632      +15     
Flag Coverage Δ
go 54.33% <ø> (+0.39%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Sep 20 '23 18:09 codecov[bot]

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

github-actions[bot] avatar Sep 20 '23 19:09 github-actions[bot]

Run Python_Integration PreCommit

riteshghorse avatar Sep 20 '23 20:09 riteshghorse

Some unit tests are failing because there is now an additional staging file that will always be present. I have a solution and will update the PR. Please defer review until then.

riteshghorse avatar Sep 21 '23 14:09 riteshghorse

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @jrmccluskey for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

github-actions[bot] avatar Sep 22 '23 15:09 github-actions[bot]

R: @chamikaramj could you comment on the external transform environment? The external_transform environment tests would fail if there is an additional staging file by default.

It complains about no artifact service when it tries to resolve that artifact.

riteshghorse avatar Sep 27 '23 17:09 riteshghorse

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

github-actions[bot] avatar Sep 27 '23 17:09 github-actions[bot]

The external_transform environment tests would fail if there is an additional staging file by default.

Can we stage job submission dependencies without including them in the runtime environment definition?

tvalentyn avatar Sep 27 '23 18:09 tvalentyn

Summary:

We have two issues to address:

  1. Resolve artifact comparison in the environment's __eq__ method
  2. Decide whether to log or skip dependency logging for external environments

riteshghorse avatar Oct 03 '23 20:10 riteshghorse

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Dec 03 '23 12:12 github-actions[bot]

Hey @riteshghorse , could we extract the commits that log the runtime dependencies and merge those before next release cut while submission dependencies portion is being sorted out? Thanks!

tvalentyn avatar Dec 11 '23 09:12 tvalentyn

Hey @riteshghorse , could we extract the commits that log the runtime dependencies and merge those before next release cut while submission dependencies portion is being sorted out? Thanks!

Sounds good

riteshghorse avatar Dec 11 '23 14:12 riteshghorse

Created #29705

riteshghorse avatar Dec 11 '23 14:12 riteshghorse

I've verified that a multi-language pipeline from Python to Java works - Job

So the problem is only with the test expansion service.

riteshghorse avatar Dec 20 '23 14:12 riteshghorse

I'll add a fake artifact_service method to the test expansion service; that should get us going here.

riteshghorse avatar Dec 20 '23 14:12 riteshghorse

R: @tvalentyn this is ready for review

Changes to note:

  1. Changed the artifact comparison logic to ignore the type payload field, since that contains unique hashes.
  2. The external transform test failure occurred because the ExpansionServiceServicer did not have an artifact service method, so I added one. Confirmed this by running a multi-language pipeline successfully - Job Link
  3. The staging logic stays in stager.py, since we ultimately call create_job_resources from python_sdk_dependencies(), which is invoked during environment creation.
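
To illustrate change 1, here is a rough sketch of a comparison that ignores the type payload. It uses a simplified stand-in dataclass rather than Beam's actual artifact proto, and the class name, helper names, and example values are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Artifact:
    """Simplified stand-in for an artifact record (not Beam's proto class)."""
    type_urn: str
    type_payload: bytes  # e.g. a staged path that embeds a unique hash
    role_urn: str
    role_payload: bytes  # e.g. the staged file name

    def comparison_key(self):
        # Deliberately leaves out type_payload, which differs between
        # otherwise identical artifacts.
        return (self.type_urn, self.role_urn, self.role_payload)


def artifacts_equal(a: Sequence[Artifact], b: Sequence[Artifact]) -> bool:
    """Compare two artifact lists element by element, ignoring type_payload."""
    return [x.comparison_key() for x in a] == [y.comparison_key() for y in b]


if __name__ == "__main__":
    # Illustrative values only.
    first = Artifact("type:file", b"/tmp/abc123/deps.txt", "role:staging_to", b"deps.txt")
    second = Artifact("type:file", b"/tmp/def456/deps.txt", "role:staging_to", b"deps.txt")
    print(first == second)                     # default equality sees the hash
    print(artifacts_equal([first], [second]))  # hash-insensitive comparison
```

With a comparison like this, two environments that stage the same files under different hashed paths can still be treated as equal.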

riteshghorse avatar Dec 20 '23 18:12 riteshghorse

R: @tvalentyn

Barring the lint failure, all tests pass.

riteshghorse avatar Feb 05 '24 18:02 riteshghorse

Is this ready to merge?

kennknowles avatar Mar 04 '24 16:03 kennknowles

I left one comment; after that it should be ready to merge.

tvalentyn avatar Mar 04 '24 16:03 tvalentyn

Done, I'll merge once the check passes

riteshghorse avatar Mar 07 '24 18:03 riteshghorse