terraform-aws-github-runner icon indicating copy to clipboard operation
terraform-aws-github-runner copied to clipboard

fix: #1218 Proxy implementation for Lambdas (basic UT for aws-sdk v2)

Open axel3rd opened this issue 4 years ago β€’ 27 comments

PR proposal for fixing #1218.


As first minor review, perhaps https_proxy could be renamed HTTPS_PROXY for coherence in environment variables naming (even if proxy variables is generally named in lowercase):

image


Tested in "real condition", working fine about EC2 launch:

image

PS: End2end use case is always in progress, templates/user-data.sh should be completely customized to manage proxy usage.

axel3rd avatar Oct 11 '21 09:10 axel3rd

As first minor review, perhaps https_proxy could be renamed HTTPS_PROXY for coherence in environment variables naming (even if proxy variables is generally named in lowercase):

Done

axel3rd avatar Oct 14 '21 08:10 axel3rd

why did you only add the proxy to the runner lambda? And not to the other two?

[comment updated after initially missing the sense of question]

@npalm : The webhook & runner-binaries-syncer are not attached to defined VPC defined by lambda_subnet_ids template parameter and can accessing directly to AWS APIs =>For:

  • webhook: Can access to SQS without any problem.
  • runner-binaries-syncer:
    • (here is supposition, I only do some real test with up2date actions-runner-linux-x64-xxx.tar.gz downloaded in lambdas-download/ directory from internal repository due to proxy constraints)
    • GHES URL seems not implemented (runner-binaries-syncer/variables.tf) => if last version retrieved from GitHub.com, S3 could be updated
    • But yes, the templates/user-data.sh should be customized with proxy usage in this case (like the SSM access in addition 😞)

axel3rd avatar Oct 21 '21 20:10 axel3rd

@npalm : Ready for review (basic unit tests added for AWS SDK v2 ; all AWS request API are not mocked).

axel3rd avatar Nov 30 '21 18:11 axel3rd

Thx, will check this week. I have no option to test the proxy.

npalm avatar Dec 07 '21 16:12 npalm

@ScottGuymer would you have a bit of time to check the PR?

npalm avatar Dec 09 '21 17:12 npalm

For understanding why a proxy, and linked with #1303, a WIP/draft of a future template example of this use case.

axel3rd avatar Dec 09 '21 19:12 axel3rd

@ScottGuymer: Sorry for my response delay, full time on a "logger problem" since ~10 days.

I would suggest adding in the configuration of the https_proxy to the user-data scripting by default so it does not need to be customised directly for all usages of this feature.

Not agree with that ; when EC2 VM are in "routed VPC" and GitHub GHES in this same VPC or on corporate network, the runner (on VM) should be in capacity to request GitHub directly without proxy.

If runner communicate to GitHub with proxy, some reverse proxy (to secure internet access) could be in game and complexify the process (for sample: authentication required, which doesn't support GitHub OAuth currently).

=> "Proxy by default" introduces more problem IMO (vs configured for requests which need it).

Is it not possible to configure your VPC in such a way that it is able to speak to the AWS API without the need for a proxy?

I would like ... but no way to deal on that with security team currently. For any resource on corporate network (on-premises or routed VPC):

  • (egress) all resource which would access to internet should use proxy
  • (ingress) all internet access to these resources should pass by reverse-proxy (with authentication in some case)

You can see some more extensive examples of mocking dependencies in [...]

Are the current unit tests on SSM & lambda (root dir) sufficient or should I investigate on Axios sample ?

axel3rd avatar Dec 20 '21 15:12 axel3rd

Set to draft even if all checks pass, because when tested really after last develop merge, problem on webhook runner (even if not part of these changes):

image

axel3rd avatar Dec 23 '21 11:12 axel3rd

Set to draft even if all checks pass, because when tested really after last develop merge, problem on webhook runner (even if not part of these changes)

Due to previous release webhook.zip lambda usage. Works fine with rebuild from develop.

axel3rd avatar Dec 23 '21 13:12 axel3rd

@ScottGuymer : Happy new year ! I think have answered to all request/question. Don't hesitate for any complements if it is not the case.

axel3rd avatar Jan 03 '22 09:01 axel3rd

@npalm / @ScottGuymer : Sorry to bother you or seems to be insistent, but what could be the following of this PR ? Currently its status is Changes requested but I'm not sure to have concrete updates todo on it. Let me know, many thanks.

axel3rd avatar Jan 14 '22 11:01 axel3rd

I have done a more in depth review with a number more comments. [...]. Just need to add a little more testing and completeness to it.

@ScottGuymer : Updates done. Let me now if it is acceptable 🀞.

axel3rd avatar Jan 18 '22 18:01 axel3rd

I am happy with the changes here.

I don't have any environment in which I can test this, and I guess you don't have one where there isn't a proxy either.

So, it's quite difficult to be totally sure there are no side effects.

@npalm what do you think?

ScottGuymer avatar Jan 26 '22 14:01 ScottGuymer

@npalm what do you think?

@npalm : I allow me to bump this request 😁.

axel3rd avatar Feb 09 '22 16:02 axel3rd

@axel3rd sorry for keep you waiting, thx for the notice. Put it on my list for checking

npalm avatar Feb 10 '22 08:02 npalm

SonarCloud Code Analysis Failing after 39s β€” Quality Gate failed

About:

axel3rd avatar Mar 02 '22 17:03 axel3rd

SonarCloud Code Analysis Failing after 39s β€” Quality Gate failed

About:

SonarQube was an experiment, it is disabled. I still hope to disable it later. So sonar can be ignored

npalm avatar Mar 24 '22 09:03 npalm

Tried the current PR with the default example, got the following error in scale up

"errorMessage": "ENOENT: no such file or directory, open '/var/task/bridge.js'",

npalm avatar Mar 24 '22 11:03 npalm

"errorMessage": "ENOENT: no such file or directory, open '/var/task/bridge.js'",

@npalm : Even proxy-agent dependency hasn't change, the ncc generation JS files result has change 😒.

contextify.js file has been replaced by:

ncc: Using [email protected] (local user-provided)
   11kB  dist/setup-node-sandbox.js
   12kB  dist/setup-sandbox.js
   27kB  dist/events.js
   28kB  dist/bridge.js
13114kB  dist/index.js

To be more sustainable, distribution ZIP file has been updated with ("include all JS files"):

"dist": "yarn build && cd dist && zip ../runners.zip *.js",

Commit c890c84

axel3rd avatar Mar 25 '22 17:03 axel3rd

"zip ../runners.zip *.js" / Why some other file than index.js

Using ncc build -d option:

Emitting /git/terraform-aws-github-runner/modules/runners/lambdas/runners/node_modules/vm2/lib/setup-node-sandbox.js for static use in module /git/terraform-aws-github-runner/modules/runners/lambdas/runners/node_modules/vm2/lib/nodevm.js

The vm2 dependency comes from pac-proxy-agent dependency:

$ npm list vm2
[email protected] /git/terraform-aws-github-runner/modules/runners/lambdas/runners
└─┬ [email protected]
  └─┬ [email protected]
    └─┬ [email protected]
      └─┬ [email protected]
        └── [email protected]

... probably not used because no proxy PAC is used here 😒; but not sure this transitive dependency can be removed with NPM.

axel3rd avatar Apr 01 '22 16:04 axel3rd

but not sure this transitive dependency can be removed with NPM.

Trying with ncc:

    "build": "ncc build src/lambda.ts -o dist -e pac-proxy-agent -e socks-proxy-agent",
    "dist": "yarn build && cd dist && zip ../runners.zip index.js",

The additional JS are not in ZIP, but error at lambda usage 😒:

"Runtime.ImportModuleError: Error: Cannot find module 'pac-proxy-agent'",

axel3rd avatar Apr 01 '22 17:04 axel3rd

axel3rd dismissed BinHavin0323’s stale review via 9fc0da3 16 seconds ago

😱 (it was just to update package.json for conflict resolution)

axel3rd avatar Apr 21 '22 16:04 axel3rd

Is there a chance to merge this PR ? I would start to work on #1303 and it is a prerequisite (not easy to maintain correctly both PR correctly separated)

axel3rd avatar Apr 29 '22 10:04 axel3rd

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jun 19 '22 02:06 github-actions[bot]

github-actions bot added the Stale label yesterday

Please let me know if any changes are required. It would be nice if this proxy support could be supported IMO.

axel3rd avatar Jun 20 '22 07:06 axel3rd

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Aug 26 '22 02:08 github-actions[bot]

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

This PR is always relevant IMO.

axel3rd avatar Aug 30 '22 08:08 axel3rd

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 01 '22 02:10 github-actions[bot]

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Nov 12 '22 02:11 github-actions[bot]

down stale bot, down

mcaulifn avatar Nov 12 '22 15:11 mcaulifn