
Speed up deployments by rendering and deploying concurrently

Open iosifnicolae2 opened this issue 2 years ago • 13 comments

Most of our time is spent waiting for Helm charts to be rendered and deployed one by one (this is especially problematic when each render or deploy task takes a few minutes). We could significantly reduce deployment time by processing the required skaffold files in parallel.

Expected behavior

We expect skaffold to render and deploy tasks in parallel.

Actual behavior

Each Helm chart is rendered and deployed one by one.

Information

  • Skaffold version: v2.1.0
  • Operating system: macOS 13.1 (22C65)
  • Installed via: Homebrew
  • Contents of skaffold.yaml:
apiVersion: skaffold/v3
kind: Config
requires:
  - path: ./app1/skaffold.yml
  - path: ./app2/skaffold.yml
  - path: ./app3/skaffold.yml
  - path: ./app4/skaffold.yml
  - path: ./app5/skaffold.yml

...

Steps to reproduce the behavior

  1. git clone https://github.com/iosifnicolae2/skaffold-bug
  2. cd skaffold-bug/skaffold
  3. skaffold run --verbosity debug (you might need to update build.artifacts.image in each of the skaffold.yml files)
  4. As you can see, the Render and Deploy tasks run sequentially.
Running command: [helm --kube-context cluster.local template app2 skaffold-bug/charts/app --post-renderer /opt/homebrew/bin/skaffold --set image.repository=bringes/app1 --set image.tag=XXXXX]  subtask=1 task=Render
Running command: [helm --kube-context cluster.local template app2 skaffold-bug/charts/app --post-renderer /opt/homebrew/bin/skaffold --set image.repository=bringes/app1 --set image.tag=XXXXX]  subtask=2 task=Render
Running command: [helm --kube-context cluster.local template app2 skaffold-bug/charts/app --post-renderer /opt/homebrew/bin/skaffold --set image.repository=bringes/app1 --set image.tag=XXXXX]  subtask=3 task=Render
...

Running command: [helm --kube-context cluster.local dep build skaffold-bug/charts/app]  subtask=1 task=Deploy
Running command: [helm --kube-context cluster.local dep build skaffold-bug/charts/app]  subtask=2 task=Deploy
Running command: [helm --kube-context cluster.local dep build skaffold-bug/charts/app]  subtask=3 task=Deploy
...

It would be ideal to have an option to collect certain requirements into a group that is deployed concurrently, something like:

apiVersion: skaffold/v3
kind: Config
requires:
  - concurrency: 3
    paths: 
     - ./app1/skaffold.yml
     - ./app2/skaffold.yml
     - ./app3/skaffold.yml
     - ./app4/skaffold.yml
  - path: ./app5/skaffold.yml

Note: it's important to still be able to deploy certain requirements sequentially (synchronously). Also, for rendering the Helm charts, maybe we could execute multiple tasks in parallel by default.
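To make the proposed semantics concrete, here is a minimal Python sketch of how such a `requires` list could be processed. It is only an illustration of the idea, not how Skaffold works today: the `concurrency`/`paths` group shape is the hypothetical one proposed above, and `deploy` is a made-up stand-in for rendering and deploying one required config.

```python
from concurrent.futures import ThreadPoolExecutor


def deploy(path):
    # Hypothetical stand-in for rendering and deploying one required config.
    return f"deployed {path}"


def run_requires(requires, deploy_fn=deploy):
    """Process entries in order; a group entry fans out to at most
    `concurrency` workers, a plain entry runs on its own."""
    results = []
    for entry in requires:
        if "paths" in entry:
            workers = max(1, entry.get("concurrency", 1))
            with ThreadPoolExecutor(max_workers=workers) as pool:
                # map() keeps input order; the whole group finishes
                # before the next entry is started.
                results.extend(pool.map(deploy_fn, entry["paths"]))
        else:
            results.append(deploy_fn(entry["path"]))
    return results


# Mirrors the proposed skaffold.yaml above (hypothetical schema).
requires = [
    {
        "concurrency": 3,
        "paths": [
            "./app1/skaffold.yml",
            "./app2/skaffold.yml",
            "./app3/skaffold.yml",
            "./app4/skaffold.yml",
        ],
    },
    {"path": "./app5/skaffold.yml"},
]
```

With this shape, app1 to app4 run with at most three at a time, and app5 only starts once the whole group has finished.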

Similar to https://github.com/GoogleContainerTools/skaffold/issues/5417

iosifnicolae2 avatar Jan 24 '23 19:01 iosifnicolae2

This feature would be greatly appreciated. We have a monorepo with 10+ workloads that we deploy together to aid local development and avoid deploying ephemeral environments into GKE. The lack of concurrency is a major headache: because deployments are sequential, we wait 10+ minutes for the initial deployment by Skaffold into Minikube and 5+ minutes for each iteration.

The majority of the team's feedback is that this is the blocker to using Skaffold full time.

davidgwps avatar Jan 25 '23 12:01 davidgwps

@ericzzzzzzz Is there a way to scope this out? Would like to see if I can do this. We have over 40 microservices and need to deploy quickly.

nickdapper avatar Feb 07 '23 16:02 nickdapper

Also related #5417

nickdapper avatar Feb 08 '23 14:02 nickdapper

Hi @nickdapper, glad to hear that you're interested in working on this! I'll post some context later!

ericzzzzzzz avatar Feb 08 '23 16:02 ericzzzzzzz

Hi @nickdapper, sorry for the late reply. We have an implementation for concurrent builds at https://github.com/GoogleContainerTools/skaffold/blob/88e1c1734ea0f1bd79a0f24783f9565057a0e46a/pkg/skaffold/build/scheduler.go#L146-L157, and I think the same approach can be applied to render and deploy as well. We would also like to have a flag similar to build-concurrency for render/deploy: https://github.com/GoogleContainerTools/skaffold/blob/c3d15a41a62fbb08096a971795a1a0b5367feaa2/cmd/skaffold/app/cmd/flags.go#L621-L628

One slightly challenging part is building dependency graphs for renderers and deployers. Unlike the build stage, where artifact dependency relationships are defined directly under the build.artifacts.dependencies stanza (https://github.com/GoogleContainerTools/skaffold/blob/7d346f54c50b37860d03fbebaf32b0180dfcf515/pkg/skaffold/schema/latest/config.go#L976-L977), we need to build the dependency graph based on https://github.com/GoogleContainerTools/skaffold/blob/7d346f54c50b37860d03fbebaf32b0180dfcf515/pkg/skaffold/schema/latest/config.go#L115-L122. Note that the name field is optional, so it is not reliable for building the graph.

After gathering the data we need, we can start dispatching render jobs and deploy jobs from https://github.com/GoogleContainerTools/skaffold/blob/88e1c1734ea0f1bd79a0f24783f9565057a0e46a/pkg/skaffold/render/renderer/render_mux.go#L61-L78 and https://github.com/GoogleContainerTools/skaffold/blob/88e1c1734ea0f1bd79a0f24783f9565057a0e46a/pkg/skaffold/deploy/deploy_mux.go#L120-L139
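As a rough illustration of the dispatch idea (not the actual Skaffold scheduler, which is written in Go), the sketch below runs jobs in dependency order with a concurrency cap, using Python's standard `graphlib` and `concurrent.futures`. It assumes every config has a unique key, which is exactly what the optional name field does not guarantee; the config names `db`, `cache`, `api`, and `web`, and the `job` callback, are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_graph(deps, job, concurrency):
    """Run job(name) for every node, never starting a node before all of
    its dependencies have finished. `deps` maps each name to the set of
    names it depends on. Returns names in completion order."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on cyclic requires
    finished = []
    pending = {}  # future -> node name
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        while sorter.is_active():
            # Submit everything whose dependencies are satisfied.
            for name in sorter.get_ready():
                pending[pool.submit(job, name)] = name
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise a failed render/deploy job
                name = pending.pop(future)
                finished.append(name)
                sorter.done(name)  # may unblock dependents
    return finished


# Hypothetical graph: two backing services, then two apps that need both.
deps = {
    "db": set(),
    "cache": set(),
    "api": {"db", "cache"},
    "web": {"db", "cache"},
}
```

Here `concurrency` plays the role of a build-concurrency-style flag: it caps how many jobs run at once, while the graph alone decides what is allowed to start.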

Aside from the work I mentioned, we may need a little more investigation, so I think it's good to start with a design doc. If you feel this may take too much of your time, we can assign it to a member of our team, but it won't be a top priority in the near-future milestones.

ericzzzzzzz avatar Feb 13 '23 11:02 ericzzzzzzz

@nickdapper definitely let us know how we can help support you with any work here. The Skaffold team would be super excited to help with any work on this, thanks!

aaron-prindle avatar Feb 13 '23 19:02 aaron-prindle

@ericzzzzzzz Thanks for the context here. Would we apply this to delete as well?

I agree about the dependency graph. I'm not sure a top level concurrency flag will be enough though.

Example:

apiVersion: skaffold/v3
kind: Config
metadata:
  name: my app
requires:
  - path: deps/postgres/skaffold.yaml
  - path: deps/redis/skaffold.yaml
  - path: cmd/service-1/skaffold.yaml
  - path: cmd/service-2/skaffold.yaml 

In this scenario, I would want a way to define that the deps are deployed in parallel, but that Skaffold should wait until they are up and running before proceeding to service-1 and service-2, which could also be deployed in parallel. This means we need a concurrency setting in the skaffold.yaml file for the dependency graph to use.

Here's an example of what I'm thinking. Not sure if this should be defined under the profile or somewhere else. I didn't put it inside deploy because this could apply to delete as well.

Here concurrency would be defined as: 0 or -1 == max, 1 == sequential, >1 == run up to that limit while respecting the graph.

apiVersion: skaffold/v3
kind: Config
metadata:
  name: my-app
requires:
  - path: deps/skaffold.yaml
  - path: cmd/skaffold.yaml    
profiles:
  - name: local
    concurrency: 1

--- 
apiVersion: skaffold/v3
kind: Config
metadata:
  name: deps
requires:
  - path: postgres/skaffold.yaml
  - path: redis/skaffold.yaml
profiles:
  - name: local
    concurrency: 0

--- 
apiVersion: skaffold/v3
kind: Config
metadata:
  name: services
requires:
  - path: service-1/skaffold.yaml
  - path: service-2/skaffold.yaml
profiles:
  - name: local
    concurrency: 0

This would mean: deploy deps and services sequentially. For deps, deploy all in parallel. For services, deploy all in parallel.
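A small Python sketch of these semantics, in case it helps the discussion. It is an assumption-laden illustration only: `resolve_workers` and `deploy_tree` are hypothetical names, the concurrency value is placed directly on each config rather than under a profile to keep the example short, and the leaf "deploy" just records the config name.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def resolve_workers(concurrency, job_count):
    """0 or -1 means no limit, 1 means sequential, >1 caps parallelism."""
    if concurrency < -1:
        raise ValueError(f"invalid concurrency: {concurrency}")
    if concurrency in (0, -1):
        return max(1, job_count)
    return min(concurrency, max(1, job_count))


def deploy_tree(config, log, lock):
    """Deploy a config; children run with the parent's concurrency setting."""
    children = config.get("requires", [])
    if not children:
        with lock:
            log.append(config["name"])  # stand-in for the real deploy
        return
    workers = resolve_workers(config.get("concurrency", 1), len(children))
    # Each level gets its own pool, so nested groups cannot starve each other.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all children and re-raises any failure.
        list(pool.map(lambda child: deploy_tree(child, log, lock), children))


# Mirrors the three configs above: top level sequential, groups parallel.
tree = {
    "name": "my-app",
    "concurrency": 1,
    "requires": [
        {"name": "deps", "concurrency": 0,
         "requires": [{"name": "postgres"}, {"name": "redis"}]},
        {"name": "services", "concurrency": 0,
         "requires": [{"name": "service-1"}, {"name": "service-2"}]},
    ],
}
```

Because the top level resolves to a single worker, the whole deps group completes before the services group starts, while the members of each group run side by side.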

Thoughts on this approach?

nickdapper avatar Feb 17 '23 15:02 nickdapper

@nickdapper your solution might also work for us.

iosifnicolae2 avatar Feb 17 '23 16:02 iosifnicolae2

I think the concurrency field needs to be set at build, render and deployer stage individually. So something like:

apiVersion: skaffold/v3
kind: Config
build:
  ~~
  concurrency: 0
manifests:
  ~~
  concurrency: 1
deploy:
  ~~
  concurrency: 0

Also, for build there's an obvious number of artifacts, so setting an integer concurrency level makes more sense there than in render or deploy, where it might not be apparent exactly how many resources are being worked on or what concurrency should control. In that case, should it just be something like:

apiVersion: skaffold/v3
kind: Config
build:
  ~~
  concurrency: 0
manifests:
  ~~
  runParallel: true
deploy:
  ~~
  runParallel: false

gsquared94 avatar Mar 07 '23 20:03 gsquared94

Any idea if this functionality will be implemented?

iosifnicolae2 avatar May 23 '23 12:05 iosifnicolae2

Same issue here, would love to see this feature implemented

altais avatar Sep 19 '23 21:09 altais

+1

matake01 avatar Feb 14 '24 16:02 matake01