Shorten the waiting time of GitHub CI

Open Nick-0314 opened this issue 3 years ago • 9 comments

Feature request

Shorten the waiting time of GitHub CI

Use case

Nick-0314 avatar Oct 19 '22 06:10 Nick-0314

@aktech Hello, we have fully switched to cirun.io and retired our original runners, and we ran into some problems today. The current response time of about one and a half minutes for GitHub CI feels a bit slow. Is there a way to speed it up, for example by preinstalling software in the AMI? We also have a problem where the runner requested by one CI job is taken by another CI job. Is there a good solution?

Nick-0314 avatar Oct 19 '22 06:10 Nick-0314

We think the current one and a half minute response of github CI is a bit slow, is there a way to speed it up? For example, do some preset software in ami and so on?

Hey @mytting, yes, you can create a custom AMI with some software already installed, such as Docker. That would speed up your overall CI time. The provisioning time wouldn't be affected much, as it mainly consists of calling the AWS API to spin up a VM and installing the GitHub Actions runner. The installation itself takes less than 15-20 seconds; most of the time is spent getting a VM from AWS. I can take a look to see whether there are any bottlenecks that can be improved.
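
As a rough sketch of the custom AMI suggestion, a `.cirun.yml` entry pointing at a prebuilt image might resemble the fragment below. The key names reflect the Cirun YAML reference as best understood here and should be verified against it; the runner name, instance type, and AMI ID are placeholders, not values from this thread.

```yaml
# .cirun.yml (sketch; verify key names against the Cirun YAML reference)
runners:
  - name: "aws-runner"                      # hypothetical runner name
    cloud: "aws"
    instance_type: "c5.9xlarge"             # placeholder instance type
    machine_image: "ami-xxxxxxxxxxxxxxxxx"  # placeholder: custom AMI with Docker etc. preinstalled
    preemptible: true                       # assumption: request spot capacity
    labels:
      - "cirun-aws-amd64-32c"               # label used in runs-on in this thread
```

Baking tools into the image only shortens the steps that run after the VM is up; it does not change how long AWS takes to hand over the VM.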

In addition, we have a problem that the runner requested by CI is taken away by other CI. Is there a good solution?

What do you mean by other CI? Do you mean other jobs? Runners are picked up by GitHub Actions workflows based on runner labels, which are set by:

runs-on: cirun-aws-amd64-32c

If you want them to be unique, I can look into implementing spinning up runners by run_id. Then you could do something like:

# Not implemented yet
runs-on: "cirun-aws-amd64-32c--${{ github.run_id }}"

Will that help?

aktech avatar Oct 19 '22 08:10 aktech

"cirun-aws-amd64-32c--${{ github.run_id }}"

Yes, I want this effect. Is there anything you need to do?

@aktech

Nick-0314 avatar Oct 19 '22 08:10 Nick-0314

When will that be possible?

Nick-0314 avatar Oct 19 '22 08:10 Nick-0314

Yes, I want this effect. Is there anything you need to do?

Yes, I need to implement it. You should have it within a few days (a week at most). I'll implement it and share the documentation link here.

aktech avatar Oct 19 '22 08:10 aktech

Yes, I want this effect. Is there anything you need to do?

Yes, I need to implement it. You should have it within a few days (a week at most). I'll implement it and share the documentation link here.

OK, I'll wait for the good news. Does GitHub Actions support this syntax?

Nick-0314 avatar Oct 19 '22 08:10 Nick-0314

Another restriction is that the runner label must begin with cirun, which does not seem to be mentioned in the documentation. @aktech

Nick-0314 avatar Oct 19 '22 08:10 Nick-0314

Yes, I want this effect. Is there anything you need to do?

Yes, I need to implement it. You should have it within a few days (a week at most). I'll implement it and share the documentation link here.

ok, wait for the good news. Does github action support this syntax?

Oh, I just tried it. github supports this syntax

Nick-0314 avatar Oct 19 '22 08:10 Nick-0314

Another restriction is that the runner label must begin with cirun, which does not seem to be mentioned in the documentation.

Thanks for pointing that out. I'll update the documentation; apologies for the inconvenience. Yes, the prefix is important because it makes it easier for me to filter the webhook events for which a runner needs to be created. Otherwise it would have been tricky.

ok, wait for the good news. Does github action support this syntax? Oh, I just tried it. github supports this syntax

Yep, I tried it as well. You will hear from me soon. :)

aktech avatar Oct 19 '22 08:10 aktech

It seems that defining multiple regions and multiple instance specifications for spot instances in the .cirun file does not work. The spot request often stays in the open state with 'no Spot capacity available', yet cirun treats the creation as successful. @aktech

Nick-0314 avatar Oct 20 '22 08:10 Nick-0314

Yes, that's an outstanding bug. It will be fixed in the next release.

aktech avatar Oct 20 '22 08:10 aktech

Does cirun support Google Cloud? AWS spot instances are billed by the hour with a one-hour minimum, and our CI usually runs for about 10 minutes. As I understand it, GCP bills by the minute. @aktech

Nick-0314 avatar Oct 21 '22 07:10 Nick-0314

Does cirun support google cloud?

Yes, it does.

aws spot instances are billed by the hour, one hour minimum,

Are you sure? To me it seems like you're charged for the seconds used: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/billing-for-interrupted-spot-instances.html

aktech avatar Oct 21 '22 08:10 aktech

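Does cirun support google cloud?

Yes, it does.

aws spot instances are billed by the hour, one hour minimum,

Are you sure? To me it seems like you're charged for the seconds used: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/billing-for-interrupted-spot-instances.html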

I'll check with the AWS sales staff. I think the billing is by the hour.

Nick-0314 avatar Oct 21 '22 08:10 Nick-0314

I'll check with the sales staff. I think the bill is by the hour

Quite strange, let me know if you hear from them.

aktech avatar Oct 21 '22 09:10 aktech

Quite strange, let me know if you hear from them.

Just confirmed that billing is by the second. I was misled by some AWS pages.

Nick-0314 avatar Oct 21 '22 09:10 Nick-0314

@aktech Hi, has there been any progress recently?

Nick-0314 avatar Oct 27 '22 03:10 Nick-0314

Hey @mytting, not yet. I'm travelling at the moment. Expect it by the end of this week.

aktech avatar Oct 27 '22 06:10 aktech

ok

Nick-0314 avatar Oct 27 '22 06:10 Nick-0314

Unique runner labels are available now: https://docs.cirun.io/reference/unique-runner-labels
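
For readers following along, a minimal workflow sketch using a unique label might look like the fragment below. The label suffix `--${{ github.run_id }}` is the form proposed earlier in this thread; the workflow file name, job name, and steps are hypothetical, and the linked documentation remains the authority on the exact syntax.

```yaml
# .github/workflows/ci.yml (hypothetical file and job names)
name: ci
on: [push]

jobs:
  build:
    # Base label from this thread plus the run ID suffix, so the runner
    # created for this run is not picked up by jobs from other runs.
    runs-on: "cirun-aws-amd64-32c--${{ github.run_id }}"
    steps:
      - uses: actions/checkout@v3
      - run: echo "running on a dedicated runner"
```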

aktech avatar Oct 28 '22 17:10 aktech

Recently, some open source projects asked us how to handle CI on AWS spot instances, and we recommended cirun. We will be promoting cirun in the near future. It has solved many of our pain points. It's great.

Nick-0314 avatar Oct 28 '22 23:10 Nick-0314

@aktech Hi, I have raised an issue against the GitHub documentation, hoping that excellent projects like Cirun can be added there so that users can avoid detours: https://github.com/github/docs/issues/21697

In addition, job startup now takes a fairly constant 90 seconds, which may be a bit long. Is there a template for the user-data? We can ask AWS people to see whether there is any way to optimize it.

Nick-0314 avatar Nov 01 '22 01:11 Nick-0314

Hey @mytting, thanks a lot for that, I really appreciate it. There isn't a specific template for it, but the script is fairly simple: it pulls the GitHub Actions runner software, installs it, and creates a user, which doesn't take much time. For example, here are the logs of one of the random runners on this repo; it took about 15 seconds for the user data script to run.

My suspicion is AWS: the time they take to hand over a VM is quite slow. Let me know if you have more questions. I am happy to jump on a call with you and AWS to see where the bottlenecks are.

aktech avatar Nov 01 '22 02:11 aktech

OK, I have a general understanding. Is there a general script, and how is it passed to EC2? I can have AWS people debug it.

Nick-0314 avatar Nov 01 '22 07:11 Nick-0314

Hey @mytting, thanks a lot for that, I really appreciate it. There isn't a specific template for it, but the script is fairly simple: it pulls the GitHub Actions runner software, installs it, and creates a user, which doesn't take much time. For example, here are the logs of one of the random runners on this repo; it took about 15 seconds for the user data script to run.

My suspicion is AWS: the time they take to hand over a VM is quite slow. Let me know if you have more questions. I am happy to jump on a call with you and AWS to see where the bottlenecks are.

It seems that after the runner registers and enters the Idle state, it switches to the Offline state and then to the Active state.

Nick-0314 avatar Nov 01 '22 08:11 Nick-0314

I tried it. It took about 25 seconds from creating the EC2 instance to being able to SSH in. Could downloading the runner package over the network be the cause of the delay?

Nick-0314 avatar Nov 01 '22 08:11 Nick-0314

ok ,I have a general understanding, is there a general script? How to pass it to EC2, I can let AWS people debug it

I can try to create one for you.

It seems that after the runner is registered to become the idle state, it will switch to the offline state, and then it will become the Active state

Ah, interesting.

I tried it. It took about 25 seconds from creating EC2 to being able to ssh. Is there a network reason for downloading the product?

Did you create it via API? Can you share the script? If that's the case then it might be something on our end. I am happy to take a look, later this week.

aktech avatar Nov 01 '22 08:11 aktech

Manually created...

Nick-0314 avatar Nov 01 '22 08:11 Nick-0314

I mean I created it manually and didn't pass in any user-data.

Nick-0314 avatar Nov 01 '22 08:11 Nick-0314

I mean I created it manually and didn't pass in user-data?

Ah, ok. I'll debug ours and will let you know.

aktech avatar Nov 01 '22 08:11 aktech