improve workflow to perform all operations on self-hosted runner

Open hvgazula opened this issue 1 year ago • 7 comments

A partial workflow is at https://github.com/hvgazula/trained-models/actions/runs/6806826215. It covers everything except pushing the branch and the docker image after successful testing.

hvgazula avatar Nov 09 '23 04:11 hvgazula

datalad push ... in a new job keeps failing. For now, leaving it as a step in the same job that configures git seems to work fine.

hvgazula avatar Nov 10 '23 04:11 hvgazula

@hvgazula I just checked the workflow. Is it this one? https://github.com/hvgazula/trained-models/actions/runs/6813853293/job/18529675234

gaiborjosue avatar Nov 10 '23 05:11 gaiborjosue

Sure, you can use that by removing the last job and moving the steps into the previous one.

hvgazula avatar Nov 10 '23 11:11 hvgazula

Here's the final workflow (with a test Dockerfile) that works end-to-end on the self-hosted EC2 runner and pushes the docker image, weights, and inference scripts only if the docker and singularity tests run successfully: https://github.com/hvgazula/trained-models/actions/runs/6831844963 (edited to add the link).

hvgazula avatar Nov 11 '23 03:11 hvgazula

And here's the final branch layout: https://github.com/hvgazula/trained-models/tree/issue-7

hvgazula avatar Nov 11 '23 03:11 hvgazula

Hello @hvgazula, really nice. Do you think it would be best to separate it into multiple jobs? Or is it best to keep everything inside one job (a.k.a. "create branch")?

gaiborjosue avatar Nov 11 '23 03:11 gaiborjosue

I'd leave it as is. No more refactoring at this point. If you are at all interested in separating it into multiple jobs, try fixing this workflow first. That issue is partly why I ended up with all the steps in one job, sandwiched between starting and stopping the EC2 runner. Good luck!

hvgazula avatar Nov 11 '23 04:11 hvgazula