gcr-catalogs
automate integration tests that require NERSC resources
Now that GCRCatalogs has real users (i.e., people outside the DESCQA team) who have started making plots with these catalogs, it is important to test updates to readers and catalogs before releasing them.
While there are some very basic unit tests in place, most of the major issues are likely to come up at the integration stage (e.g., when interfacing with DESCQA or with actual catalog data). It would be nice if we could automatically trigger integration tests that require NERSC resources when a PR is submitted, and display the test results in GitHub.
During a brief conversation, @jchiang87 suggested that what we need is likely feasible. What is left to do is to figure out how to make it happen.
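One possible way to display test results in GitHub is the commit status endpoint of the GitHub REST API (`POST /repos/{owner}/{repo}/statuses/{sha}`). The sketch below only builds the request and does not send it. The function name, the `context` label, and the idea that the NERSC job would call it are assumptions for illustration, not something decided in this thread.

```python
import json
import urllib.request

# States accepted by the GitHub commit status endpoint.
VALID_STATES = {"error", "failure", "pending", "success"}


def build_status_request(owner, repo, sha, state, token,
                         target_url="", description="",
                         context="nersc-integration-tests"):
    """Build (but do not send) a commit status request.

    The default ``context`` label is a hypothetical name.
    """
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    payload = {"state": state, "context": context}
    if target_url:
        payload["target_url"] = target_url  # e.g. link to the Jenkins console
    if description:
        payload["description"] = description
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request would be a call to `urllib.request.urlopen(request)` with a token that has permission to write commit statuses on the repository.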
- [ ] @yymao @heather999 @tomuram figure out how to submit a job to run at NERSC from SLAC
Further conversation with @jchiang87 and @heather999 pointed out the need for a special NERSC account that can be used to ssh into NERSC and submit a batch job.
@tony-johnson and @brianv0: Jim suggested that we submit these jobs from SLAC via Jenkins using a special NERSC account. We're going to need your expertise :)
Capturing advice from Tony (via email) on how to do this:
The easiest way to get this to work is just to run a Jenkins
agent at NERSC which creates a connection back to the Jenkins
server at SLAC and asks for work. This is how we run most of the
agents at SLAC. If you do this, I would suggest using some
account like desc to run the agent. If you run it somewhere a
cron job can be used to ensure it is running (e.g. corigrid), it
is easy to set up.
A second way would be to set up an on-demand agent which is
started from SLAC when it is needed, probably by storing NERSC
ssh credentials in the SLAC Jenkins server. I think this would
have to use a real user's ssh credentials; I don't think it would
work with a service account like desc.
If you want to go with the first option, the steps would be:
a) Create a Jenkins "job" at SLAC.
b) Create a script for running the agent at NERSC under desc or
similar account. I can create an example script if this is what
you would like to do.
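As an illustration of the cron-based approach in the first option, here is a minimal sketch of a watchdog that starts an agent only if it is not already running. It is not the script Tony offered to write. The PID-file mechanism, the function names, and the way the agent command is passed in are all assumptions.

```python
import os
import subprocess


def agent_is_running(pid_file):
    """Return True if pid_file names a process that still exists."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # no PID file yet, or unreadable contents
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def ensure_agent(pid_file, command):
    """Start `command` unless the agent is already running.

    Returns the new PID, or None if an agent was already running.
    `command` is the (site-specific) agent launch command as a list.
    """
    if agent_is_running(pid_file):
        return None
    proc = subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the cron job's session
    )
    with open(pid_file, "w") as f:
        f.write(str(proc.pid))
    return proc.pid
```

A cron entry under the service account could then run a small wrapper that calls `ensure_agent(pid_file, command)` every few minutes, where `command` is whatever launches the Jenkins agent at the site.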
For the first option, Tony provided basic setup scripts in `~desc/jenkins` (at NERSC), but we need the Jenkins agent set up to test them.
@yymao Brian helped me set up the Jenkins job on the SLAC server, and I've made a cron job that ensures the Jenkins agent at NERSC is running under the `desc` account. We triggered Jenkins to run a `hello, world` script at NERSC:
https://srs.slac.stanford.edu/hudson/job/LSST-DESC/job/nersc-helloworld/lastBuild/console
So we can start to discuss which GitHub events to trigger on and where to put the scripts that you want executed for the integration tests.
That's great! Thanks @jchiang87! Several questions:
- Where do we put together the script to run, NERSC or SLAC? In your example I did not see a script on the NERSC side.
- Which environment variables will be available in the script?
- I assume we also need to add something to the `.travis.yml` to trigger the Jenkins build? Or can Jenkins monitor GitHub on its own?
The script can live anywhere at NERSC that the `desc` account can access (so any `lsst` group area). You can set any environment you need within that script. The Jenkins configuration is accessible through the Jenkins interface. I should be able to grant you access to add and configure jobs in the LSSTDESC area via your SLAC userid.
@jchiang87 OK. For the NERSC-side script, the first thing it needs to do is clone the targeted PR or commit. So how can the NERSC-side script know which PR/commit to clone? I imagine that must come through Jenkins?
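One way this could work, sketched under assumptions: Jenkins exposes build parameters to the job as environment variables, so the NERSC-side script could read the PR number or commit from its environment. The variable names `PR_NUMBER` and `COMMIT_SHA` below are hypothetical, and the function only builds the git commands instead of running them. The `pull/<N>/head` ref is the one GitHub provides for each pull request.

```python
import os


def checkout_commands(repo_url, workdir, env=None):
    """Build the git commands that check out the PR or commit named in env.

    PR_NUMBER and COMMIT_SHA are hypothetical variable names that a
    Jenkins job would have to be configured to pass along.
    """
    if env is None:
        env = os.environ
    pr = env.get("PR_NUMBER", "").strip()
    sha = env.get("COMMIT_SHA", "").strip()

    cmds = [["git", "clone", repo_url, workdir]]
    target = sha
    if pr:
        if not pr.isdigit():
            raise ValueError(f"PR_NUMBER must be an integer, got {pr!r}")
        # GitHub publishes every pull request under refs/pull/<N>/head.
        cmds.append(["git", "-C", workdir, "fetch", "origin", f"pull/{pr}/head"])
        target = sha or "FETCH_HEAD"
    if target:
        cmds.append(["git", "-C", workdir, "checkout", target])
    return cmds
```

Each returned command could be passed to `subprocess.run(cmd, check=True)` in order. With neither variable set, the script would simply test the default branch.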
I think this is now possible with Heather being SPIN certified? I think this is not super high priority but we should revisit it some time.
I think this is made even more feasible by NERSC's GitLab CI capability. Is this of any interest?