gcr-catalogs

automate integration tests that require NERSC resources

Open yymao opened this issue 7 years ago • 10 comments

Now that GCRCatalogs has real users (i.e., people not on the DESCQA team) and people have started making plots with these catalogs, it is important to test updates to readers and catalogs before releasing them.

While there are some very basic unit tests in place, most of the major issues are likely to come up at the integration stage (e.g., when interfacing with DESCQA or with actual catalog data). It would be nice if we could automatically trigger integration tests that require NERSC resources when a PR is submitted, and display the test results on GitHub.

During a brief conversation, @jchiang87 suggested that what we need is likely feasible. What is left to do is to figure out how to make it happen.
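For the "display the test results" half, one option is GitHub's commit status API, which accepts a state, a link, and a short description for a given commit. The following is a minimal sketch that only builds the request and sends nothing. The function name, the `context` label, and the description text are illustrative assumptions, not part of any existing setup.

```python
import json

# States accepted by GitHub's commit status API.
VALID_STATES = {"error", "failure", "pending", "success"}


def build_status_request(owner, repo, sha, state, target_url, context):
    """Return (url, json_body) for a commit status update; nothing is sent."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    url = f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}"
    payload = {
        "state": state,
        "target_url": target_url,  # e.g. a link to the CI console output
        "description": f"NERSC integration tests: {state}",  # illustrative text
        "context": context,  # hypothetical label shown next to the check
    }
    return url, json.dumps(payload, sort_keys=True)
```

The returned URL and body would then be sent as an authenticated POST by whatever service runs the tests. How that token is stored is left open here.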

yymao avatar Dec 08 '17 07:12 yymao

  • [ ] @yymao @heather999 @tomuram figure out how to submit a job to run at NERSC from SLAC

yymao avatar Dec 08 '17 15:12 yymao

Further conversation with @jchiang87 and @heather999 pointed out the need for a special NERSC account that can be used to ssh into NERSC and submit a batch job.
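As a rough illustration of "ssh in and submit a batch job", the sketch below builds the command line without running it. It assumes the batch system is Slurm (so submission is done with `sbatch`) and that key-based, non-interactive login is available for the account. The user, host, and script path are placeholders.

```python
import shlex


def build_submit_command(user, host, batch_script, identity_file=None):
    """Return the argv list for submitting batch_script over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which is what an unattended job needs.
    argv = ["ssh", "-o", "BatchMode=yes"]
    if identity_file is not None:
        argv += ["-i", identity_file]
    # --parsable asks sbatch to print only the job ID (assumes Slurm).
    remote = "sbatch --parsable " + shlex.quote(batch_script)
    argv += [f"{user}@{host}", remote]
    return argv
```

The list can be handed to `subprocess.run(argv, capture_output=True, text=True, check=True)`. With `--parsable`, the captured output should contain the job ID, which a later step could use to poll for completion.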

yymao avatar Dec 08 '17 16:12 yymao

@tony-johnson and @brianv0 Jim suggested that we'd want to submit these jobs from SLAC via Jenkins using a special NERSC account. We're going to need your expertise :)

heather999 avatar Dec 08 '17 16:12 heather999

Capturing advice from Tony (via email) on how to do this:

The easiest way to get this to work is just to run a jenkins
agent at NERSC which creates a connection back to the jenkins
server at SLAC and asks for work. This is how we run most of the
agents at SLAC. If you do this, I would suggest using an
account like desc to run the agent. If you run it somewhere a
cron job can be used to ensure it is running (e.g. corigrid),
it is easy to set up.

A second way would be to set up an on-demand agent which is
started from SLAC, when it is needed, probably by storing NERSC
ssh credentials in the SLAC jenkins server. I think this would
have to use a real user's ssh credentials; I don't think it would
work with a service account like desc.

If you want to go with the first option, the steps would be:

a) Create a Jenkins "job" at SLAC.
b) Create a script for running the agent at NERSC under desc or
similar account. I can create an example script if this is what
you would like to do.

For the first option, Tony provided basic setup scripts in

~desc/jenkins (at NERSC)

But we need the Jenkins agent set up to test them.
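The "cron job ensures the agent is running" idea from the first option can be sketched as a small watchdog that cron invokes periodically. This is not the content of the scripts in ~desc/jenkins. It is a minimal sketch that assumes a POSIX system, a PID file whose location you choose, and a start command (`start_cmd`) that you supply, all of which are hypothetical here.

```python
import os
import subprocess
from pathlib import Path


def pid_is_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def agent_is_running(pid_file):
    """Return True if pid_file names a live process."""
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return False
    return pid_is_alive(pid)


def ensure_agent(pid_file, start_cmd):
    """Start the agent if it is not running; return True if it was started."""
    if agent_is_running(pid_file):
        return False
    proc = subprocess.Popen(
        start_cmd,  # hypothetical: whatever command launches the agent
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach so the agent outlives the cron run
    )
    Path(pid_file).write_text(str(proc.pid))
    return True
```

A cron entry would call `ensure_agent(...)` every few minutes. A PID file is used rather than searching the process table so that the check does not depend on how the agent's command line looks.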

jchiang87 avatar Jan 30 '18 17:01 jchiang87

@yymao Brian helped me set up the Jenkins job on the SLAC server, and I've made a cron job that ensures the Jenkins agent at NERSC is running under the desc account. We triggered Jenkins to run a hello-world script at NERSC ( https://srs.slac.stanford.edu/hudson/job/LSST-DESC/job/nersc-helloworld/lastBuild/console ), so we can start to discuss which GitHub events to trigger on and where to put the scripts that you want executed for the integration tests.

jchiang87 avatar Feb 10 '18 00:02 jchiang87

That's great! Thanks @jchiang87! Several questions:

  • Where do we put together the script to run, NERSC or SLAC? In your example I did not see a script on the NERSC side.
  • What environment variables will we have in the script?
  • I assume we also need to add something to the .travis.yml to trigger the Jenkins build? Or can Jenkins monitor GitHub on its own?

yymao avatar Feb 10 '18 00:02 yymao

The script can live anywhere at NERSC that the desc account can access (so any lsst group area). You can set any environment you need within that script. The Jenkins configuration is accessible through the Jenkins interface. I should be able to grant you access to add and configure jobs in the LSSTDESC area via your SLAC userid.

jchiang87 avatar Feb 10 '18 00:02 jchiang87

@jchiang87 OK. For the NERSC-side script, the first thing it needs to do is clone the targeted PR or commit. So how can the NERSC-side script know which PR/commit to clone? I imagine that must come through Jenkins?
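One possible answer, sketched under assumptions: Jenkins exposes the parameters of a parameterized build to the job as environment variables, and GitHub publishes each pull request's head at `refs/pull/<number>/head`. The variable names `PR_NUMBER` and `COMMIT_SHA` below are hypothetical; the actual job configuration would define them. The function only builds the git commands and does not run them.

```python
import re


def build_checkout_commands(env, repo_url, workdir):
    """Return git argv lists that check out the PR or commit named in env."""
    pr = env.get("PR_NUMBER", "").strip()    # hypothetical parameter name
    sha = env.get("COMMIT_SHA", "").strip()  # hypothetical parameter name
    commands = [["git", "clone", "--no-checkout", repo_url, workdir]]
    if pr:
        if not re.fullmatch(r"[0-9]+", pr):
            raise ValueError(f"PR_NUMBER is not a number: {pr!r}")
        # GitHub exposes the head of every pull request under refs/pull/.
        commands.append(
            ["git", "-C", workdir, "fetch", "origin", f"refs/pull/{pr}/head"]
        )
        commands.append(["git", "-C", workdir, "checkout", "--detach", "FETCH_HEAD"])
    elif sha:
        if not re.fullmatch(r"[0-9a-fA-F]{7,40}", sha):
            raise ValueError(f"COMMIT_SHA does not look like a hash: {sha!r}")
        # Works only if the commit is reachable from a branch that was cloned.
        commands.append(["git", "-C", workdir, "checkout", "--detach", sha])
    else:
        raise ValueError("neither PR_NUMBER nor COMMIT_SHA is set")
    return commands
```

The NERSC-side script could call this with `os.environ` and run each command with `subprocess.run(cmd, check=True)`. Validating the values first keeps a malformed parameter from reaching git.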

yymao avatar Feb 10 '18 00:02 yymao

I think this is now possible with Heather being SPIN certified? I think this is not super high priority but we should revisit it some time.

yymao avatar Jan 25 '20 00:01 yymao

I think this is further made possible due to NERSC's gitlab CI capability. Is this of any interest?

heather999 avatar Oct 20 '21 21:10 heather999