girder_worker icon indicating copy to clipboard operation
girder_worker copied to clipboard

Document job event binding and output reference kwargs.

Open subdavis opened this issue 4 years ago • 7 comments

I've now used girder worker on 3 projects, and I've wanted this feature on all of them.

What do people think about a new type of docker result hook that is a simple webhook? It would look something like this:

docker_run.delay(
    'my_docker_image:latest', container_args=[
        ...
    ],
    girder_result_hooks=[
        GirderUploadVolumePathJobArtifact(...)
    ],
    girder_status_hooks=[
        GirderSimpleWebhook('https://domain.com/api/v1/hook')
    ],
)

I don't think the return value of the webhook should impact the status of the job. If a job completes successfully and the webhook gives a 500, that should be totally fine, just like how CI works. Most of the time this hook would be to an endpoint in a girder plugin, but it could just as easily be an external system.

I actually think "ignore failures" is what people imagine when they think about hooks, so the kwarg girder_result_hook is IMO a bit misleading. Those aren't hooks, but transforms that are part of the job and affect the outcome.

We can't rename girder_result_hooks, so these two kwargs would inevitably cause confusion.

subdavis avatar Feb 19 '20 14:02 subdavis

I assume we want to pass some small information to the endpoint, such as the job's completion status and the job ID, but not large data, like an output file. Further, I assume the hook could have some auth token to send to the endpoint to show that it is a trusted source.

Do all status hooks get triggered after all result hooks, as the success of processing result hooks is part of the final status of the job? Or would you want a flag indicating where in the job's life cycle the status hook should be called?

manthey avatar Feb 19 '20 14:02 manthey

I assume we want to pass some small information to the endpoint, such as the job's completion status and the job ID, but not large data, like an output file

Yes.

Further, I assume the hook could have some auth token to send to the endpoint to show that it is a trusted source.

Also yes.

Do all status hooks get triggered after all result hooks.

That's how I imagine it would work.

Or would you want a flag indicating where in the job's life cycle the status hook should be called?

I don't know. I've never had a use case where I cared about a status other than completion. We could definitely have hooks that trigger for specific statuses, like queued, started, completed, but I've never needed any of that, ~~so I think I'd prefer to do option 1~~ I really don't know.

subdavis avatar Feb 19 '20 14:02 subdavis

The way I've always handled this is to attach special metadata fields to the job, then hook into the job update event on the server and check if it's a status update, and if so take action on whatever piece of data I linked into the job.

# create some job
job = ...
# link it to some piece of data
job['relatedThingId'] = item['_id']
Job().save(job)
def load(self, info):
    events.bind('jobs.job.update', 'my_plugin', jobUpdated)

Here's an example

Maybe you already knew about that and were making a case that the webhook approach is superior, however unless we actually have a use case to talk to something other than Girder I'm not convinced. What makes this preferable to using the existing job update endpoint to listen to job updates? If we just want more specificity of events, we could add those as more granular server-side events, such as jobs.jobs.update.status instead of just jobs.job.update, for example.

zachmullen avatar Feb 19 '20 15:02 zachmullen

Maybe you already knew about that and were making a case that the webhook approach is superior.

No, I didn't know this was possible. It would be great to add this to the docs. I'll try it out. Thanks!

subdavis avatar Feb 19 '20 15:02 subdavis

Cool. FWIW, another common need is to come from the other direction when linking data -- that is, linking file(s) produced by a worker job back to some existing data in Girder. The way to accomplish that is using the reference feature on the output, like so.

zachmullen avatar Feb 19 '20 15:02 zachmullen

Converting this issue to a docs update issue. I'll make a PR sometime after I've used it a bit more.

subdavis avatar Feb 19 '20 16:02 subdavis

Thanks. Our docs for the new girder_worker architecture are abysmal. We've heretofore done everything via tribal knowledge and reading source code.

zachmullen avatar Feb 19 '20 16:02 zachmullen