Thomas Broadley
Also, sorry, I just merged a change that will conflict with this: https://github.com/METR/vivaria/pull/577
It's rough to double this query's runtime. The `/getTraceModifiedSince` route is used on run pages -- it isn't on the critical path to get something rendered, but the run's trace...
Yes, that's probably the cause. Thanks for pointing this out!
Maybe. In this case, the flow is:

- User clicks the "edit in playground" button
- Runs page UI code constructs a URL pointing to the playground, containing the generation...
And in fact the playground page already caches the last executed generation request in `localStorage`, so this could be pretty easy to do.
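
A rough sketch of what that handoff might look like, assuming the goal is to stop carrying the generation request in the URL. Everything named here is hypothetical and not taken from the Vivaria codebase: the `HANDOFF_KEY` storage key, the `fromStorage` query parameter, and both function names. `KeyValueStore` is just the subset of the Web Storage API the sketch needs, so that `window.localStorage` could be passed in the browser and an in-memory stand-in elsewhere.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
// In a browser, window.localStorage satisfies this interface.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key; the real playground cache key may differ.
const HANDOFF_KEY = "playground:pendingGenerationRequest";

// In-memory stand-in for localStorage, for use outside a browser.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

// Run page side: stash the request, then link to the playground with a
// short marker instead of the full serialized request.
function stashRequestAndBuildUrl(
  store: KeyValueStore,
  request: unknown,
  playgroundPath: string,
): string {
  store.setItem(HANDOFF_KEY, JSON.stringify(request));
  return `${playgroundPath}?fromStorage=1`; // hypothetical query parameter
}

// Playground side: read the stashed request back, or null if none exists.
function loadStashedRequest(store: KeyValueStore): unknown {
  const raw = store.getItem(HANDOFF_KEY);
  return raw === null ? null : JSON.parse(raw);
}
```

One caveat with this approach: `localStorage` is shared per origin, so two "edit in playground" clicks in quick succession would overwrite each other unless the key includes something unique per click.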
Maybe `window.postMessage`?
> * create a new runId by inserting into runs_t first, and then adding the rest of the data in a separate transaction

I'm not sure a separate transaction...
> `docker stop` -> `docker commit` -> push committed images to a container registry -> support starting task environments from these images

Sounds good to me! `docker checkpoint` would be...
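
For concreteness, the quoted pipeline could be spelled out as a sequence of Docker CLI invocations. This is only a sketch: it builds the argument lists without executing anything, and the `SnapshotPlan` type, the function names, and the example container and image names are all made up for illustration rather than taken from Vivaria.

```typescript
// Hypothetical description of one task environment snapshot.
interface SnapshotPlan {
  containerName: string; // existing task environment container
  imageRef: string; // registry/repository:tag to commit and push to
}

// Builds the stop -> commit -> push steps as argv arrays.
// Nothing is executed here; a caller would pass each array to a process runner.
function buildSnapshotCommands(plan: SnapshotPlan): string[][] {
  return [
    ["docker", "stop", plan.containerName],
    ["docker", "commit", plan.containerName, plan.imageRef],
    ["docker", "push", plan.imageRef],
  ];
}

// Builds the command that starts a new task environment from a pushed image.
function buildRestoreCommand(imageRef: string, newContainerName: string): string[] {
  return ["docker", "run", "-d", "--name", newContainerName, imageRef];
}

// Example with made-up names.
const steps = buildSnapshotCommands({
  containerName: "task-env-123",
  imageRef: "registry.example.com/task-envs:123",
});
const restore = buildRestoreCommand("registry.example.com/task-envs:123", "task-env-123-restored");
```

Note that `docker commit` captures the container's filesystem only, not running process state or the contents of mounted volumes.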
Never mind, this was probably an issue before #1029 too. `eval.run_id` is only unique per eval set, just like eval set IDs. This is based on looking at the JSON...
I think `eval_id` makes more sense. It isn't clear that `task_id` isn't the same across all runs of the same task. `eval_id` sounds like it's unique per eval in an...