Thomas Broadley

109 comments by Thomas Broadley

Strangely, my local copy of Vivaria can upload the file I linked in 22 seconds. Maybe the problem is database data transfer latency? No, that should be quite fast within...

I'm not getting 502s anymore in production. I do [see](https://us3.datadoghq.com/logs?query=importInspect&agg_m=count&agg_m_source=base&agg_t=count&cols=host%2Cservice&fromUser=true&messageDisplay=inline&refresh_mode=paused&storage=hot&stream_sort=desc&viz=stream&from_ts=1744224943127&to_ts=1744225843127&live=false) that the request to upload this file takes over two minutes in production. Profile: ![Image](https://github.com/user-attachments/assets/f7e20e90-84ad-41b2-ac02-34de24a917e0) Most of the usage is database...

Yes, I think we should address this issue by changing Vivaria to do bulk inserts and updates of trace entries when upserting Inspect runs in `InspectImporter.ts`.
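For reference, here is a minimal sketch of what a bulk upsert could look like: one parameterized multi-row `INSERT ... ON CONFLICT` instead of one statement per trace entry. The table name `trace_entries_sketch`, the column names, and the `buildBulkUpsert` helper are all hypothetical and are not taken from `InspectImporter.ts`.

```typescript
// Hypothetical row shape; the real trace entry columns in Vivaria may differ.
interface TraceEntryRow {
  runId: number
  index: number
  calledAt: number
  content: string // JSON-serialized entry content
}

interface BulkQuery {
  text: string
  values: Array<number | string>
}

// Builds a single parameterized multi-row upsert for all given rows.
function buildBulkUpsert(rows: TraceEntryRow[]): BulkQuery {
  if (rows.length === 0) {
    throw new Error('buildBulkUpsert requires at least one row')
  }
  const columns = ['runId', 'index', 'calledAt', 'content'] as const
  const values: Array<number | string> = []
  // One "($1, $2, $3, $4)" tuple per row, numbering placeholders sequentially.
  const tuples = rows.map(row => {
    const placeholders = columns.map(column => {
      values.push(row[column])
      return `$${values.length}`
    })
    return `(${placeholders.join(', ')})`
  })
  const text =
    'INSERT INTO trace_entries_sketch ("runId", "index", "calledAt", "content") ' +
    `VALUES ${tuples.join(', ')} ` +
    'ON CONFLICT ("runId", "index") DO UPDATE SET ' +
    '"calledAt" = EXCLUDED."calledAt", "content" = EXCLUDED."content"'
  return { text, values }
}
```

The resulting `text` and `values` could be passed to a client's parameterized query call. PostgreSQL's wire protocol caps a statement at 65,535 bind parameters, so a large log would need to be split into chunks of rows.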

I was wrong about bulk inserts. On my machine, they don't seem to speed up importing. On a 16MB log file, I'm seeing about 30 seconds spent importing whether there...

I'll think more about bulk inserts tomorrow. Maybe there are some benefits to not doing them in bulk -- each sample gets its own transaction so can fail independently of...

I decided to look for signs of life for bulk inserts by batching inserts from multiple samples into one query. That didn't really seem to help. So back to the...

Maybe `COPY FROM`? https://www.npmjs.com/package/pg-copy-streams I bet it would be faster, but it's also more annoying to code, because `COPY FROM` doesn't support `ON CONFLICT`: https://stackoverflow.com/questions/48019381/how-postgresql-copy-to-stdin-with-csv-do-on-conflic-do-update But we could at least...
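One common workaround for the missing `ON CONFLICT` support, sketched here as an assumption rather than as what this comment goes on to propose, is to bulk-load with `COPY ... FROM STDIN` (the loading form of `COPY`) into a temporary staging table and then merge with a regular `INSERT ... SELECT ... ON CONFLICT`. The table names, column names, and helpers below are hypothetical.

```typescript
type CsvValue = string | number | null

// Encodes one field for COPY's CSV format.
function toCsvField(value: CsvValue): string {
  // An unquoted empty field is read as NULL by COPY in CSV format.
  if (value === null) return ''
  // Always quote non-null values so an empty string stays distinct from NULL;
  // embedded double quotes are escaped by doubling them.
  return `"${String(value).replace(/"/g, '""')}"`
}

// Encodes one row as a CSV line terminated by a newline.
function toCsvLine(fields: CsvValue[]): string {
  return fields.map(toCsvField).join(',') + '\n'
}

// Statements to run in order inside one transaction (hypothetical names).
const stagingStatements = {
  createStaging:
    'CREATE TEMP TABLE trace_entries_staging ' +
    '(LIKE trace_entries_sketch INCLUDING DEFAULTS) ON COMMIT DROP',
  // The CSV lines from toCsvLine would be streamed into this statement,
  // for example through the writable stream that pg-copy-streams provides.
  copyIn:
    'COPY trace_entries_staging ("runId", "index", "calledAt", "content") ' +
    'FROM STDIN WITH (FORMAT csv)',
  merge:
    'INSERT INTO trace_entries_sketch ("runId", "index", "calledAt", "content") ' +
    'SELECT "runId", "index", "calledAt", "content" FROM trace_entries_staging ' +
    'ON CONFLICT ("runId", "index") DO UPDATE SET ' +
    '"calledAt" = EXCLUDED."calledAt", "content" = EXCLUDED."content"',
}
```

The merge step is still a single statement, so conflict handling stays in SQL while the row transfer uses the faster `COPY` path.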

At my last job, we would have simply not had all of this logic in a single API endpoint handler. The handler would have enqueued a job and immediately returned...
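To illustrate the enqueue-and-return pattern, here is a minimal in-memory sketch. `InMemoryJobQueue`, `handleImportRequest`, and the 202 response shape are hypothetical names for illustration; a real deployment would presumably use a durable queue so jobs survive restarts.

```typescript
type JobStatus = 'queued' | 'running' | 'succeeded' | 'failed'

// In-memory queue sketch: jobs are lost on restart, unlike a durable queue.
class InMemoryJobQueue<T> {
  private readonly worker: (payload: T) => Promise<void>
  private readonly statuses = new Map<number, JobStatus>()
  private readonly pending: Array<{ id: number; payload: T }> = []
  private nextId = 1

  constructor(worker: (payload: T) => Promise<void>) {
    this.worker = worker
  }

  // Records the job and returns its ID without doing any of the work.
  enqueue(payload: T): number {
    const id = this.nextId++
    this.statuses.set(id, 'queued')
    this.pending.push({ id, payload })
    return id
  }

  getStatus(id: number): JobStatus | undefined {
    return this.statuses.get(id)
  }

  // Processes queued jobs one at a time; a background loop would call this.
  async drain(): Promise<void> {
    let job = this.pending.shift()
    while (job !== undefined) {
      this.statuses.set(job.id, 'running')
      try {
        await this.worker(job.payload)
        this.statuses.set(job.id, 'succeeded')
      } catch {
        this.statuses.set(job.id, 'failed')
      }
      job = this.pending.shift()
    }
  }
}

// Hypothetical handler: respond right away and let the client poll by job ID.
function handleImportRequest<T>(
  queue: InMemoryJobQueue<T>,
  payload: T,
): { status: number; jobId: number } {
  return { status: 202, jobId: queue.enqueue(payload) }
}
```

With this shape, the slow import runs outside the request, so the handler's response time no longer depends on the size of the uploaded log.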

Yeah that makes sense. If we were to switch PM2 to do that right now, I would be concerned about the same run getting set up by two different BG...