Results 99 issues of Thomas Leonard

Sometimes workers get stuck in uninterruptible waits (due to btrfs bugs). In that case, the TCP connection remains alive but the worker never does any work. The scheduler should detect...

enhancement

At the moment, the scheduler hard-codes time estimates for cached and non-cached jobs based on ocaml-ci's usage patterns: https://github.com/ocurrent/ocluster/blob/6e6e2b283e57165dee484519c368c334a00a6e59/scheduler/cluster_scheduler.ml#L33-L38 We should let the client pass its own estimates.
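One way to picture the request is a cost lookup that prefers a client-supplied estimate and falls back to built-in defaults. The sketch below is illustrative only: it is written in Python rather than the scheduler's OCaml, and the names `CostEstimate` and `estimated_cost` as well as the default values are assumptions, not taken from ocluster.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fallback values (seconds); NOT the scheduler's real numbers.
DEFAULT_CACHED_SECS = 2
DEFAULT_NON_CACHED_SECS = 10


@dataclass(frozen=True)
class CostEstimate:
    """Expected run time of a job, depending on whether its cache is warm."""
    cached: int
    non_cached: int


DEFAULT_ESTIMATE = CostEstimate(DEFAULT_CACHED_SECS, DEFAULT_NON_CACHED_SECS)


def estimated_cost(is_cached: bool,
                   client_estimate: Optional[CostEstimate] = None) -> int:
    """Use the client's estimate when given, else the built-in default."""
    estimate = client_estimate if client_estimate is not None else DEFAULT_ESTIMATE
    return estimate.cached if is_cached else estimate.non_cached
```

A client with unusually long uncached builds could then pass something like `CostEstimate(cached=1, non_cached=300)`, while existing clients keep the current behaviour by passing nothing.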

At the moment, workers will buffer any amount of log data (until they run out of memory and crash). We should abort any job that produces more than some limit,...

enhancement
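The log-limit idea from the issue above can be sketched as a buffer that refuses to grow past a fixed size, with the job aborted once the limit is hit. This is a minimal Python illustration under stated assumptions: `BoundedLog`, `LogLimitExceeded` and `run_job` are hypothetical names, and the real worker is written in OCaml.

```python
class LogLimitExceeded(Exception):
    """Raised when a job tries to log more than the allowed number of bytes."""


class BoundedLog:
    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.size = 0
        self.chunks = []

    def append(self, data: bytes) -> None:
        # Reject the chunk before buffering it, so memory use stays bounded.
        if self.size + len(data) > self.limit_bytes:
            raise LogLimitExceeded(
                f"log would exceed {self.limit_bytes} bytes")
        self.chunks.append(data)
        self.size += len(data)


def run_job(output_chunks, limit_bytes: int):
    """Feed a job's output into a bounded log; abort if it overflows."""
    log = BoundedLog(limit_bytes)
    try:
        for chunk in output_chunks:
            log.append(chunk)
    except LogLimitExceeded:
        return ("aborted", log.size)
    return ("ok", log.size)
```

Rejecting the oversized chunk up front, rather than truncating it, keeps the sketch simple; a real implementation might prefer to keep the first part of the chunk and append a marker explaining why the job was stopped.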

On `x86-bm-7.ocamllabs.io`, `btrfs subvolume sync` has been running for days:

```
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1986 11.7 0.4 5407724 414700 ? Ssl...
```

At the moment, we copy in the cluster worker's clone of opam-repository, which includes all branches and PRs. We should instead just clone the `master` branch directly within the Dockerfile.

I think I'm a bit confused about how to use dscheck. Perhaps other people will be too. The README says: > At the core, dscheck runs each test numerous times,...

This is a quick fix for https://github.com/ocaml-multicore/eio/issues/732. A proper fix would be quite involved and I'd like to get feedback from the OCaml devs about fixing it upstream first, but...

Since #728, we submit requests and wait for replies in a single system call. If a signal is received while waiting in `io_uring_enter` but after some requests were accepted, it...

bug

When a signal arrives, the C function `handle_signal` is called, which calls `caml_record_signal`. This just sets a flag indicating that the corresponding OCaml handler should run at the next opportunity....
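The "record now, run later" pattern described above can be modelled in a few lines. The sketch below is a Python simulation for illustration only, not the OCaml runtime's code: `SignalState`, `record_signal` and `process_pending` are hypothetical names standing in for the flag-setting step and the later point at which handlers actually run.

```python
class SignalState:
    """Toy model of deferred signal handling."""

    def __init__(self):
        self.pending = set()   # signals recorded but not yet handled
        self.handlers = {}     # signal number -> handler function

    def set_handler(self, signum, handler):
        self.handlers[signum] = handler

    def record_signal(self, signum):
        # Models the asynchronous part: only a flag is set, no handler runs.
        self.pending.add(signum)

    def process_pending(self):
        # Models "the next opportunity": run handlers for recorded signals.
        pending, self.pending = self.pending, set()
        ran = []
        for signum in sorted(pending):
            handler = self.handlers.get(signum)
            if handler is not None:
                handler(signum)
                ran.append(signum)
        return ran
```

The point of the model is the gap between the two calls: if the program blocks after `record_signal` and never reaches `process_pending`, the handler does not run, however long the signal has been pending.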