Jerome Gravel-Niquet

Results: 101 comments by Jerome Gravel-Niquet

Thanks for looking into it. Hopefully it's just a fluke. I'll add some more metadata around the exception to try and figure out under what conditions this happened. I noticed...

I've also encountered this issue just now. I'm keeping the `http.IncomingMessage` for a little while and only reading from it later (only if necessary). Without node-replay this is fine, but...

You're right. This command should be a noop after the first successful run or it should return a proper error.

Update: We noticed this only happens on kernel v5.11. Diagnosing more with `kvm_stat`, I noticed the "bad" firecracker process had absolutely no `halt_wakeup` events while correctly functioning firecracker processes had...

We're not seeing any halt-polling events:

```
$ cat /sys/kernel/debug/tracing/set_event
kvm:kvm_vcpu_wakeup
kvm:kvm_halt_poll_ns
$ cat /sys/kernel/debug/tracing/set_event_pid
14577 # problematic firecracker
$ cat /sys/kernel/debug/tracing/trace_pipe
# nothing...
```

I believe this issue is the same as what we're currently facing: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/2277#issuecomment-817992687 We're seeing roughly 1 million exits per second on affected VMs. The fix appears to be to...

I have implemented the same fix in a fork here: https://github.com/jeromegn/firecracker/commit/d6e73e4405d8f119605291750099f94d6b9715c8 Please let me know if I'm making a grave mistake 😅

We're also interested in this. For now we're just setting a bigger TX queue on our tap interfaces.

I ran this script and observed the same results. **However**, I figured out why. The test is inserting constantly, using all CPU resources and `sled` doesn't "have the time" to...

We have a similar problem at Fly.io and after having done a few tests, I wonder if we even need to bother with RSA certificates? It looks like rustls doesn't...