Implement WAL writer for the `net` log writer

Open mohammed90 opened this issue 2 years ago • 5 comments

The net log writer is great for infrastructure where logs across all systems sink into a centralized system. However, the current implementation is bitten by the following fallacies of distributed systems[^1]:

  • The network is reliable
  • Latency is zero
  • Bandwidth is infinite
  • Transport cost is zero

We've heard at least one report of pain when the network misbehaves, #4083. While timeouts and redialing resolved that earlier issue, the writer still isn't robust against the relevant fallacies. We can still suffer from slowness, we're still reliant on the network's bandwidth, and network misbehavior can still impact Caddy's performance.

To get around the fallacies without impacting Caddy's performance, I propose we introduce a write-ahead log (WAL) to the net writer. The WAL will be placed in the data directory. Log writes are synchronously written to the WAL, and an asynchronous reader (from the WAL) picks up the entries and writes them to the network. On first open, the net writer opens the standard WAL and should check for unwritten entries to be synchronized upstream. On close, the writer should attempt to flush/drain all the entries in the WAL.

With this implementation, Caddy will not suffer due to external network issues pertaining to the log sink, e.g. network is slow, log ingester is slow, or any of the fallacies.

Have I missed anything?

[^1]: Fallacies of Distributed Systems

mohammed90 • Aug 05 '23 15:08

@mohammed90 Hello, I'm very interested in this issue. Although I'm not proficient in Go, I've read the docs and researched WAL implementations. This is my first time participating in an open-source project. Is there anything else I need to do? : )

elysium-w • Jun 05 '24 04:06

Go for it, @elysium-w! I had a humble attempt in the net-wal branch, but it doesn't work properly. You can look at it or start fresh. I'd love to see a working implementation!

mohammed90 • Jun 05 '24 09:06

> Go for it, @elysium-w! I had a humble attempt in the net-wal branch, but it doesn't work properly. You can look at it or start fresh. I'd love to see a working implementation!

Sure! I will

elysium-w • Jun 07 '24 00:06

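> Go for it, @elysium-w! I had a humble attempt in the net-wal branch, but it doesn't work properly. You can look at it or start fresh. I'd love to see a working implementation!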

May I ask where you tried using the WAL? I mean, in which folder? And what functionality should a proper WAL have in Caddy?

elysium-w • Aug 08 '24 08:08

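> > Go for it, @elysium-w! I had a humble attempt in the net-wal branch, but it doesn't work properly. You can look at it or start fresh. I'd love to see a working implementation!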

> May I ask where you tried using the WAL? I mean, in which folder?

You can see the changes at this link (https://github.com/caddyserver/caddy/compare/master...net-wal). It is neither a working nor a complete implementation.

> What functionality should a proper WAL have in Caddy?

What do you mean by this question?

mohammed90 • Aug 08 '24 09:08