
Support for Network Flow Logging?

Open victorhooi opened this issue 1 year ago • 9 comments

Would it be possible to add some kind of network flow logging to Headscale?

Perhaps something basic like byte counters between nodes to begin?

victorhooi avatar Jan 16 '24 06:01 victorhooi

The network types for the client seem to be here:

https://github.com/tailscale/tailscale/blob/main/types/netlogtype/netlogtype.go

There's some documentation on the feature, and a blog post announcing it.

Does anybody know whether anything special needs to be done on the client to make it ship the network log information? Or are the Tailscale clients already sending it to headscale, and we simply discard it?

victorhooi avatar Jan 28 '24 12:01 victorhooi

I think this data is sent to their logtail server which we can't reconfigure. https://tailscale.com/kb/1011/log-mesh-traffic

Sh4d avatar Jan 30 '24 19:01 Sh4d

@Sh4d we can reconfigure the log server using the LogTarget policy key, see https://github.com/tailscale/tailscale/blob/main/util/syspolicy/policy_keys.go

adipierro avatar Feb 06 '24 13:02 adipierro

If anyone tries to implement logtail API, here are some tech docs 😊: https://github.com/tailscale/tailscale/blob/main/logtail/api.md

adipierro avatar Feb 12 '24 07:02 adipierro

I would like to share some of my insights that I gained while working on the client logs.[^1]

What logtail is

Both network flow logs and client logs send their data to a Logtail[^2] instance. The clients send the data to the logtail instance as JSON objects. The data is grouped by type (e.g. client logs, network flow logs). The originating node is also identifiable.
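Because the uploads are plain JSON objects, basic byte counters between nodes could be derived on the receiving side. The following is only a minimal sketch: the struct field names (`nodeId`, `virtualTraffic`, `src`, `dst`, `txBytes`, `rxBytes`) are assumptions modelled on the netlogtype package linked earlier in this thread and should be checked against it, and the sample message in `main` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/netip"
)

// connCounts and flowMessage are local, hypothetical mirrors of a subset of
// the client's network flow log types. The JSON field names are assumptions.
type connCounts struct {
	Src     string `json:"src"` // assumed "ip:port"
	Dst     string `json:"dst"` // assumed "ip:port"
	TxBytes uint64 `json:"txBytes"`
	RxBytes uint64 `json:"rxBytes"`
}

type flowMessage struct {
	NodeID         string       `json:"nodeId"`
	VirtualTraffic []connCounts `json:"virtualTraffic"`
}

// pair identifies a source/destination address pair, ignoring ports.
type pair struct{ Src, Dst string }

// hostOf strips the port from "ip:port"; unparsable input is returned as-is.
func hostOf(s string) string {
	if ap, err := netip.ParseAddrPort(s); err == nil {
		return ap.Addr().String()
	}
	return s
}

// addBytes decodes one flow log message and adds its tx+rx byte counts
// to the running totals per address pair.
func addBytes(totals map[pair]uint64, raw []byte) error {
	var m flowMessage
	if err := json.Unmarshal(raw, &m); err != nil {
		return err
	}
	for _, c := range m.VirtualTraffic {
		totals[pair{hostOf(c.Src), hostOf(c.Dst)}] += c.TxBytes + c.RxBytes
	}
	return nil
}

func main() {
	// Made-up sample message, not captured from a real client.
	sample := []byte(`{"nodeId":"n1","virtualTraffic":[
		{"src":"100.64.0.1:1234","dst":"100.64.0.2:443","txBytes":100,"rxBytes":50}]}`)
	totals := map[pair]uint64{}
	if err := addBytes(totals, sample); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for p, n := range totals {
		fmt.Printf("%s -> %s: %d bytes\n", p.Src, p.Dst, n)
	}
}
```

Unknown fields in the uploaded object are ignored by `json.Unmarshal`, so the sketch keeps working if the real messages carry more data than is modelled here.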

Configuration of the logtail instance

It is possible to configure the logtail instance locally with the LogTarget system policy on Windows or the TS_LOG_TARGET environment variable on Linux. These settings only apply to the client logs; the network flow logs are always sent to log.tailscale.io.

Problems:

  • For this feature to be possible, tailscaled must respect the configured logtail instance for all logging. This has to be changed upstream in Tailscale and is blocking this feature in headscale.
  • It would be good if the logtail instance could be configured by the control server. This will take some thought to get right though.

Receiving the network flow logs

A corresponding service that receives and possibly processes the network flow logs would also be required for this feature. Fortunately, the logtail protocol is very simple. A simple receiver that only writes the data to the file system can be written in a short time.[^3] These logs can then be processed further using your own log pipelines.
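To illustrate how small such a receiver can be, here is a minimal sketch using only the Go standard library. It assumes uploads arrive as `POST /c/<collection>/<private-id>`, as the logtail API docs describe; please verify the details there. The listen address, the `logs` directory, and the file naming are hypothetical choices for this example. Clients may send compressed bodies (`Content-Encoding: zstd`); the standard library has no zstd decoder, so this sketch stores such bodies undecoded.

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

// validName allows only characters that are safe in a file name.
func validName(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		ok := r == '.' || r == '-' || (r >= '0' && r <= '9') ||
			(r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		if !ok {
			return false
		}
	}
	return true
}

// parseUploadPath splits "/c/<collection>/<private-id>" into its two parts.
func parseUploadPath(p string) (collection, privateID string, ok bool) {
	rest, found := strings.CutPrefix(p, "/c/")
	if !found {
		return "", "", false
	}
	collection, privateID, found = strings.Cut(rest, "/")
	if !found || !validName(collection) || !validName(privateID) {
		return "", "", false
	}
	return collection, privateID, true
}

// uploadHandler appends every uploaded body to a file under dir.
func uploadHandler(dir string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		collection, privateID, ok := parseUploadPath(r.URL.Path)
		if r.Method != http.MethodPost || !ok {
			http.NotFound(w, r)
			return
		}
		body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 10<<20))
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		ext := ".json"
		if r.Header.Get("Content-Encoding") == "zstd" {
			ext = ".json.zst" // stored undecoded, see note above
		} else {
			body = append(body, '\n') // one upload per line
		}
		name := filepath.Join(dir, collection+"_"+privateID+ext)
		f, err := os.OpenFile(name, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
		if err != nil {
			http.Error(w, "cannot open log file", http.StatusInternalServerError)
			return
		}
		defer f.Close()
		if _, err := f.Write(body); err != nil {
			http.Error(w, "cannot write log file", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

func main() {
	dir := "logs" // hypothetical output directory
	if err := os.MkdirAll(dir, 0o700); err != nil {
		log.Fatal(err)
	}
	mux := http.NewServeMux()
	mux.Handle("/c/", uploadHandler(dir))
	log.Fatal(http.ListenAndServe("127.0.0.1:8080", mux)) // hypothetical address
}
```

Writing one file per collection and ID keeps client logs and network flow logs apart and leaves all further processing to an external pipeline.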

Open Questions:

  • Should this log receiver be part of headscale? I personally tend towards no.

[^1]: Client logs is a feature for the central collection of clients' logs that works very similarly to the network flow logs behind the scenes.
[^2]: API docs: https://github.com/tailscale/tailscale/blob/main/logtail/api.md
[^3]: A simple receiver for client logs (and SSH session monitoring): https://github.com/Qup42/loghead

Qup42 avatar Mar 16 '24 13:03 Qup42

How does one enable network flow logging on the client? I've set up a simple HTTP server that captures the contents of the requests, but I can't see any network flow logs (logs for the tailtraffic.log.tailscale.io collection). Is there an environment variable or config that needs to be set on the client?

lockness-Ko avatar Apr 05 '24 02:04 lockness-Ko

For network flow logging to work you currently have to patch both the local client and the control plane. As stated above, some patches to the client are required, and support for this feature is not yet implemented in the headscale control server.

But you can of course patch your executables for debugging purposes:

control plane

diff --git a/hscontrol/mapper/mapper.go b/hscontrol/mapper/mapper.go
index df0f4d9..4dac498 100644
--- a/hscontrol/mapper/mapper.go
+++ b/hscontrol/mapper/mapper.go
@@ -560,6 +560,8 @@ func (m *Mapper) baseMapResponse() tailcfg.MapResponse {
                // TODO(kradalby): Implement PingRequest?
        }
 
+       resp.DomainDataPlaneAuditLogID = "6262626262626262626262626262626262626262626262626262626262626262"
+
        return resp
 }
 
@@ -594,6 +596,9 @@ func (m *Mapper) baseWithConfigMapResponse(
                DisableLogTail: !m.logtail,
        }
 
+       // 32*b
+       resp.DomainDataPlaneAuditLogID = "6262626262626262626262626262626262626262626262626262626262626262"
+
        return &resp, nil
 }
 
diff --git a/hscontrol/mapper/tail.go b/hscontrol/mapper/tail.go
index c10da4d..459069f 100644
--- a/hscontrol/mapper/tail.go
+++ b/hscontrol/mapper/tail.go
@@ -128,9 +128,10 @@ func tailNode(
        //   - 74: 2023-09-18: Client understands NodeCapMap
        if capVer >= 74 {
                tNode.CapMap = tailcfg.NodeCapMap{
-                       tailcfg.CapabilityFileSharing: []tailcfg.RawMessage{},
-                       tailcfg.CapabilityAdmin:       []tailcfg.RawMessage{},
-                       tailcfg.CapabilitySSH:         []tailcfg.RawMessage{},
+                       tailcfg.CapabilityFileSharing:        []tailcfg.RawMessage{},
+                       tailcfg.CapabilityAdmin:              []tailcfg.RawMessage{},
+                       tailcfg.CapabilitySSH:                []tailcfg.RawMessage{},
+                       tailcfg.CapabilityDataPlaneAuditLogs: []tailcfg.RawMessage{},
                }
 
                if randomClientPort {
@@ -159,5 +160,8 @@ func tailNode(
                tNode.LastSeen = node.LastSeen
        }
 
+       // must be 32 decoded bytes, here: 32*a
+       tNode.DataPlaneAuditLogID = "6161616161616161616161616161616161616161616161616161616161616161"
+
        return &tNode, nil
 }

client

Patch the client to send the traffic for the tailtraffic.log.tailscale.io collection to your host instead of the default host/log URL.

Qup42 avatar Apr 08 '24 20:04 Qup42

This issue is stale because it has been open for 90 days with no activity.

github-actions[bot] avatar Jul 08 '24 01:07 github-actions[bot]

This issue is stale because it has been open for 90 days with no activity.

not stale

benley avatar Jul 08 '24 15:07 benley