netbird icon indicating copy to clipboard operation
netbird copied to clipboard

Netbird should rotate netbird.out.log

Open trbutler opened this issue 9 months ago • 5 comments

Describe the problem

When something malfunctions in the system, /var/log/netbird.out.log or /var/log/netbird/netbird.out can grow quite large, possibly even large enough to disrupt system operation (I've seen it grow to 40GB+). The log is never rotated. It has filled up both small Debian appliance servers and several Macs with massive logs that are never removed without my intervention.

To Reproduce

Steps to reproduce the behavior:

  1. Use netbird and forget there could even be a problem.
  2. Some issue occurs (such as Netbird's issues with MacOS sleep)
  3. Notice the device's drive has filled up after getting out of space errors.

Expected behavior

Netbird should register the log with whatever native log rotation facility exists on a given OS or do its own log rotation. I would suggest this is particularly urgent on MacOS where the GUI installer does a great job of getting things setup and most users would not think to even look in /var/log/ to figure out why their device is running out of space.

Are you using NetBird Cloud?

No, self-hosted.

NetBird version

Applies to all versions up through 0.39.1.

Is any other VPN software installed?

No.

Debug output

Sample contents of /var/log/netbird.out.log from my bug report on MacOS sleep/wake:

time=2025-03-25T14:08:22.302-05:00 level=DEBUG msg="Handling message" type=InitHello
time=2025-03-25T14:08:22.341-05:00 level=DEBUG msg="Sending message" pid="oRSVlj1QGB2+o2Lb+NpXpsQ3iTuLFMoaznC7WqLwPB4=" type=RespHello
time=2025-03-25T14:08:22.342-05:00 level=ERROR msg="Failed to send" pid="W2zmqmuTObiXHXXcuiGKAv5SXsV1EJlprAEL0CC+1T8=" error="write udp [::]:59445->100.68.126.239:37142: use of closed network connection"
time=2025-03-25T14:08:22.342-05:00 level=ERROR msg="Failed to send" pid="zE7RGWFXUgp+VY6vq1okJNzVZC7FiQQJFDU/fquZAFc=" error="write udp [::]:59445->100.68.36.98:51500: use of closed network connection"
time=2025-03-25T14:08:22.342-05:00 level=DEBUG msg="Handling message" type=RespHello
time=2025-03-25T14:08:22.342-05:00 level=ERROR msg="Failed to handle message" error="session not found: e906089c"
time=2025-03-25T14:08:22.344-05:00 level=ERROR msg="Failed to send" pid="W2zmqmuTObiXHXXcuiGKAv5SXsV1EJlprAEL0CC+1T8=" error="write udp [::]:59445->100.68.126.239:37142: use of closed network connection"
time=2025-03-25T14:08:22.348-05:00 level=DEBUG msg="Handling message" type=RespHello
time=2025-03-25T14:08:22.348-05:00 level=ERROR msg="Failed to handle message" error="session not found: f2d216cf"
time=2025-03-25T14:08:22.350-05:00 level=ERROR msg="Failed to send" pid="wwRjAqzcl2/eXICHMZYp+YajCu3abQVEmiEHD0f7f0E=" error="write udp [::]:59445->100.68.253.229:41753: use of closed network connection"
time=2025-03-25T14:08:22.372-05:00 level=DEBUG msg="Handling message" type=InitHello
time=2025-03-25T14:08:22.376-05:00 level=ERROR msg="Failed to send" pid="wwRjAqzcl2/eXICHMZYp+YajCu3abQVEmiEHD0f7f0E=" error="write udp [::]:59445->100.68.253.229:41753: use of closed network connection"
time=2025-03-25T14:08:22.408-05:00 level=DEBUG msg="Sending message" pid="Pe+rqDOp6tx5iXr2C4sZ9SnF4+ryqXlM5e3v1kWz9G8=" type=RespHello
time=2025-03-25T14:08:22.409-05:00 level=DEBUG msg="Handling message" type=InitHello

Screenshots

N/A

Additional context

Add any other context about the problem here.

Have you tried these troubleshooting steps?

  • [x] Checked for newer NetBird versions
  • [x] Searched for similar issues on GitHub (including closed ones)
  • [x] Restarted the NetBird client
  • [x] Disabled other VPN software
  • [x] Checked firewall settings

trbutler avatar Mar 25 '25 19:03 trbutler

I added /etc/newsyslog.d/netbird.conf on MacOS:

/var/log/netbird.out.log              644    5      1024  *     J

For Debian I've been using:

/var/log/netbird/netbird.out
{
    missingok
    daily
    copytruncate
    rotate 7
    notifempty
    postrotate
        /bin/systemctl restart netbird
    endscript
}

trbutler avatar Mar 25 '25 23:03 trbutler

Hello @trbutler,

Looking at the log messages pasted, are you sure you are talking about netbird.out and not client.log?

netbird.out log would be usually handled by system's logging daemon and otherwise rarely contains any content at all.

client.log should already be rotated every 5 MBs, see the code

nazarewk avatar Mar 27 '25 08:03 nazarewk

@nazarewk No, unfortunately it is netbird.out and it doesn't seem to rotate until I manually intervene on Debian 12, Ubuntu 22.04 LTS or MacOS (any recent version -- I've noticed this on both 14.x and 15.x). It is usually quiet enough, but then it will have a huge burst of activity -- particularly when it gets into one of those really bad wake/sleep problems (#2454) and the log will grow incredibly large. I've had this happen on at least a half dozen different systems running on either platform and connected to two different Netbird servers.

Here's the most recent set of messages on MacOS (it had been quiet because of that aforementioned other issue; restarting Netbird will cause a new flurry of activity):

time=2025-03-26T01:30:14.426-05:00 level=DEBUG msg="Sending message" pid="Pe+rqDOp6tx5iXr2C4sZ9SnF4+ryqXlM5e3v1kWz9G8=" type=InitConf
time=2025-03-26T01:30:14.461-05:00 level=DEBUG msg="Handling message" type=RespHello
time=2025-03-26T01:30:14.461-05:00 level=ERROR msg="Failed to handle message" error="session not found: 3961b579"
time=2025-03-26T01:30:14.463-05:00 level=DEBUG msg="Handling message" type=EmptyData
time=2025-03-26T01:30:14.463-05:00 level=DEBUG msg="Exchanged key with peer" pid="Pe+rqDOp6tx5iXr2C4sZ9SnF4+ryqXlM5e3v1kWz9G8="
time=2025-03-26T01:30:14.464-05:00 level=DEBUG msg=Rekey pid="Pe+rqDOp6tx5iXr2C4sZ9SnF4+ryqXlM5e3v1kWz9G8=" after=2m10s
time=2025-03-26T01:30:14.794-05:00 level=DEBUG msg="Skipping handshake due to missing endpoint" pid="9E0ulfWwoKPkWLtS3m9wpM45tgu/4CyjieHYySakLCg="
time=2025-03-26T01:30:14.797-05:00 level=DEBUG msg="Added peer" pid="9E0ulfWwoKPkWLtS3m9wpM45tgu/4CyjieHYySakLCg="
time=2025-03-26T01:30:16.057-05:00 level=DEBUG msg="Handling message" type=InitHello
time=2025-03-26T01:30:16.121-05:00 level=DEBUG msg="Sending message" pid="Pe+rqDOp6tx5iXr2C4sZ9SnF4+ryqXlM5e3v1kWz9G8=" type=RespHello

Here's another sample of lines from MacOS an hour later (after restarting Netbird on the Mac client to get around the sleep/wake bug):

time=2025-03-27T13:18:29.954-05:00 level=DEBUG msg="Sending message" pid="CTpvH4GTb0I59At7YKP9hCb7WGGHDrSoRYW9myyGtmM=" type=InitHello
time=2025-03-27T13:18:29.957-05:00 level=DEBUG msg="Started new handshake" pid="CTpvH4GTb0I59At7YKP9hCb7WGGHDrSoRYW9myyGtmM=" sidi=f3f5ac53
time=2025-03-27T13:18:31.929-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:31.931-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:35.823-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:35.825-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:36.264-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:36.267-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:38.077-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:38.079-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:39.610-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:39.614-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:41.746-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:41.749-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:44.691-05:00 level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T13:18:44.764-05:00 level=DEBUG msg="Enter live" pid="CTpvH4GTb0I59At7YKP9hCb7WGGHDrSoRYW9myyGtmM="
time=2025-03-27T13:18:44.764-05:00 level=DEBUG msg="Sending message" pid="CTpvH4GTb0I59At7YKP9hCb7WGGHDrSoRYW9myyGtmM=" type=InitConf
time=2025-03-27T13:18:45.551-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:45.554-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:45.870-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:18:45.873-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:18:46.166-05:00 level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T13:18:46.166-05:00 level=ERROR msg="Failed to handle message" error="received unexpected message type: RespHello"

I remoted into another MacOS system that is a desktop (so removed from the aforementioned bug) and discovered it has again entirely filled up the SSD with these errors. This system is a vanilla install of netbird without me messing with the log rotation configuration and the system is not rotating it:

timothybutler@boaz log % ls -oh /var/log/netbird*
-rw-r--r--  1 root   388K Mar 27 06:56 /var/log/netbird.err.log
-rw-r--r--  1 root    91G Mar 27 13:22 /var/log/netbird.out.log

/var/log/netbird:
total 5488
-rw-r--r--  1 root   221K Mar  6 05:03 client-2025-03-06T11-03-36.790.log.gz
-rw-r--r--  1 root   184K Mar 11 11:22 client-2025-03-11T16-22-23.061.log.gz
-rw-r--r--  1 root   224K Mar 17 21:18 client-2025-03-18T02-18-12.747.log.gz
-rw-r--r--  1 root   212K Mar 24 12:50 client-2025-03-24T17-50-50.876.log.gz
-rw-r--r--  1 root   200K Mar 27 00:19 client-2025-03-27T05-19-21.624.log.gz
-rw-r--r--  1 root   1.6M Mar 27 13:22 client.log
-rw-r--r--  1 root   739B Dec  6 01:52 client_post_install.log
-rw-r--r--  1 root   327B Dec  6 01:52 client_pre_install.log

Here's the recent output to that log:

time=2025-03-27T13:24:07.697-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:24:07.700-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:24:08.896-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:24:08.900-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:24:10.540-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:24:10.544-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:24:17.284-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:24:17.286-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:24:18.587-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:24:18.591-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:24:20.314-05:00 level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T13:24:20.315-05:00 level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T13:24:20.368-05:00 level=DEBUG msg="Skipping handshake due to missing endpoint" pid="NyO6bOt3b+gtorX/EZDRXb1Y6nByP02yudQ1LSbInwU="
time=2025-03-27T13:24:20.369-05:00 level=DEBUG msg="Added peer" pid="NyO6bOt3b+gtorX/EZDRXb1Y6nByP02yudQ1LSbInwU="

Here's what it looks like on a Linux system (this one is functioning normally has creating a fair number of entries):

time=2025-03-27T17:21:41.981Z level=DEBUG msg="Sending message" pid="oRSVlj1QGB2+o2Lb+NpXpsQ3iTuLFMoaznC7WqLwPB4=" type=RespHello
time=2025-03-27T17:21:41.982Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:41.982Z level=ERROR msg="Failed to handle message" error="session not found: 349cca70"
time=2025-03-27T17:21:42.026Z level=ERROR msg="Failed to send" pid="QeI0HWJ1Sagv9OU8FFChqz19JUQToXDyaLc5noDOWn4=" error="write udp [::]:40204->100.68.206.84:54530: sendto: required key not available"
time=2025-03-27T17:21:42.350Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:42.350Z level=ERROR msg="Failed to handle message" error="session not found: 8980b6cf"
time=2025-03-27T17:21:42.420Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:42.420Z level=ERROR msg="Failed to handle message" error="session not found: 6ff9968c"
time=2025-03-27T17:21:42.675Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:42.675Z level=ERROR msg="Failed to handle message" error="session not found: 5471ab47"
time=2025-03-27T17:21:43.077Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:43.077Z level=ERROR msg="Failed to handle message" error="session not found: 1e1e0992"
time=2025-03-27T17:21:43.396Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:43.396Z level=ERROR msg="Failed to handle message" error="session not found: c8beb6c5"
time=2025-03-27T17:21:43.547Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:43.547Z level=ERROR msg="Failed to handle message" error="session not found: 7d2bb1cf"
time=2025-03-27T17:21:43.662Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:43.662Z level=ERROR msg="Failed to handle message" error="session not found: f5c3c86f"
time=2025-03-27T17:21:43.845Z level=DEBUG msg="Handling message" type=InitHello
time=2025-03-27T17:21:43.922Z level=DEBUG msg="Sending message" pid="Pe+rqDOp6tx5iXr2C4sZ9SnF4+ryqXlM5e3v1kWz9G8=" type=RespHello
time=2025-03-27T17:21:44.158Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:44.158Z level=ERROR msg="Failed to handle message" error="session not found: d7b54601"
time=2025-03-27T17:21:44.237Z level=DEBUG msg="Handling message" type=RespHello
time=2025-03-27T17:21:44.237Z level=ERROR msg="Failed to handle message" error="session not found: 5f875b06"
time=2025-03-27T17:21:44.266Z level=DEBUG msg="Handling message" type=InitConf
time=2025-03-27T17:21:44.270Z level=ERROR msg="Failed to handle message" error="failed to load biscuit (ICR1): failed decrypt biscuit"
time=2025-03-27T17:21:44.286Z level=ERROR msg="Failed to send" pid="YCxzm8imfCW2d1RVmTrKynQGPgPwZkeOw/DTeH1Bzh4=" error="write udp [::]:40204->100.68.206.84:50791: sendto: required key not available"

Neither rotated on its own until I added those rotation instructions to the systems manually. As you can see, this morning, there's nothing happening on the MacOS system, whereas the Linux system (which is a server and remained awake all night) is relatively active with messages in that file.

trbutler avatar Mar 27 '25 17:03 trbutler

I second that this issue is occurring. /var/log/netbird.out.log continuously grows, unless manual intervention occurs.

twocoolbeans avatar Apr 25 '25 16:04 twocoolbeans

Just received a disk space alert on a server acting as the entry point to a larger network.

Image
OS: linux/amd64
Daemon version: 0.60.9
CLI version: 0.60.9

-rw-r--r-- 1 root root 3.8G Dec 24 20:59 netbird.out

Sample data:

time=2025-12-24T20:59:37.677Z level=ERROR msg="Failed to send" pid="YWAVst99WjEU3qkgbybCYtIi2vGo5/r5fbs3jKWn320=" error="write udp [::]:48899->100.81.176.24:59919: use of closed network connection"
time=2025-12-24T20:59:37.813Z level=ERROR msg="Failed to send" pid="I3Jd2nQv4C5J5sZASYeltytGL7BfbwXV6DHJcHhIqDc=" error="write udp [::]:48899->100.81.176.24:52170: use of closed network connection"
time=2025-12-24T20:59:37.981Z level=ERROR msg="Failed to send" pid="I3Jd2nQv4C5J5sZASYeltytGL7BfbwXV6DHJcHhIqDc=" error="write udp [::]:48899->100.81.176.24:52170: use of closed network connection"
time=2025-12-24T20:59:38.062Z level=ERROR msg="Failed to send" pid="YWAVst99WjEU3qkgbybCYtIi2vGo5/r5fbs3jKWn320=" error="write udp [::]:48899->100.81.176.24:59919: use of closed network connection"
time=2025-12-24T20:59:38.150Z level=ERROR msg="Failed to send" pid="5m//W2/n0HRMC+AoBI3NkveqwNDQ3DurRldNRZ8ynzM=" error="write udp [::]:48899->100.81.33.191:49350: use of closed network connection"
time=2025-12-24T20:59:38.332Z level=ERROR msg="Failed to send" pid="apdGYhL0hDBkE/Lh/jFCfUGHXD9/R5MwZtWC7K2Ywps=" error="write udp [::]:48899->100.81.223.169:39610: use of closed network connection"

netbird.out log [...] rarely contains any content at all.

If I tail -f netbird.out, it is very chatty

https://github.com/user-attachments/assets/21373b0e-a83d-4a6f-b413-ed77f5572d6f

sahps avatar Dec 24 '25 21:12 sahps

I have also seen this file grow unexpectedly. In my case, it was 23GB in size.

I would like to mention that I tried changing the log level and it did not prevent DEBUG messages in netbird.out, but it did work on client.log

I changed the log level like this:

echo 'NB_LOG_LEVEL=warn' | sudo tee -a /etc/sysconfig/netbird
sudo systemctl restart netbird

I am also in Debian 12, NetBird version 0.62.3.

zerowhybr avatar Jan 14 '26 22:01 zerowhybr