🐛 BUG: Logging is very heavy if certificate has expired.
What version of nebula are you using? (nebula -version)
1.8.2
What operating system are you using?
Linux
Describe the Bug
In this example, the CA had expired before I deployed everything.
With Nebula 1.8.2, if the CA is expired, the log output is enormous when written to a file: over 100MB in just an hour of this kind of output. Even running cat in a terminal to view a log like this produces an unwieldy amount of data. The rate of output should be reduced.
Logs from affected hosts
time="2024-03-14T03:34:51Z" level=info msg="Invalid certificate from host" cert="NebulaCertificate {\n\tDetails {\n\t\tName: centosclient\n\t\tIps: [\n\t\t\t192.168.100.4/24\n\t\t]\n\t\tSubnets: []\n\t\tGroups: [\n\t\t\t\"servers\"\n\t\t]\n\t\tNot before: 2024-03-12 21:00:16 +0000 UTC\n\t\tNot After: 2024-03-13 21:00:16 +0000 UTC\n\t\tIs CA: false\n\t\tIssuer: e19330d7f9fe73a05132490eaf0debacbbcf068b42cf12e27ebc63756da40172\n\t\tPublic key: 838ee739b1e32a75ac233b891f36a0974896b78f7ccd776fee148421bfee7b53\n\t\tCurve: CURVE25519\n\t}\n\tFingerprint: bf74b3578cbb5d0ad9c4ef2281861f92f3d6ffd7b50c2776e7f277c210bd27d1\n\tSignature: a9495dc92ef541f83f2603ed12aa62938c6b42ea1f64ac9f22acb831cd2e2dae8ee06c958ca2dc7fb4c7d2401bf369b3c080522709bd983af71e31d5ef1ce209\n}" error="certificate validation failed: certificate is expired" handshake="map[stage:1 style:ix_psk0]" udpAddr="52.65.41.129:11308"
time="2024-03-14T03:34:52Z" level=info msg="Invalid certificate from host" cert="NebulaCertificate {\n\tDetails {\n\t\tName: centosclient\n\t\tIps: [\n\t\t\t192.168.100.4/24\n\t\t]\n\t\tSubnets: []\n\t\tGroups: [\n\t\t\t\"servers\"\n\t\t]\n\t\tNot before: 2024-03-12 21:00:16 +0000 UTC\n\t\tNot After: 2024-03-13 21:00:16 +0000 UTC\n\t\tIs CA: false\n\t\tIssuer: e19330d7f9fe73a05132490eaf0debacbbcf068b42cf12e27ebc63756da40172\n\t\tPublic key: 838ee739b1e32a75ac233b891f36a0974896b78f7ccd776fee148421bfee7b53\n\t\tCurve: CURVE25519\n\t}\n\tFingerprint: bf74b3578cbb5d0ad9c4ef2281861f92f3d6ffd7b50c2776e7f277c210bd27d1\n\tSignature: a9495dc92ef541f83f2603ed12aa62938c6b42ea1f64ac9f22acb831cd2e2dae8ee06c958ca2dc7fb4c7d2401bf369b3c080522709bd983af71e31d5ef1ce209\n}" error="certificate validation failed: certificate is expired" handshake="map[stage:1 style:ix_psk0]" udpAddr="52.65.41.129:11308"
time="2024-03-14T03:34:52Z" level=info msg="Invalid certificate from host" cert="NebulaCertificate {\n\tDetails {\n\t\tName: centosclient\n\t\tIps: [\n\t\t\t192.168.100.4/24\n\t\t]\n\t\tSubnets: []\n\t\tGroups: [\n\t\t\t\"servers\"\n\t\t]\n\t\tNot before: 2024-03-12 21:00:16 +0000 UTC\n\t\tNot After: 2024-03-13 21:00:16 +0000 UTC\n\t\tIs CA: false\n\t\tIssuer: e19330d7f9fe73a05132490eaf0debacbbcf068b42cf12e27ebc63756da40172\n\t\tPublic key: 838ee739b1e32a75ac233b891f36a0974896b78f7ccd776fee148421bfee7b53\n\t\tCurve: CURVE25519\n\t}\n\tFingerprint: bf74b3578cbb5d0ad9c4ef2281861f92f3d6ffd7b50c2776e7f277c210bd27d1\n\tSignature: a9495dc92ef541f83f2603ed12aa62938c6b42ea1f64ac9f22acb831cd2e2dae8ee06c958ca2dc7fb4c7d2401bf369b3c080522709bd983af71e31d5ef1ce209\n}" error="certificate validation failed: certificate is expired" handshake="map[stage:1 style:ix_psk0]" udpAddr="52.65.41.129:11308"
time="2024-03-14T03:34:53Z" level=info msg="Invalid certificate from host" cert="NebulaCertificate {\n\tDetails {\n\t\tName: centosclient\n\t\tIps: [\n\t\t\t192.168.100.4/24\n\t\t]\n\t\tSubnets: []\n\t\tGroups: [\n\t\t\t\"servers\"\n\t\t]\n\t\tNot before: 2024-03-12 21:00:16 +0000 UTC\n\t\tNot After: 2024-03-13 21:00:16 +0000 UTC\n\t\tIs CA: false\n\t\tIssuer: e19330d7f9fe73a05132490eaf0debacbbcf068b42cf12e27ebc63756da40172\n\t\tPublic key: 838ee739b1e32a75ac233b891f36a0974896b78f7ccd776fee148421bfee7b53\n\t\tCurve: CURVE25519\n\t}\n\tFingerprint: bf74b3578cbb5d0ad9c4ef2281861f92f3d6ffd7b50c2776e7f277c210bd27d1\n\tSignature: a9495dc92ef541f83f2603ed12aa62938c6b42ea1f64ac9f22acb831cd2e2dae8ee06c958ca2dc7fb4c7d2401bf369b3c080522709bd983af71e31d5ef1ce209\n}" error="certificate validation failed: certificate is expired" handshake="map[stage:1 style:ix_psk0]" udpAddr="52.65.41.129:11308"
time="2024-03-14T03:35:17Z" level=info msg="Invalid certificate from host" cert="NebulaCertificate {\n\tDetails {\n\t\tName: amazonlinuxclient\n\t\tIps: [\n\t\t\t192.168.100.3/24\n\t\t]\n\t\tSubnets: []\n\t\tGroups: [\n\t\t\t\"servers\"\n\t\t]\n\t\tNot before: 2024-03-12 21:00:11 +0000 UTC\n\t\tNot After: 2024-03-13 21:00:11 +0000 UTC\n\t\tIs CA: false\n\t\tIssuer: e19330d7f9fe73a05132490eaf0debacbbcf068b42cf12e27ebc63756da40172\n\t\tPublic key: e294d9e5e261a3a12f11cbb398fe4469c12698f7770df9d164d22f2fccc36326\n\t\tCurve: CURVE25519\n\t}\n\tFingerprint: 7e9dbed3ed342168a63cea22fe673709e404f243fb85e9f82de30d762c8b762a\n\tSignature: b1f0c71efd95fea64664e3f7be377fb435eba062f5a7adcf26f09ab8702c77e4bbab3e6fc79a4e56351adeaff77d2327484fcda32f596483b85f4547dc941a09\n}" error="certificate validation failed: certificate is expired" handshake="map[stage:1 style:ix_psk0]" udpAddr="52.65.41.129:60146"
Config files from affected hosts
This is somewhat exploitable in practice by someone acting with intent. Using nothing but the nebula client itself, a single client that has a random/unauthorized CA/cert pair and knows only the server's IP address and port can generate ~15MiB of logs in 10-12 seconds. That is roughly double the bandwidth being sent to the server. I haven't tried multiple clients and/or a specialized attack yet.
Removing the cert field would make a significant difference, though a more complete fix may be to also de-duplicate log entries (at least for invalid/unauthorized requests).
Suppress large numbers of duplicate log messages and replace them with periodic summaries. For example, syslog may include an entry that states "last message repeated X times" when recording repeated events.
That said, I think syslog and journald both have deduplication, but IIRC journald at least will only do that for single-line messages (in any case, it does not seem to dedupe these).