Send debug logs to Humio for all builds

Open ehmicky opened this issue 4 years ago • 9 comments

@netlify/build and @netlify/config print debug logs when the NETLIFY_BUILD_DEBUG environment variable is set.

This is very useful, but it requires manual action from users, which adds back-and-forth with support. Those debug logs should always be available instead, while remaining hidden from users by default.

I would suggest implementing this by sending those logs to Humio (through Firebase) since this is the current solution for the same problem in the buildbot. We would still print those logs to stdout but only when NETLIFY_BUILD_DEBUG is set, since reading logs in the UI is much easier for debugging than in Humio.

We should be careful about:

  • Performance. This would add many HTTP requests to each build, so we might consider batching those requests.
  • Robustness. Builds should neither fail nor print any error message if one of those HTTP requests fails.
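
A minimal sketch of how the two concerns above could be combined: log lines are buffered and shipped in batches, and any transport failure is swallowed so the build never sees it. The `BatchedDebugLogger` class, the `SendBatch` transport type and the default batch size are all hypothetical names and values chosen for illustration; nothing here reflects an existing @netlify/build API.

```typescript
// Hypothetical transport: any function that ships a batch of log lines
// (for example, one HTTP request per batch).
type SendBatch = (lines: string[]) => Promise<void>;

class BatchedDebugLogger {
  private buffer: string[] = [];
  private readonly send: SendBatch;
  private readonly maxBatchSize: number;

  constructor(send: SendBatch, maxBatchSize: number = 50) {
    this.send = send;
    this.maxBatchSize = maxBatchSize;
  }

  // Queue one line and flush once the batch is full.
  log(line: string): void {
    this.buffer.push(line);
    if (this.buffer.length >= this.maxBatchSize) {
      this.flush();
    }
  }

  // Ship whatever is buffered. Never throws and never leaves a rejected
  // promise unhandled, so a failed request cannot break or pollute a build.
  flush(): void {
    if (this.buffer.length === 0) {
      return;
    }
    const batch = this.buffer;
    this.buffer = [];
    try {
      this.send(batch).catch(() => {
        // Ignore asynchronous transport failures.
      });
    } catch {
      // Ignore synchronous transport failures.
    }
  }
}
```

A caller would invoke `flush()` once more at the end of the build so that a partially filled batch is not lost. Failed batches are dropped rather than retried, which favors build robustness over log completeness.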

ehmicky, Jul 06 '21 15:07

After some discussion with @JGAntunes, we have been considering the following implementation:

  • Add a --debugOutput CLI flag to @netlify/build and @netlify/config
  • That flag takes either a file path or a file descriptor as value.
  • That flag defaults to 1 (stdout).
  • The buildbot (unlike Netlify CLI) overrides this default value using --defaultFile=3 instead, with a new file descriptor redirected to the right place so we eventually see logs in Humio.
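
The flag semantics listed above could be sketched roughly as follows. `parseDebugOutput` and the `DebugOutput` type are hypothetical names used only for illustration; the assumption that an all-digits value means a file descriptor, and anything else a path, is mine and is not stated in the thread.

```typescript
// Result of interpreting the flag value: either an already-open file
// descriptor or a file path that still has to be opened.
type DebugOutput =
  | { kind: "fd"; fd: number }
  | { kind: "path"; path: string };

const DEFAULT_FD = 1; // stdout, the default described in the proposal

// Hypothetical helper: interpret the raw value of a --debugOutput flag.
function parseDebugOutput(value?: string): DebugOutput {
  if (value === undefined || value === "") {
    return { kind: "fd", fd: DEFAULT_FD };
  }
  // Assumption: a value made only of digits is a file descriptor.
  if (/^\d+$/.test(value)) {
    return { kind: "fd", fd: Number(value) };
  }
  return { kind: "path", path: value };
}
```

One consequence of this interpretation is that a file literally named `3` could not be targeted without a path prefix such as `./3`.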

ehmicky, Jul 07 '21 13:07

you really want to produce debug logs for every build? wouldn't it be nicer to control this via a feature flag to keep log volume down?

mraerino, Jun 13 '22 13:06

@mraerino what if we log the debug logs directly to Humio without logging them to stdout? Then we could do it for every build.

I know this is probably not the best pattern but with ocean we are doing this as we need to ingest structured data.

lukasholzer, Jun 13 '22 13:06

i don't have a preference for how to do it. i'm just saying that producing very verbose logs for every build increases the log volume unnecessarily. while volume is not as big a deal on humio as it is on firebase, we are still constrained on it and will lose days of retention if we produce significantly more logs.

it may be different for ocean since you're using this for data analysis, not debugging. for the debug case that this issue is talking about it should be fine to enable a flag for some customer where you want more insight into their build.

mraerino, Jun 13 '22 13:06

> you really want to produce debug logs for every build? wouldn't it be nicer to control this via a feature flag to keep log volume down?

This will mean another hop where we need to enable the feature flag for the customer and ask them to run another deploy (which they may be billed for) before we can find out any potential issues. If we're able to always stream the logs while keeping verbosity under control, I think we'd be able to provide a better support experience.

eduardoboucas, Jun 13 '22 16:06

> what if we log the debug logs directly to Humio without logging them to stdout? Then we could do it for every build.

IMO we should take advantage of what we already have in place within our build clusters to forward logs to Humio (we can ingest structured logs this way too). Coupling build with Humio directly feels like a stretch, and I believe it is a pattern we should avoid in the long run.

JGAntunes, Jun 14 '22 11:06

logging to the stdout of the container so the infra level log shipper can deliver to humio sounds like the right call here.

mraerino, Jun 14 '22 13:06

Coming back to this because we desperately need a way to see debugging information when troubleshooting a build problem. Having to manually set a feature flag for a site and wait for the customer to build again is not an option I'm happy with, because we're often troubleshooting intermittent errors and this back-and-forth makes the debugging process ineffective. The same goes for the option of asking customers to set the NETLIFY_BUILD_DEBUG environment variable.

Looking through the thread makes me think that writing to Humio separately is not a desirable option, so I'm wondering if a compromise solution would be to start prefixing any internal log messages with a given character sequence that is understood by the Netlify UI. When it finds a log line starting with that sequence and NETLIFY_BUILD_DEBUG is disabled, it hides the log line from the user.

Because messages are still logged to stdout, they'll be available to us in Humio without any additional work.
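
To make the idea concrete, here is a rough sketch of the filtering the UI would need to do. The `INTERNAL_PREFIX` value and the `linesForUser` function are hypothetical; the actual character sequence is not specified in this thread.

```typescript
// Hypothetical marker for internal log lines. The real sequence would have
// to be agreed between Netlify Build and the Netlify UI.
const INTERNAL_PREFIX = "[internal] ";

// Return the log lines a user should see. Internal lines are hidden unless
// debug mode (NETLIFY_BUILD_DEBUG) is enabled for the build.
function linesForUser(lines: string[], debugEnabled: boolean): string[] {
  if (debugEnabled) {
    return lines;
  }
  return lines.filter((line) => !line.startsWith(INTERNAL_PREFIX));
}
```

The raw stdout stream is left untouched, so the log shipper would still deliver every line, including the prefixed ones.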

This is the best option I could think of in terms of effort/reward, but if anyone has any other ideas, I'd love to explore them!

eduardoboucas, Aug 02 '22 09:08

After discussing with @JGAntunes, we settled on a different approach, similar to what was proposed here:

  • buildbot will expose an additional file descriptor, piped to the system logs, to the build command
  • The build command will send that file descriptor number to Netlify Build via a new --system-log-file flag
  • Netlify Build will contain a new systemLog() utility method, which will write to the file descriptor referenced by the --system-log-file flag
  • Rather than automatically routing all debug logs to this mechanism, we'll use the new systemLog() method to gradually opt in any debug messages that we want to capture for every build
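
As an illustration of the steps above, a systemLog() utility could look roughly like the sketch below. The `createSystemLog` factory, the `Write` type and the message formatting are assumptions made for this example and are not the actual Netlify Build implementation; only the `--system-log-file` flag and the systemLog() name come from the plan itself.

```typescript
import { writeSync } from "node:fs";

// Low-level writer, injectable so the sketch can be exercised without a
// real file descriptor.
type Write = (fd: number, data: string) => unknown;

const defaultWrite: Write = (fd, data) => writeSync(fd, data);

// Hypothetical factory: `systemLogFile` is the descriptor number received
// through the --system-log-file flag, or undefined when the flag is absent.
function createSystemLog(
  systemLogFile: number | undefined,
  write: Write = defaultWrite,
): (...args: unknown[]) => void {
  // Without a descriptor (for example outside the buildbot), do nothing.
  if (systemLogFile === undefined) {
    return () => {};
  }
  return (...args: unknown[]) => {
    try {
      const line = args
        .map((arg) => (typeof arg === "string" ? arg : JSON.stringify(arg)))
        .join(" ");
      write(systemLogFile, `${line}\n`);
    } catch {
      // A logging failure must never fail the build or reach the user.
    }
  };
}
```

Debug messages would then be opted in one at a time by calling the returned function, for example `systemLog("Loaded plugin", { name })`. For local experiments, a shell redirection such as `3>>system.log` on the command that starts the build is one way to provide an extra descriptor.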

eduardoboucas, Aug 03 '22 10:08

I'm very happy to close this! 🎉

eduardoboucas, Aug 12 '22 16:08

YESSS!!!

minivan, Aug 12 '22 16:08