log15 icon indicating copy to clipboard operation
log15 copied to clipboard

Add loggly hook.

Open GeertJohan opened this issue 10 years ago • 27 comments
trafficstars

Fixes #42

GeertJohan avatar Jan 22 '15 09:01 GeertJohan

Sorry for the insanely slow turnaround here.

LGTM. Do we have thoughts on the following:

  1. Do we want to add a third-party dependency (or vendor it)?
  2. Could it go in log15/loggly instead of log15/ext/loggly?
  3. Should it be taking a token in the constructor instead of a *loggly.Client? The way it's currently wired up, it's not possible for a user to specify tags to the loggly.Client constructor.
  4. Should it flush after every message or if it should be configured to flush on an configured interval?

inconshreveable avatar Jan 29 '15 21:01 inconshreveable

  1. Good point. How do you propose we vendor it?
  2. I think it could, but do you really want that? Eventually there might be a lot of external logging hooks... (I would be glad to see as many as possible)
  3. I chose this method because a) this reflects the loggly.New(token) constructor and b) the Client is an exported field on LogglyHook anyway, so tags can still be changed after using NewLogglyHook(token), as you would do after using loggly.New(token). But this simple/short constructor gives you the convenience to build a handler tree really fast.
  4. I've been thinking about this too. I notice latency in our applications because of the synchronous HTTP request to loggly. If you're okay with it I would like to send an extra PR later that adds async and bulk logging (both configurable, with async enabled by default)

GeertJohan avatar Jan 30 '15 11:01 GeertJohan

Any update on this? Would like to get it merged :)

GeertJohan avatar Mar 02 '15 07:03 GeertJohan

  1. If we vendor go-loggly, are we prepared to maintain it?
  2. I vote for keeping this in log15/ext/loggly.
  3. I prefer the constructor that takes a Client rather than a token.
  4. See my inline comments above.

ChrisHines avatar Mar 02 '15 16:03 ChrisHines

  1. See below.
  2. I agree, I think it's cleaner to gather logging hooks to external systems in a subdirectory so it's not polluting the project root.
  3. Still think a simple constructor taking a token is better, it also avoids having to import segmentio/go-loggly in the main project. Also: see below.

One of the key features of log15 is the way Handlers connect and create a handling tree. I think properties such as async, buffering (with max buffersize and flush interval) should be handled by a Handler, and the hook to a logging service shouldn't have to implement buffering itself and should just send logs to the logging service.

Now ofcourse the whole purpose of the buffering in go-loggly is so that can send multiple log entries in a single call to loggly. And this is actually very important, especially when working with huge amounts of logs. But buffering and bulk writes are also important in other handlers. For instance, a colleague of mine found that some of our application spend a huge amount of time in the kernel due to writes to files and os.Stdout. So a generic approach to buffering and handling of bulk log records seems to be an interesting topic. We brainstormed a bit about it this midday and came up with the following idea:

Add an extra interface type:

type BulkHandler interface{
    LogBulk([]*Record) error
}

Some Handlers such as the LogglyHandler and the FileHandler could implement this method, taking a slice of logs and sending/writing them all at once, or at least as much as possible/efficient per call/write. Loggly's bulk endpoint states "You can send a maximum of 5MB per batch and up to 1MB per event", so the LogglyHandler will need to take care of that.

Then a new Handler would be added:

type bufferedBulkHandler struct {
    buffer []*Record
    handler Handler
    bufferSize int // buffer is this size large, and will be flushed when it's full
    flushInterval time.Duration
    forceFlushLevel Level // buffer is flushed directly (while blocking the logging call) after a message with this level (or higher) is added.
}

func BufferedBulkHander(size int, interval time.Duration, forceFlushLevel Level, handler Handler) Handler

The ForceFlushLevel is actually quite important, there's a good chance a program is crashing when a log with LvlCrit is added. And you wouldn't want that message to be stuck in the buffer when the program crashes. Normal flushing takes place asynchronously, not directly blocking a Log call.

Maybe in this case (because of the number of options) the BufferedBulkHandler should be an exported type with exported fields, and not a function.

What I propose is:

  • Pull in the go-loggly Client and merge it with the LogglyHandler but drop all buffering and async features.
  • Define the BulkHandler interface
  • Add the BufferedBulkHandler
  • Implement BulkHandler for the FileHandler and LogglyHandler, and possibly more.

What I'm not sure about is if BufferedBulkHandler should accept a Handler as argument, and check with type assertion if the Handler also implements BulkHandler. Or maybe it should just accept BulkHandler's.

GeertJohan avatar Mar 02 '15 22:03 GeertJohan

Thanks for the detailed thoughts, and I like where this is headed! I like how you are breaking this down into nicely composable pieces. I think you have some great ideas here that could add some nice capabilities to log15 as a whole. I would be happy to work with you to polish your ideas and get them into log15. How about we break this into multiple issues. This PR can be for implementing the simple loggly handler. Then let's create a new issue for the bulk logging idea.

ChrisHines avatar Mar 03 '15 01:03 ChrisHines

Great! I'll update this PR with the loggly http API asap, probably this evening (CET).

GeertJohan avatar Mar 03 '15 10:03 GeertJohan

Some delay: I want to know if the log level (debug, info, crit) should be in a specific format so it is recognized by loggly. Because segmentio/go-loggy uses a different list of level strings than log15. I'm contacting loggly about this.

GeertJohan avatar Mar 04 '15 11:03 GeertJohan

Nevermind, it looks like levels are not defined by loggly.

GeertJohan avatar Mar 04 '15 11:03 GeertJohan

@ChrisHines I just updated the PR. I rewrote the complete code, there's no external dependency anymore. There's no buffering anymore, the logs are being sent to Loggly immediately. Please let me know what you think.

GeertJohan avatar Mar 04 '15 14:03 GeertJohan

Ah well, there is one external dependency: github.com/pquerna/ffjson/fflib/v1. But I trust the maintainers of that project to not make any breaking changes. They even versioned this dependency and breaking changes are made in v2.

GeertJohan avatar Mar 04 '15 14:03 GeertJohan

The bulk logging stuff is interesting, I'll comment on that in the separate issue you opened for it. I'm on board with log15/ext/loggly and the dependency on ffjson. Also, all of the code looks good. Thanks for this!

I'm concerned about the exported symbols TokenPlaceholder, EndpointSingle and EndpointBulk (especially because endpointbulk is unused at this point).

I'd propose instead that Endpoint be a public exported field of the LogglyHandler object. If the client wants to send logs to custom URIs, they can do it on a per-handler basis by setting Endpoint. This would make NewLogglyHandler a convenience constructor when you don't need any special behaviors.

inconshreveable avatar Mar 05 '15 09:03 inconshreveable

@inconshreveable You're right, I'm updating the code right now.

GeertJohan avatar Mar 05 '15 13:03 GeertJohan

There, removed the const and var's and have Endpoint as exported field.

GeertJohan avatar Mar 05 '15 13:03 GeertJohan

LGTM. Last thing we need is just a test. We could set up an HTTP server in the test and then set up a loggly handler with a custom URL pointing at the local server and validate that we get properly formatted loggly data.

inconshreveable avatar Mar 05 '15 14:03 inconshreveable

From Loggly's website:

Any sort of log data can be sent to Loggly, it just needs to be text based. Unstructured or structured logs from any application or device can be forwarded.

Which raises the question: Should the LogglyHandler accept a log15.Format rather than only using JSON?

ChrisHines avatar Mar 05 '15 15:03 ChrisHines

Loggly strongly recommends sending JSON for this type of thing. They do indeed accept all kind of logs so that you can pipe apache, nginx, syslog, etc. etc. directly to their API. They then analyse the log and parse it as best as they can to make it search-able, so you can put filters on fields and their values. But since we already have structured data in the form of log15.Ctx, it's best to send it as JSON so we're sure the structure is copied into the loggly the same way. It wouldn't make sense to use any other formatter and then just hope loggly manages to parse the text to it's original structure.

GeertJohan avatar Mar 05 '15 15:03 GeertJohan

I've pushed a work in progress test, but it's not working.. Looks like a problem with the channel (might be a Go issue on tip). To be continued later.

GeertJohan avatar Mar 05 '15 15:03 GeertJohan

I understand, but as library authors I think we should give the application developer that choice. Plus, we already have log15.JsonFormat. So I believe we should support any log15.Format and document that JsonFormat is the preferred format for use with Loggly. Supporting a Format makes LogglyHandler parallel to other sink handlers (FileHandler, StreamHandler, NetHandler), and also gives LogglyHandler users a hook to taylor the format if they have special requirements.

ChrisHines avatar Mar 05 '15 16:03 ChrisHines

It really makes no sense to send anything other than JSON to Loggly, unless you don't have JSON. For the File, Stream and Net handlers this does make sense, because in some cases you want human-readable data, so thats why you use a Formatter. But in this case Loggly already parses the data and presents it through their webinterface. JSON is a very reliable format for data transfer, especially with the standardized encoding of bytes, times, decimals and strings (escaping), and there's no reason to use anything else.

I do like the idea of allowing the user to modify the data sent, but I think the use of that feature will be quite low. And using Formatter has a huge disadvantage: it causes allocations. Especially with the upcoming bulk handling, I think it's very important to make sure we don't allocate a lot of bytes, and with the Formatter you'd have to allocate the bytes that are returned by Format, and then copy those to the buffer. To allow users to modify the request before marshaling it onto the buffer we could add a field PreMarhshalHook func(r *log15.Record). Then we wouldn't have an allocation and the json.Encoder can write directly to the pre-allocated buffer.

GeertJohan avatar Mar 06 '15 10:03 GeertJohan

If someone does really require to set a custom Formatter for some reason, then they can either fork the LogglyHandler and add it with a few lines of code. And when there is a greater demand (which I would be willing to bet more than a few beers on that it won't happen), a Formatter field can always be added to the LogglyHandler later :+1:

What do you think about PreMarshalHook func(r *log15.Record) ?

GeertJohan avatar Mar 06 '15 11:03 GeertJohan

Agreed, I see no reason to send anything over than JSON. The format is irrelevant in this case because we're sending to a machine. The goal is no loss of fidelity, and JSON does that well, and is recommended by loggly. Even if we're wrong on that, it covers the 90% case and we'll learn from someone if we're wrong.

I'm ready to merge when we have a passing test.

inconshreveable avatar Mar 06 '15 13:03 inconshreveable

I concede the point on making JSON the default format, but I'm still not convinced that the formatting should be hard coded into the handler. I will make my case below, and then I am fine with the majority opinion.

Suppose someone constructs the following handler tree.

h := FailoverHandler(
    LogglyHandler(),
    FileHandler("path", JsonFormat()),
)

The intention is to save logs to disk if Loggly is unreachable and batch upload the files later when Loggly is again reachable. The problem is that the logs on disk are in a different flavor of JSON than logs sent on the happy path.

So I would like to see one of two outcomes before we merge:

  1. LogglyHandler uses JsonFormat internally.
  2. A new LogglyFormat is factored out to allow for the above scenario to work as intended.

ChrisHines avatar Mar 06 '15 16:03 ChrisHines

Agreed. I'd prefer to use JsonFormat internally, if possible. Are we using the go generated JSON encoding just for speed, or is there another reason? Otherwise I'm fine with exposing a loggly.JSONFormat

inconshreveable avatar Mar 06 '15 16:03 inconshreveable

@ChrisHines Good point, I can definitely see that as a use-case, especially in larger cluster systems that should be able to handle connectivity downtime without losing information.

@inconshreveable I would prefer a LogglyFormat because I think the way I structured the JSON makes more sense for use with Loggly. An extra bonus is that the internal loggly formatter can work with an io.Writer under the hood so that it can directly write to a buffer when that is implemented in the LogglyHandler. The LogglyFormat will wrap the internal formatter with a bytes.Buffer and return the byte slice.

Maybe having a FormattingWriter is something to think about.

type FormattingWriter interface {
    WriteFormatted(w io.Writer, r *Record)
}

TerminalFormat and LogfmtFormat already use a bytes.Buffer{} internally so handing it a different io.Writer should work. JsonFormat now uses json.Marshal which in turn wraps a json.Encoder + bytes.Buffer{}. So in the end, every formatter could actually work with an io.Writer without much change, and that makes a lot of sense with the buffering in #49. Anyway, this should probably be a new issue.

GeertJohan avatar Mar 06 '15 17:03 GeertJohan

I'm eagerly awaiting this merge. Any movement on this?

phylake avatar Jun 07 '15 15:06 phylake

This is what we were waiting on:

"I've pushed a work in progress test, but it's not working.. Looks like a problem with the channel (might be a Go issue on tip). To be continued later."

If you'd like to write a test that accurately tests the functionality and works I'm happy to merge this in.

inconshreveable avatar Jun 07 '15 18:06 inconshreveable