vector icon indicating copy to clipboard operation
vector copied to clipboard

feat(codecs): new syslog codec

Open syedriko opened this issue 1 year ago • 15 comments

feat(syslog codec) POC of a syslog codec This is a naive syslog codec implementation, geared towards the needs of the OpenShift logging stack

syedriko avatar Jun 12 '23 19:06 syedriko

Deploy Preview for vrl-playground ready!

Name Link
Latest commit 38dfb93d7d4b414ee96c34b651f6fe68b626a562
Latest deploy log https://app.netlify.com/sites/vrl-playground/deploys/64876b4ed83dc30009ce6c26
Deploy Preview https://deploy-preview-17668--vrl-playground.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

netlify[bot] avatar Jun 12 '23 19:06 netlify[bot]

Deploy Preview for vector-project ready!

Name Link
Latest commit 38dfb93d7d4b414ee96c34b651f6fe68b626a562
Latest deploy log https://app.netlify.com/sites/vector-project/deploys/64876b4e4231e70008b30526
Deploy Preview https://deploy-preview-17668--vector-project.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

netlify[bot] avatar Jun 12 '23 19:06 netlify[bot]

@polarathene Thanks for a detailed review, I've learned a few things. There's some history of this PR here: https://github.com/vectordotdev/vector/issues/6863. This code was supposed to mimic the fluentd syslog plugin, with some extensions. The vector configuration is not directly exposed to the user but auto-generated by a k8s operator. (thus RFC-specific fields in configuration, like Tag). The idea behind

enum Severity {
    /// Syslog severity ordinal number
    Fixed(u8),

    /// Syslog severity name
    Field(String)
}

(if we take severity as an example) is to be able to set the severity value in configuration as a number, within the correct limits, or a string, which in turn can be the name of a severity level, like "ALERT" or a log event field reference, if it is of the $.message.take_sev_from_this_field form. The $.message syntax is from the fluentd implementation. There is a custom extension around the "add_log_source" logic that is k8s-specific. I'm hoping to start with this code and arrive at a generic implementation that is good for the Vector upstream but can be customized/specialized to do what this code does now.

syedriko avatar Nov 14 '23 01:11 syedriko

@syedriko Any progress on this?

gaby avatar Dec 05 '23 04:12 gaby

Sorry, I didn't get around to putting this response through earlier 😓

This was a good chunk of it below, with a little bit more and some pseudo-code that I've excluded.

I have refactored this PR with a local build based on my original feedback. It'd need a little bit more work but if you'd like I could either open a new PR with that or do a 2nd review with the change suggestions in bulk?

I have kept the third-party additions, so it retains that functionality, although I don't think it should have either the add_log_source() or $.message. prefix logic since that seems more relevant to something like VRL with remap handling? (the k8s log source could also be adapted to RFC5242 structured data)


@polarathene Thanks for a detailed review, I've learned a few things.

You're welcome! Apologies for the verbosity 😅

  • The top-level review comment should convey anything relevant to address (with links to inline review comments providing more context).
  • I had to read up on the syslog RFCs to get a better understanding of the PR, I've referenced those documents a fair bit with links (might benefit some other reviewers 🤷‍♂️ ).
  • Unfortunately I lack experience with Vector and the codebase, so I can't provide much input on those specifics.

I'm hoping to start with this code and arrive at a generic implementation that is good for the Vector upstream but can be customized/specialized to do what this code does now.

Sounds great! ❤️

I'm lacking time to build it locally and configure for a project that I'm interested in bringing Vector into. Hopefully I can allocate some time this month or next to give it a try and ensure the syslog codec support integrates well 👍


Additional feedback

(if we take severity as an example) is to be able to set the severity value in configuration as:

  • a number, within the correct limits,
  • or a string, which in turn can be the name of a severity level, like "ALERT"
  • or a log event field reference, if it is of the $.message.take_sev_from_this_field form. The $.message syntax is from the fluentd implementation.
  • There is a custom extension around the add_log_source logic that is k8s-specific.

That functionality is retained (and simpler?) with the enum approach suggested:

// Must be within the ordinal bounds, otherwise `None` => `Err(...)`:
let s = Severity::from_repr( num ).ok_or_else(|| /* Err */ )

// String must match a valid enum variant, otherwise `Err(...)`:
let s = Severity::from_str( str ).map_err( /* Err */ )

// Regardless of how you created `s`, easily get the desired output:
println!("{}", s as u8);
// strum implements Display (so you could also use `.to_string()`):
println!("{s}");

Making the custom deserialize methods quite a bit simpler to grok, and the get_num_*() method for each redundant? Composing the PRIVAL is still simple:

// NOTE: This would be a bit nicer out of `encode()`, instead via a struct method?
let prival = (facility as u8 * 8) + severity as u8;
let pri = ["<", &prival.to_string(), ">"].concat();

I'm unable to provide much feedback for the latter two points.

  • Are they not compatible with the suggested enum approach?
  • The variant from custom key lookup would be a separate responsibility?

polarathene avatar Dec 05 '23 06:12 polarathene

Hi. Thanks for doing this, this could be very useful to incorporate.

Is there a compelling reason to support RFC3164? Given how loosely specified 3164 is, I feel this would just add complexity that we don't really need. If we can I would prefer to drop this.

It would be really useful to be able to create structured data. This could be done by adding a map of template fields to the configuration. Something similar to the way the loki sink handles labels.

The ID field could just be a string field in the config. So the config for this would look something like:

structured_data:
  id: data@12312
  fields:
    thing: "{{ thing }}"
    thang: "{{ thang }}"

The spec allows for multiple structured data sections - but I think we should be able to just allow a single section to be specified, at least for the initial version.

Would it be possible to add this?

StephenWakely avatar Dec 06 '23 10:12 StephenWakely

Is there a compelling reason to support RFC3164? Given how loosely specified 3164 is, I feel this would just add complexity that we don't really need. If we can I would prefer to drop this.

Some software still seems to emit logs in that format, so it's preferred if Vector is to ingest the logs but the user still wants to maintain compatibility with the current RFC 3164 logs.

Which happens to be my case at a glance. I want to bring Vector into a project with a wide user base, but I don't want that to be too disruptive if possible. Supporting RFC 3164 doesn't seem like it requires that much more logic 🤷‍♂️

It would be really useful to be able to create structured data.

My refactor has this functionality implemented.

Would it be possible to add this?

My approach just relied on remap to provide a structured_data object. I'm not familiar with Vector config implementations, so I'm not sure when it's appropriate to have config fields for static / default data inputs vs config fields that expect a key to reference data from the log event.

polarathene avatar Dec 06 '23 20:12 polarathene

Is there a compelling reason to support RFC3164? Given how loosely specified 3164 is, I feel this would just add complexity that we don't really need. If we can I would prefer to drop this.

Some software still seems to emit logs in that format, so it's preferred if Vector is to ingest the logs but the user still wants to maintain compatibility with the current RFC 3164 logs.

Which happens to be my case at a glance. I want to bring Vector into a project with a wide user base, but I don't want that to be too disruptive if possible. Supporting RFC 3164 doesn't seem like it requires that much more logic 🤷‍♂️

It's more UX complexity I am concerned with for 3164.

It would be really useful to be able to create structured data.

My refactor has this functionality implemented.

Would it be possible to add this?

My approach just relied on remap to provide a structured_data object. I'm not familiar with Vector config implementations, so I'm not sure when it's appropriate to have config fields for static / default data inputs vs config fields that expect a key to reference data from the log event.

I'm not sure I fully understand. What is your refactor?

StephenWakely avatar Dec 07 '23 09:12 StephenWakely

It's more UX complexity I am concerned with for 3164.

Could you elaborate?

On the user end with config it defaults to RFC 5424, but you can explicitly add the rfc key to request RFC 3164 instead. What UX concerns do you have?

I'm not sure I fully understand. What is your refactor?

I've forked this PR and rewritten the logic based on my review feedback.

It is simpler and easier to review IMO, I've implemented Structured Data support for both RFC 5424 and 3164 (as RFC 5424 advises by prepending it to a message when not NIL).

I'll probably raise a separate PR in the weekend since there is not much activity here since my review.

polarathene avatar Dec 07 '23 20:12 polarathene

It's more UX complexity I am concerned with for 3164.

Could you elaborate?

According to the RFC, 5424 has made 3164 obsolete.

3164 is very loosely specified, so really if we say we are conforming to 3164 it just means 'our interpretation of a 3164 structure'. If we have an incoming 3164 message and we send out a 3164 message - there is no guarantee that they would be exactly the same text. Sending a 3164 message to a system that claims to parse 3164 doesn't provide any real guarantee that it will actually be able to parse our 3164 message.

Plus I don't like that the year gets dropped in the timestamp..

With a 5424 message it is much more precise and we can have more guarantees about the structure of the message.

I'm not saying absolutely we shouldn't support 3164, if, say, there was a well known system that was unable to parse 5124 messages then that would be reason enough for us to support it. But to my knowledge there isn't such a system, but I am all ears if you know of any?

StephenWakely avatar Dec 08 '23 14:12 StephenWakely

According to the RFC, 5424 has made 3164 obsolete.

Yes, and where possible it should be adopted. A project I work on that bundles popular software in Docker is still outputting logs with RFC 3164, and I plan to change that with Vector, but I initially want that RFC 3164 support, it is very minimal to support.

Here is the RFC 3164 specific portion:

SyslogRFC::Rfc3164 => {
    // TIMESTAMP field format:
    // https://datatracker.ietf.org/doc/html/rfc3164#section-4.1.2
    let timestamp = self.timestamp.format("%b %e %H:%M:%S").to_string();

    // MSG part begins with TAG field + optional context:
    // https://datatracker.ietf.org/doc/html/rfc3164#section-4.1.3
    let mut msg_start = self.tag.encode_rfc_3164();

    // When RFC 5424 "Structured Data" is available, it can be included in the `CONTENT` field:
    // https://datatracker.ietf.org/doc/html/rfc5424#appendix-A.1
    if let Some(sd) = structured_data.as_deref() {
        msg_start = msg_start + " " + sd
    }

    [
        timestamp.as_str(),
        hostname,
        &msg_start,
    ].join(" ")
}

// ...

struct Tag {
    app_name: String,
    proc_id: Option<String>,
    msg_id: Option<String>
}

// `.as_deref()` usage below avoids requiring `self.clone()`
impl Tag {
    // Roughly equivalent RFC 5424 fields can compose the start of
    // RFC 3164 MSG part (TAG + CONTENT fields):
    // https://datatracker.ietf.org/doc/html/rfc5424#appendix-A.1
    fn encode_rfc_3164(&self) -> String {
        let Self { app_name, proc_id, msg_id } = self;

        match proc_id.as_deref().or(msg_id.as_deref()) {
            Some(context) => [&app_name, "[", &context, "]:"].concat(),
            None => [&app_name, ":"].concat()
        }
    }
// ...
}

Roughly 4 lines without comments and formatting (EDIT: Forgot the Tag portion, added that not much more). Not a maintenance burden.


If we say we are conforming to 3164 it just means 'our interpretation of a 3164 structure'. If we have an incoming 3164 message and we send out a 3164 message - there is no guarantee that they would be exactly the same text.

Vector already has support for 3164 elsewhere, and outputs a specific format from the demo logs source for example. This aligns with the same 3164 logs I encounter with the popular software bundled via the project I maintain, so that's common enough for me :shrug:

Sure it might not meet the expectation of everyone else for 3164, but I don't see an issue with this as:

  • Conforms to 3164 output emitted from existing Vector support for 3164
  • Conforms to 3164 output with popular software that emits logs in that format
  • Conforms to the advice of outputting RFC 3164 via the context of RFC 5424
  • A PR author contributing the functionality has some influence earned from time spent making the contribution as a whole.

Plus I don't like that the year gets dropped in the timestamp..

Then don't use the format? That's specific to RFC 3164, it's expected.

that would be reason enough for us to support it. But to my knowledge there isn't such a system, but I am all ears if you know of any?

It's a low maintenance feature to support as shown above. If you get requests for other variations with enough demand (or contributions) you can discuss if supporting that is worthwhile.

It's a motivating factor for me to contribute an improved version of this PR with that functionality intact. I don't mind it being deprecated if it's necessary to drop it in future for some reason, but I'd like RFC 3164 so that I can use that when switching over to Vector, and transitioning away from it.

If there is a strong argument from maintainers of Vector against that, I won't have much say, but I'd appreciate it considering how low maintenance it is and the existing RFC 3164 support present in Vector.

polarathene avatar Dec 08 '23 20:12 polarathene

@polarathene I am very much interested in you driving this forward given @syedriko has been pulled in a different direction for us. We currently support RFC3164 in the design that @syedriko has presented but we have upcoming usages where I can push dropping its support though would take it if the burden is low. Regarding some of the specific deployment extentions, those I could live without given they are not part of the spec. Looking forward to your new PR

jcantrill avatar Jan 03 '24 14:01 jcantrill

@polarathene I am very much interested in you driving this forward given @syedriko has been pulled in a different direction for us.

Sure, I'll try find some time to raise the PR of what I have done. Should be done before mid jan 👍 (hopefully earlier depending on workload elsewhere)

We currently support RFC3164 in the design that @syedriko has presented but we have upcoming usages where I can push dropping its support though would take it if the burden is low.

As stated, for my usage logs are in RFC3164 too. I can likewise probably address that, but I don't see the extra code to support RFC 3164 to be much of a maintenance burden.

Regarding some of the specific deployment extentions, those I could live without given they are not part of the spec.

Great! In my fork I implemented structured data (for RFC 3164 it is prepended to the log message, as RFC 5424 advises). You can optionally use VRL to construct the same content in your config instead of the hard-coded approach that was done here.


UPDATE: Had a busy January and had to delay tackling this. Hoping to have time soon to make progress on pushing this forward.

UPDATE (March 14th): Actively working on my fork to get it in shape for review 👍

polarathene avatar Jan 03 '24 22:01 polarathene

@polarathene Any update on this?

gaby avatar Jun 15 '24 00:06 gaby

@polarathene Any update on this?

Sorry I got burnt out earlier.

I am able to publish the progress I had, but it was not in a state I was comfortable with for review.

It still needs more work to be ready, but perhaps it could get some review prior and maybe someone else can assist as I realize I've not been able to reliably push forward on this as I've wanted to with all my other commitments 😓

polarathene avatar Jun 15 '24 02:06 polarathene

I am able to publish the progress I had ... perhaps it could get some review

@polarathene Following-up if this ever happen? Can we either move this PR forward or see your proposed changes so that we can make progress

jcantrill avatar Jul 18 '24 12:07 jcantrill

Following-up if this ever happen?

No, nobody expressed interest since I wrote that comment.

see your proposed changes so that we can make progress

I just wrapped up a large task elsewhere, so I have time to publish the branch (and potentially rebase if there is no major conflicts to resolve). I'll tackle that tomorrow, getting late here 👍


My branch diverged quite a bit from the one here, which while an improvement.. I'm not sure if it'll help much towards making progress if nobody else with the skills wants to assist 😅

I do want to complete the work, it's just been difficult to budget my volunteer time for. So if nobody helps, I will eventually be able to block time out again for coming back to this. I sunk a considerable chunk of time into the branch already, would be an awful waste to leave it to rot.

polarathene avatar Jul 19 '24 08:07 polarathene

It seems like I actually did push my branch up months ago and forgot about that: https://github.com/polarathene/vector/tree/feat/syslog-codec

Diff (as if it was opened as a PR): https://github.com/vectordotdev/vector/compare/master...polarathene:feat/syslog-codec


The last commit was April, and locally it looks like I was refactoring some structs and fields to address some concerns with configuration + VRL. I had to shift priorities and I think ran into a tricky problem, so that progress was not completed.

I haven't tried building the public branch I linked, but you could give that a go? (it's not been rebased to upstream, so there may be conflicts)

The commits should be fairly tidy and scoped with associated context in each commit message. I don't have time to write up much guidance, especially since I haven't touched the branch for several months.

If the branch itself fails to build successfully, that's probably what I was addressing at the time and why I didn't announce anything earlier. Pretty sure I had it working locally with the uncommitted changes.


Hopefully that's helpful enough for now?

I'll see if I can grok what I have locally, but I'm not sure if I'll have the time to get familiar with the uncommitted changes and push those improvements/refactors today.

I could open a PR with what I have on the public branch, but I wouldn't want to waste reviewer time (my local refactoring + no time to write up a proper PR description). Feel free to give it a glance/try and provide feedback though 👍

polarathene avatar Jul 20 '24 01:07 polarathene