Template expressions coerce VRL `null` to string `"<null>"`

Open milas opened this issue 7 months ago • 3 comments

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

When using template expressions in config keys, e.g. foo: {{ bar }}, if the field value is a VRL null, the result is the string "<null>".

In particular, this makes the Splunk HEC index and similar fields cumbersome to use when the intent is to specify a value sometimes but not always.
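The reported behavior can be modeled with a minimal sketch. It is a hypothetical Python stand-in, not Vector's actual template code; the `render` function and the regular expression are assumptions made purely for illustration.

```python
import re

# Hypothetical model of the reported behavior; not Vector's real implementation.
FIELD = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, event):
    """Replace each {{ field }} with the string form of the event value."""
    def substitute(match):
        value = event[match.group(1)]
        # A null value is stringified instead of being dropped.
        return "<null>" if value is None else str(value)

    return FIELD.sub(substitute, template)


print(render("{{ index }}", {"index": "main"}))  # main
print(render("{{ index }}", {"index": None}))    # <null>
```

Because the renderer always returns a string, the sink has no way to tell "no value" apart from a literal value.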

Configuration

sources:
  demo:
    type: demo_logs
    format: bsd_syslog

transforms:
  remap:
    type: remap
    inputs: [demo]
    source: .index = null

sinks:
  hec:
    type: splunk_hec_logs
    inputs: [remap]
    endpoint_target: event
    endpoint: http://splunk.example.com
    default_token: xxx
    index: "{{ index }}"
    encoding:
      codec: raw_message

Version

0.46.1

Debug Output

Debug logging isn't super helpful, but sending to an HTTP echo server to see the request sent to Splunk HEC reveals:

{
  "event": "<187>May 13 18:08:52 random.archi cron[2307]: Take a breath, let it go, walk away",
  "fields": {},
  "time": 1747159732.199,
  "host": "localhost",
  "index": "<null>"
}

(n.b. I prettified the JSON for legibility)

Notice "index": "<null>".

Example Data

No response

Additional Context

Using an empty string isn't really practical either, because the field will still be sent: for sinks like Splunk HEC, the serde field configs only skip serialization for None. (In particular, if the Splunk token is restricted to a subset of indexes, "index": "" will result in a 400.)
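To illustrate why an empty string does not help, here is a rough sketch of "skip only when None" serialization. The `build_hec_payload` helper is hypothetical and only mirrors the behavior described above; it is not the sink's real encoder.

```python
import json


def build_hec_payload(event, index):
    """Hypothetical payload builder that omits `index` only when it is None."""
    payload = {"event": event}
    # An empty string is still a value, so it is serialized and sent.
    if index is not None:
        payload["index"] = index
    return json.dumps(payload)


print(build_hec_payload("hello", None))  # {"event": "hello"}
print(build_hec_payload("hello", ""))    # {"event": "hello", "index": ""}
```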

Omitting the field is not possible because then the template will fail.

References

Tracking default values for field absence, which is similar but not quite the same:

  • #1692

milas avatar May 13 '25 18:05 milas

Hi @milas, I agree that this is not a great mapping. But there might be complications since we also support the TOML format: https://github.com/vectordotdev/vector/issues/12832.

I thought about doing del(.index) but this will result in an error: Failed to render template for "index". error=Missing fields on event: ["index"]. I think the suggestion in https://github.com/vectordotdev/vector/issues/1692 is the desired solution here.
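The missing-field failure can be sketched the same way. This is an illustrative model only: the `MissingFields` exception and `render` function are assumed names, and the message text simply imitates the error quoted above.

```python
import json
import re

FIELD = re.compile(r"\{\{\s*(\w+)\s*\}\}")


class MissingFields(Exception):
    """Hypothetical error raised when a template references absent fields."""


def render(template, event):
    # Collect every referenced field that is absent from the event.
    missing = [name for name in FIELD.findall(template) if name not in event]
    if missing:
        raise MissingFields(f"Missing fields on event: {json.dumps(missing)}")
    return FIELD.sub(lambda m: str(event[m.group(1)]), template)


# Simulates an event after del(.index): the field no longer exists.
try:
    render("{{ index }}", {"message": "hi"})
except MissingFields as err:
    print(err)  # Missing fields on event: ["index"]
```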

pront avatar Jun 11 '25 13:06 pront

> Hi @milas, I agree that this is not a great mapping. But there are complications since we also support the TOML format: https://github.com/vectordotdev/vector/issues/12832.

TIL that TOML has no null!

For template strings, I think this should be okay, i.e. in TOML/JSON/YAML, you would have a string still -- {{ bar }}. The rendered result is for the in-memory component config, so the underlying types hopefully won't need changing, only the templating logic.

> I thought about doing del(.index) but this will result in an error: Failed to render template for "index". error=Missing fields on event: ["index"].

Yeah, another possibility to enable this use case could be something like:

foo:
  template: "{{ bar }}"
  optional: true

But default syntax in the template would be much cleaner I think!
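A rough sketch of how the `optional: true` idea could behave, assuming a rendered value of `None` means "omit this key". The `OptionalTemplate` class is hypothetical, and raising `KeyError` for non-optional templates is a simplification chosen for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional

FIELD = re.compile(r"\{\{\s*(\w+)\s*\}\}")


@dataclass
class OptionalTemplate:
    """Hypothetical template wrapper with an `optional` flag."""

    template: str
    optional: bool = False

    def render(self, event: dict) -> Optional[str]:
        # Treat both absent and null fields as unresolved.
        unresolved = [n for n in FIELD.findall(self.template) if event.get(n) is None]
        if unresolved:
            if self.optional:
                return None  # the caller would omit the config key entirely
            raise KeyError(f"unresolved fields: {unresolved}")
        return FIELD.sub(lambda m: str(event[m.group(1)]), self.template)


foo = OptionalTemplate("{{ bar }}", optional=True)
print(foo.render({"bar": "value"}))  # value
print(foo.render({"bar": None}))     # None
```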

--

Thanks for the response & guidance -- I will experiment a bit here & see if I can make a more concrete proposal/PR!

milas avatar Jun 11 '25 19:06 milas

> The rendered result is for the in-memory component config, so the underlying types hopefully won't need changing, only the templating logic.

Sounds right.

> But default syntax in the template would be much cleaner I think!

I think the | is a common choice e.g. {{ index | default("default_index_value") }}
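A minimal sketch of such a `default(...)` filter, assuming it applies when the field is either null or missing. Neither this syntax nor this parser exists in Vector today; the pattern and `render` function are illustrative only.

```python
import re

# Matches {{ field }} and {{ field | default("fallback") }}.
PATTERN = re.compile(
    r'\{\{\s*(\w+)\s*(?:\|\s*default\(\s*"([^"]*)"\s*\)\s*)?\}\}'
)


def render(template, event):
    """Hypothetical renderer supporting a default("...") filter."""
    def substitute(match):
        name, fallback = match.group(1), match.group(2)
        value = event.get(name)
        if value is None:
            if fallback is None:
                raise KeyError(name)  # no default given, so rendering fails
            return fallback
        return str(value)

    return PATTERN.sub(substitute, template)


template = '{{ index | default("default_index_value") }}'
print(render(template, {"index": "main"}))  # main
print(render(template, {"index": None}))    # default_index_value
```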

pront avatar Jun 11 '25 19:06 pront

This is giving me trouble for structured_metadata in Loki. I've tried a couple workarounds using label expansion but null values there seem to break it. Probably going to fork and try to remove the behavior if we're having a hard time proposing a fix? 🤔

jmealo avatar Jul 24 '25 20:07 jmealo

> This is giving me trouble for structured_metadata in Loki. I've tried a couple workarounds using label expansion but null values there seem to break it. Probably going to fork and try to remove the behavior if we're having a hard time proposing a fix? 🤔

Hi @jmealo, we should tackle this problem at a generic level vs patching each integration. Adding an optional field or new template syntax {{ index | null }} is the proper solution. Then we would internally evaluate to a null VRL value instead of the awkward string value.
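One way to picture the `| null` idea is a renderer that returns an optional value instead of always returning a string. This sketch is hypothetical: it only handles a template made of a single field, and treating a missing field like a null one is an assumption.

```python
import re
from typing import Optional

# Sketch limitation: the whole template must be one {{ field }} expression.
PATTERN = re.compile(r"^\s*\{\{\s*(\w+)\s*(\|\s*null\s*)?\}\}\s*$")


def render(template: str, event: dict) -> Optional[str]:
    """Hypothetical renderer where `| null` yields None instead of "<null>"."""
    match = PATTERN.match(template)
    if match is None:
        raise ValueError("this sketch only handles single-field templates")
    name, nullable = match.group(1), match.group(2) is not None
    if name not in event:
        if nullable:
            return None
        raise KeyError(name)
    value = event[name]
    if value is None:
        # Without the filter, keep the current stringified behavior.
        return None if nullable else "<null>"
    return str(value)


print(render("{{ index | null }}", {"index": None}))  # None
print(render("{{ index }}", {"index": None}))         # <null>
```

A sink that receives `None` could then skip the field, the same way it already skips unset optional settings.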

pront avatar Jul 29 '25 15:07 pront

There's no workaround for this right now due to an unrelated bug: https://github.com/vectordotdev/vector/issues/23725

jmealo avatar Sep 04 '25 20:09 jmealo

@pront @milas Given how wildly different the behavior is between sinks, can we really afford to put off fixing bugs/inconsistencies in the hope that they'll be fixed later in some global way? It makes it impossible to use upstream Vector without forking.

In other words, sinks don't act the same already. Why don't we focus on addressing problems that are impacting users, rather than theoretical ones? I understand the desire for purity here, but I think we need to be pragmatic.

Would you be open to accepting sink-specific fixes now while we work toward a unified approach? That way we unblock real users facing real problems today, without precluding a better solution tomorrow.

jmealo avatar Sep 04 '25 21:09 jmealo

I opened a discussion to see if we can find a path forward on finding a fix: https://github.com/vectordotdev/vector/discussions/23726

jmealo avatar Sep 04 '25 21:09 jmealo

Hey, a few things going on here, but first off, to make sure there's no confusion: I am not a Vector maintainer.

Two potential avenues have been proposed so far:

  1. Modify templating logic to not stringify
  2. Add support for a | null operator in template strings

I still believe (1) is possible and the correct/best path and TOML's lack of null support shouldn't matter here, but I have not had time to make that change, verify it, and prepare a PR.


As an aside, speaking personally without the endorsement of my employer, please remember that Vector is open-source: just as you are free to patch it (soft fork), you are also not owed anything. It's frustrating when software doesn't meet our needs, but your comments come off very pointed and heated. I similarly proposed https://github.com/vectordotdev/vector/issues/23041#issuecomment-2877549746 to work around it in a specific sink right after opening this issue and actually still use that patch to this day. Everyone wants this fixed, and it's unfortunate for those of us who are inconvenienced by it, but that does not make Vector upstream unusable, and nothing is being blocked on ideological purity, just someone having time to contribute!

milas avatar Sep 04 '25 22:09 milas

@milas Thanks for clarifying your relationship with the project. I appreciate your response, though I'd like to better understand the situation before proceeding with another contribution.

Since I've already submitted one fix, I'd find it helpful to understand:

  • The full scope of changes needed across affected sinks
  • Whether there's a documented position or RFC on how sink inconsistencies should be handled going forward
  • If there's a coordinated effort to address these systematically, or if individual fixes are preferred
  • If there's an effort to catalog/identify/avoid unintentional behavior differences between sinks/sources

I'm aware of how open source works and have been contributing for over a decade. I'm happy to help, and fix upstream bugs regularly, but would benefit from understanding the bigger picture to make my contribution as effective as possible.

jmealo avatar Sep 04 '25 23:09 jmealo

The number of open issues and discussions around this makes it look more complicated than it really is. Fixing this properly is actually easier IMO than patching each component.

(1) My suggestion is, let's ignore TOML for now and add a | null fallback for YAML/JSON Vector config templates.

For anyone interested in implementing this, this is where the code lives: https://github.com/vectordotdev/vector/blob/master/src/template.rs#L48-L77

(2) Alternatively, we can support VRL in templates (see links below for context) but that is a bit more complex and it might have performance trade-offs.

Links:

  • https://github.com/vectordotdev/vector/issues/12832
  • https://github.com/vectordotdev/vector/issues/1692
  • https://github.com/vectordotdev/vector/issues/17501

P.S. I very much agree with this https://github.com/vectordotdev/vector/issues/1692#issuecomment-581989794, we don't want to keep adding new syntax here, but null is a pretty ergonomic feature that solves a real problem.

pront avatar Sep 05 '25 14:09 pront