[BUG] `do_lowercase_strings` is being applied to all property values by default

Open bechbd opened this issue 1 year ago • 1 comments

Describe the bug
All string properties are lowercased because the do_lowercase_strings normalization is applied to every property value by default.

To Reproduce
Given the configuration below:

- implementation: nodestream.interpreting:Interpreter
  arguments:
    interpretations:
      - type: source_node
        node_type: Artist
        key:
          id: !jmespath nconst
        properties:
          name: !jmespath primaryName

With this configuration, every name property is lowercased because the do_lowercase_strings normalization is applied. This is unexpected behavior from a customer experience (CX) perspective.

Expected behavior
String property values are left unchanged.

Additional context
This also happens to key properties, but lowercasing all key values makes sense there because it keeps key lookups consistent, so that behavior does not need to change.
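
To illustrate the distinction, here is a minimal Python sketch of the behavior I would expect. It is not nodestream's actual code: lowercase_strings and normalize_node are hypothetical helpers, and the sample id and name values are made up. Key values are lowercased so that differently cased ids resolve to the same key, while property values pass through untouched.

```python
from typing import Any, Dict, Tuple


def lowercase_strings(value: Any) -> Any:
    """Lowercase strings; return every other type unchanged (hypothetical helper)."""
    return value.lower() if isinstance(value, str) else value


def normalize_node(
    keys: Dict[str, Any], properties: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Normalize key values only; copy properties as-is (hypothetical helper)."""
    normalized_keys = {name: lowercase_strings(value) for name, value in keys.items()}
    return normalized_keys, dict(properties)


# Made-up sample records that differ only in the casing of the id.
a = normalize_node({"id": "NM0000001"}, {"name": "Example Artist"})
b = normalize_node({"id": "nm0000001"}, {"name": "Example Artist"})

print(a[0] == b[0])  # the keys match after normalization
print(a[1]["name"])  # the property keeps its original casing
```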

bechbd avatar Mar 19 '24 16:03 bechbd

I think the solution here could be something like this:

  1. Introduce key_normalization and property_normalization fields and deprecate normalization.
  2. Keep the default of key_normalization the same as the current default of normalization.
  3. Make the default of property_normalization empty.
  4. If normalization is set, apply it to both keys and properties, taking precedence over their defaults.
  5. Raise an error if normalization is set together with key_normalization or property_normalization.
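
The rules above could be sketched roughly as follows. This is an illustration of the proposal only, not the change that shipped: resolve_normalization and DEFAULT_KEY_NORMALIZATION are hypothetical names, and the default shown assumes that today's normalization default amounts to enabling do_lowercase_strings.

```python
from typing import Any, Dict, Optional

# Assumption: the current `normalization` default is equivalent to this flag set.
DEFAULT_KEY_NORMALIZATION: Dict[str, Any] = {"do_lowercase_strings": True}


def resolve_normalization(
    normalization: Optional[Dict[str, Any]] = None,
    key_normalization: Optional[Dict[str, Any]] = None,
    property_normalization: Optional[Dict[str, Any]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Return the effective flags for keys and properties (hypothetical helper)."""
    if normalization is not None:
        # Rule 5: the deprecated field cannot be combined with the new ones.
        if key_normalization is not None or property_normalization is not None:
            raise ValueError(
                "normalization cannot be combined with key_normalization "
                "or property_normalization"
            )
        # Rule 4: the deprecated field applies to both keys and properties.
        return {"key": dict(normalization), "property": dict(normalization)}

    # Rules 2 and 3: keys keep the old default, properties default to nothing.
    return {
        "key": dict(
            DEFAULT_KEY_NORMALIZATION if key_normalization is None else key_normalization
        ),
        "property": dict(property_normalization or {}),
    }
```

With no fields set, this yields lowercased keys and untouched properties, which is the expected behavior described in the issue.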

zprobst avatar Mar 19 '24 16:03 zprobst

This issue has been resolved and will be released with 0.13.

zprobst avatar Aug 02 '24 14:08 zprobst