
Numeric-like values with hyphens (96777-8) in CSV fields split into arrays in JSON output

Open megin1989 opened this issue 10 months ago • 0 comments

We are using Frictionless to process CSV files, and we encountered an issue where certain fields with numeric-like values containing hyphens (e.g., 96777-8) are incorrectly split into arrays when converted to JSON output.

For example, a CSV field with the value 96777-8 is represented in the JSON output as:

"SCREENING_CODE": [ 96777, 8 ]

Expected behavior: The value should remain as is in the JSON output:

"SCREENING_CODE": "96777-8"

Steps to Reproduce: Create a CSV file (example.csv) with the following content:

SCREENING_CODE
96777-8

Use the extract or validate method to process the CSV:

from frictionless import extract
rows = extract('example.csv')
print(rows)

Observe the resulting JSON output:

[
    {
        "SCREENING_CODE": [
            96777,
            8
        ]
    }
]

Expected Behavior: The output should preserve the original value as a string:

[
    {
        "SCREENING_CODE": "96777-8"
    }
]
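
For comparison, here is a minimal sketch that uses only Python's standard `csv` and `json` modules, with no Frictionless involved, to show the output we expect when no type inference is applied to the cell. The `CSV_TEXT` constant and the `rows_without_inference` helper are hypothetical names used only for this illustration; this is a baseline for comparison, not a fix for Frictionless itself.

```python
import csv
import io
import json

# Same content as example.csv from the steps above (hypothetical inline copy).
CSV_TEXT = "SCREENING_CODE\n96777-8\n"


def rows_without_inference(text):
    """Parse CSV text into a list of dicts, leaving every cell as a string."""
    return list(csv.DictReader(io.StringIO(text)))


rows = rows_without_inference(CSV_TEXT)

# The hyphenated value is serialized as a plain JSON string.
print(json.dumps(rows, indent=4))
```

Since `csv.DictReader` never converts cell values, the hyphenated code comes through as the string "96777-8", which is the behavior we would like Frictionless to preserve for this field.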

megin1989 • Dec 02 '24 07:12