
Numeric variables in files generated from CSV input always have decimals

Open · eirki opened this issue 3 years ago · 1 comment

I am trying to generate SPSS and Stata files using CSV files + JSON metadata as input as described here. However, in the generated files, every numeric variable has decimals, so 1, 2, 3 become 1.0, 2.0, 3.0.

Steps to reproduce:

inputdata.csv:

"var_a","var_b"
1,1.1
2,2.2
3,3.3

meta.json:

{
    "type": "SPSS",
    "variables": [
        {
            "type": "NUMERIC",
            "name": "var_a",
            "decimals": 0
        },
        {
            "type": "NUMERIC",
            "name": "var_b",
            "decimals": 2
        }
    ]
}

cmd: readstat inputdata.csv meta.json outputdata.sav

To test, check in SPSS or with pyreadstat: python -c "import pyreadstat; print(pyreadstat.read_sav('outputdata.sav')[1].original_variable_types)"

My output: {'var_a': 'F8.2', 'var_b': 'F8.2'}

var_a should be F8.0.

I've poked around with git bisect, and it seems this is a regression introduced in https://github.com/WizardMac/ReadStat/commit/93036b0e57a10a8d9fb2cf48ea06a1f7fcb1d583. When I checked out the repo from one commit earlier (https://github.com/WizardMac/ReadStat/commit/1b64538ab4e983e92e7f037bc5d61ae71fefb445) and reran with the same input I got

{'var_a': 'F8.0', 'var_b': 'F8.2'}
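
The manual check above can be automated when bisecting. This is a minimal sketch and not part of ReadStat or pyreadstat: the helper names `decimals_from_format`, `find_mismatches`, and `check_sav` are hypothetical, and the regular expression assumes plain SPSS `F` formats such as `F8.2`. The only pyreadstat features used are `read_sav` and `original_variable_types`, the same ones as in the one-liner above.

```python
import json
import re


def decimals_from_format(fmt):
    """Return the decimal count of an SPSS F format such as 'F8.2'.

    Returns None for anything that is not a plain F format (assumption:
    only F formats matter for this check).
    """
    match = re.fullmatch(r"F(\d+)(?:\.(\d+))?", fmt)
    if match is None:
        return None
    return int(match.group(2) or 0)


def find_mismatches(meta_json_text, actual_formats):
    """Compare 'decimals' requested in the JSON metadata with the formats
    found in the generated file.

    Returns {variable name: (expected decimals, actual decimals)} for every
    NUMERIC variable whose decimals differ.
    """
    meta = json.loads(meta_json_text)
    mismatches = {}
    for var in meta["variables"]:
        if var.get("type") != "NUMERIC" or "decimals" not in var:
            continue
        fmt = actual_formats.get(var["name"])
        actual = decimals_from_format(fmt) if fmt else None
        if actual != var["decimals"]:
            mismatches[var["name"]] = (var["decimals"], actual)
    return mismatches


def check_sav(sav_path, meta_path):
    """Read a generated .sav file and report decimal mismatches."""
    import pyreadstat  # imported lazily so the helpers above work without it

    _, file_meta = pyreadstat.read_sav(sav_path)
    with open(meta_path, encoding="utf-8") as handle:
        return find_mismatches(handle.read(), file_meta.original_variable_types)
```

Calling `check_sav("outputdata.sav", "meta.json")` on the files from this report should return an empty dict on a build without the regression and a non-empty dict (listing `var_a`) on an affected build.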

Very grateful for any assistance with generating files without all those .0 decimals.

eirki, Aug 04 '22 14:08

Duplicate of #177

evanmiller, Jan 16 '23 03:01