[iesave] comma in label breaks csv output

Open kbjarkefur opened this issue 2 years ago • 1 comments

A comma in a variable label, such as Milage , mpg, is allowed in Stata, but in the csv output it shifts the remaining data points for that variable one column over in the data table.

The csv solution would be to enclose all cells in double quotes, so that mpg,Milage , mpg,byte,74 becomes "mpg","Milage , mpg","byte","74". However, all strings would then need to be handled as compound double-quoted strings `" "'. Also, these extra quotes should not be added in the .md format.
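To illustrate the column shift and the quote-everything fix, here is a minimal sketch in Python rather than Stata. It is not the iesave code; it only uses the example row from above and Python's standard csv module to show how a csv reader sees the two versions of the line.

```python
import csv
import io

# Example row from the issue: name, label, type, value.
row = ["mpg", "Milage , mpg", "byte", "74"]

# Naive output: joining on commas lets the comma in the label split the cell.
naive_line = ",".join(row)
naive_cells = next(csv.reader([naive_line]))

# Quote-everything output: every cell is enclosed in double quotes.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(row)
quoted_line = buffer.getvalue().rstrip("\n")
quoted_cells = next(csv.reader([quoted_line]))

print(len(naive_cells))   # one cell more than intended
print(quoted_line)
print(quoted_cells == row)
```

Reading the naive line back gives five cells instead of four, which is the shift described above. Reading the fully quoted line back gives the original four cells.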

In iebaltab I avoided this by writing a tab-separated temp file, importing it into Stata as a dataset, and then using Stata's native csv export, which takes care of the quoting. I am not sure that is the best approach here, as the header is different and there is no equivalent export for .md.

Another approach is to not mix the code that generates the data points with the code that writes the output. If the data-point code just creates all the values, then other code can specialize in output. In the current version of iesave in #276, the code is structured the same way as the old approach in iebaltab. iesave is unlikely to become as complex, so it could be ok, but it may be worth trying to avoid.

Both the csv and the md output are structured enough that this could be abstracted away into a few functions. It would be easiest if Stata had support for lists or arrays.
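As a rough sketch of that separation, written in Python only because it has lists, the data points can be collected once and then passed to one small function per output format. Everything here is hypothetical: collect_rows, render_csv, render_md and the header names are made up for illustration and are not part of iesave.

```python
import csv
import io
from typing import List, Sequence

Row = Sequence[str]

# Hypothetical header names, not taken from iesave.
HEADER: Row = ("Name", "Label", "Type", "Count")


def collect_rows() -> List[Row]:
    """Stand-in for the code that generates the data points."""
    return [("mpg", "Milage , mpg", "byte", "74")]


def render_csv(header: Row, rows: List[Row]) -> str:
    """Write csv; the csv module quotes only cells that need it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def render_md(header: Row, rows: List[Row]) -> str:
    """Write a markdown table; no extra quotes are added."""
    lines = ["| " + " | ".join(header) + " |"]
    lines.append("| " + " | ".join("---" for _ in header) + " |")
    lines.extend("| " + " | ".join(row) + " |" for row in rows)
    return "\n".join(lines) + "\n"


rows = collect_rows()
csv_text = render_csv(HEADER, rows)
md_text = render_md(HEADER, rows)
print(csv_text)
print(md_text)
```

With this split, any format-specific quoting lives in the renderer, so the code that produces the values never has to know which file type is being written.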

kbjarkefur, Aug 03 '22 07:08

Why is it that you always have the best idea as soon as you hit submit? 😄

The only places a comma can appear are the variable label and the user name (rare and idiotic, but possible). So let's just enclose these in " " and handle that properly. In the csv file you will then have mpg,"Milage , mpg",byte,74, and Milage , mpg will be in a single column. The md file will have | mpg | "Milage , mpg" | byte | 74, where the quotation marks will show, but I think that is ok.
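A minimal sketch of that idea, again in Python rather than Stata: quote a cell only when it contains a comma, then reuse the same cells for both outputs. The helper name quote_if_comma is hypothetical, and doubling any embedded double quotes is my assumption (it is the usual csv convention), not something decided in this issue.

```python
def quote_if_comma(value: str) -> str:
    """Wrap value in double quotes only when it contains a comma."""
    if "," in value:
        # Assumption: embedded double quotes are doubled, as csv readers expect.
        return '"' + value.replace('"', '""') + '"'
    return value


# Only the label (and, in principle, the user name) can contain a comma.
cells = ["mpg", quote_if_comma("Milage , mpg"), "byte", "74"]

csv_line = ",".join(cells)
md_line = "| " + " | ".join(cells)

print(csv_line)
print(md_line)
```

The same quoted label goes into both lines, which is why the quotation marks also show up in the md row.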

This still requires that the local line is always handled as a compound double-quoted string `" "'.

kbjarkefur, Aug 03 '22 07:08