
A column containing an empty string in an EBCDIC file is automatically converted to null

Open krahul19091992 opened this issue 3 years ago • 4 comments

I have a column in the raw data file that contains an empty string (8 spaces), but when I load the file into a Spark DataFrame using Cobrix, that column is populated as null.

The business requirement is that this column contains exactly 8 spaces, so I need the actual spaces preserved. How can I do this with Cobrix?

krahul19091992 avatar Jan 05 '22 16:01 krahul19091992

Hi,

Try

.option("improved_null_detection", "false")

so that empty strings won't be treated as nulls, and

.option("string_trimming_policy", "none")

so that 8 spaces won't be trimmed to an empty string.
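For reference, here is a minimal PySpark sketch of how the two options above might be combined in one read. The helper name `read_ebcdic` and the `copybook_path` / `data_path` arguments are hypothetical placeholders, and it assumes the Cobrix `spark-cobol` package is already on the Spark classpath.

```python
# Cobrix reader options suggested in this thread.
COBRIX_READ_OPTIONS = {
    # Do not turn all-blank fields into nulls.
    "improved_null_detection": "false",
    # Keep leading/trailing spaces in string fields.
    "string_trimming_policy": "none",
}


def read_ebcdic(spark, copybook_path, data_path):
    """Load an EBCDIC file with Cobrix, keeping blank fields as spaces.

    `spark` is an existing SparkSession; the two paths are placeholders
    for your own copybook and data file.
    """
    return (
        spark.read.format("cobol")
        .option("copybook", copybook_path)
        .options(**COBRIX_READ_OPTIONS)
        .load(data_path)
    )
```

To check the result, comparing `length(col)` of the affected column against 8 on a few rows should show whether the spaces survived the read.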

yruslan avatar Jan 06 '22 07:01 yruslan

Thanks @yruslan, but I think the CSV writer treats an empty value as null when it writes the file. Even when I use these options, the file produced by the CSV writer shows null values. I want my CSV file to contain the exact spaces that are in the COBOL EBCDIC file.

krahul19091992 avatar Jan 07 '22 17:01 krahul19091992

The CSV writer can probably be tweaked as well to output the spaces, for example by adding mandatory quotes around values.

yruslan avatar Jan 10 '22 10:01 yruslan

After using the above options, the DataFrame contains the spaces.

This write will preserve the spaces in the CSV file:

df.write.option("ignoreLeadingWhiteSpace", "false")\
.option("ignoreTrailingWhiteSpace", "false").csv(output_path, header=True)
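As a sketch only, the whitespace flags above can be combined with the earlier suggestion of mandatory quotes, which in Spark's CSV writer corresponds to the `quoteAll` option. The helper name `write_csv_preserving_spaces` is hypothetical, and whether `quoteAll` is needed in addition to the whitespace flags is an assumption that may depend on the Spark version and on how the file is consumed downstream.

```python
# Spark CSV writer options; all values are passed as strings.
CSV_WRITE_OPTIONS = {
    "header": "true",
    # Keep leading and trailing spaces when writing values.
    "ignoreLeadingWhiteSpace": "false",
    "ignoreTrailingWhiteSpace": "false",
    # Optional: quote every value so blank fields stay visible.
    "quoteAll": "true",
}


def write_csv_preserving_spaces(df, output_path):
    """Write a DataFrame to CSV without trimming whitespace-only values."""
    df.write.options(**CSV_WRITE_OPTIONS).csv(output_path)
```

With `quoteAll` enabled, an 8-space field should appear in the output as a quoted run of spaces, which makes it easy to tell apart from a null.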

AnveshAeturi avatar Jul 25 '23 19:07 AnveshAeturi