cobrix
A column containing an empty string in an EBCDIC file is automatically converted to null
I have a column in the raw data file that contains an empty string (8 spaces), but when I load the file into a Spark DataFrame using Cobrix, that column is populated as null.
I need the actual spaces to be present in the column: it is a business requirement, and exactly 8 spaces are expected for that column. What should I do with Cobrix?
Hi,
Try
.option("improved_null_detection", "false")
so that empty strings won't be treated as nulls, and
.option("string_trimming_policy", "none")
so that 8 spaces won't be trimmed to an empty string.
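Put together, a read with both options might look like the sketch below. The helper name `read_ebcdic` and its parameters are made up for illustration, and the paths would need to point at your own data file and copybook; only the two option names and values come from the reply above.

```python
# Cobrix options suggested above: keep empty strings as strings and
# do not trim whitespace from string fields.
COBRIX_STRING_OPTIONS = {
    "improved_null_detection": "false",
    "string_trimming_policy": "none",
}


def read_ebcdic(spark, data_path, copybook_path):
    """Load an EBCDIC file through Cobrix with the options above.

    `spark` is an existing SparkSession with the Cobrix package on its
    classpath; `data_path` and `copybook_path` are hypothetical
    placeholders for your own locations.
    """
    return (
        spark.read.format("cobol")
        .option("copybook", copybook_path)
        .options(**COBRIX_STRING_OPTIONS)
        .load(data_path)
    )
```

Calling `read_ebcdic(spark, "/path/to/data", "/path/to/copybook.cpy")` should then return a DataFrame in which the 8-space column is kept as a string of spaces rather than null.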
Thanks @yruslan, but I think the CSV writer treats the empty value as null when it writes the file. Even with these options set, the file written by the CSV writer still shows null values. I want my CSV file to contain the exact spaces that are in the COBOL EBCDIC file.
It might be that the CSV writer can be tweaked as well to output the spaces, for example by adding mandatory quotes around values.
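To show what mandatory quoting does to a whitespace-only value, here is a small illustration that uses Python's standard `csv` module rather than Spark. The sample row is made up. Spark's CSV writer has a `quoteAll` option that may play a similar role, though whether it is sufficient for this case has not been verified in this thread.

```python
import csv
import io

# Made-up sample row: the middle field is exactly 8 spaces.
row = ["A001", " " * 8, "X"]


def to_csv_line(values, quoting):
    """Serialize one row to a CSV line with the given quoting mode."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(values)
    return buf.getvalue()


# Default quoting: the spaces are written bare between the delimiters.
minimal = to_csv_line(row, csv.QUOTE_MINIMAL)

# Mandatory quoting: every value is wrapped in quotes, so the 8 spaces
# are explicitly delimited in the output.
quoted = to_csv_line(row, csv.QUOTE_ALL)

# Reading the quoted line back returns the original 8 spaces.
round_trip = next(csv.reader(io.StringIO(quoted)))
```

With quotes around every value, a reader of the file can tell a field of 8 spaces apart from a missing value, because the spaces sit inside the quotes.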
After using the above options, the DataFrame contains the spaces.
The following write preserves the spaces in the CSV file:
df.write.option("ignoreLeadingWhiteSpace", "false")\
.option("ignoreTrailingWhiteSpace", "false").csv(output_path, header=True)