cobrix
Cobrix returning hexadecimal value in different format (qb)
Hi All,
When I read hexadecimal data from a COBOL file using Cobrix, the output comes out in a different format. I have tried casting it with Spark SQL and PySpark, but neither helped.
The data type defined in the copybook is PIC X(06). In most cases the value is converted to "qb" followed by one or two spaces. Ab Initio, by contrast, reads the same data as "qbXXX", with some characters after "qb", and there they were able to cast that value to hexadecimal. Can anyone please help me with this?
As the data we are reading is highly sensitive, we could not get a sample from our customer, so I am unable to share the file here.
Thanks in advance, Narayana
There could be several reasons for this behavior. Possibly the data is in a different code page (not the common EBCDIC one).
You can use .option("debug", "hex") to see the raw bytes of each field and debug the decoding process.
Please provide an example value and the corresponding value in the '_debug' field. I know it is sensitive, hope a single QBxxxx number is not
Hi Ruslan,
Thank you for your quick reply. I used the debug option, and the output is as follows:
"column":"qb","column_debug":"988200007600"
So EBCDIC 98 is q and 82 is b, but neither 00 nor 76 corresponds to any character in the common EBCDIC encoding (https://en.wikipedia.org/wiki/EBCDIC).
What output do you expect for this column? What does Ab Initio show for this field and this record?
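To make the byte-by-byte reading above easier to follow, here is a minimal Python sketch that splits the debug value into bytes and decodes each one. It uses Python's built-in cp037 codec as a stand-in for an EBCDIC code page. That is an assumption: cp037 is not the same table as Cobrix's common code page, and it may map some bytes (0x76, for example) to characters that the common code page leaves undefined.

```python
# Raw field bytes as reported in the "column_debug" value from the thread.
debug_hex = "988200007600"
raw = bytes.fromhex(debug_hex)


def describe(byte: int) -> str:
    """Decode one byte with cp037 (assumed stand-in) and flag non-printables."""
    ch = bytes([byte]).decode("cp037")
    return ch if ch.isprintable() else "<non-printable>"


# Pair each byte (as two hex digits) with its decoded meaning.
breakdown = [(f"{b:02X}", describe(b)) for b in raw]

for hex_byte, meaning in breakdown:
    print(hex_byte, meaning)
```

The first two bytes decode to q and b, and the 00 bytes are NUL, which matches the "qb" plus padding seen in the decoded column.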
Hi Ruslan,
I got some information from the customer.
Ab Initio reads the data "988200007600" in the following format: "qb\x00\x00\x00\x00"
They then use a function to cast it back to 988200000000
Transformation used: (decimal(12)) reinterpret_as(packed decimal(12,stripped), <ColumnName> )
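As a rough illustration of what that transformation appears to do, the Python sketch below treats every half-byte (nibble) as one decimal digit. This assumes that "stripped" packed decimal means there is no sign nibble; that reading is an assumption, not something confirmed by Ab Initio documentation. The helper name unsigned_packed_to_int is hypothetical, and cp037 is again only a stand-in EBCDIC codec.

```python
def unsigned_packed_to_int(raw: bytes) -> int:
    """Read each nibble as one decimal digit (assumes no sign nibble)."""
    digits = raw.hex()
    if not digits.isdigit():
        # A nibble in the range A-F is not a valid decimal digit.
        raise ValueError(f"not an unsigned packed decimal: {digits}")
    return int(digits)


# The string Ab Initio is said to show, encoded back to EBCDIC bytes.
ab_initio_string = "qb\x00\x00\x00\x00"
ab_initio_value = unsigned_packed_to_int(ab_initio_string.encode("cp037"))

# The raw bytes taken from the Cobrix debug column.
debug_value = unsigned_packed_to_int(bytes.fromhex("988200007600"))

print(ab_initio_value)
print(debug_value)
```

Under this assumption the characters "qb" are just how the digit pairs 98 and 82 look when the bytes are decoded as text.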
I see. One workaround you can use is .option("binary_as_hex", "true") together with changing the field definition to:
PIC X(06) COMP.
This only works with the latest Cobrix (2.6.8).
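A minimal sketch of how this workaround might be wired up from PySpark is shown below. The record name, field name and data path are hypothetical, and passing the copybook inline through copybook_contents is a choice made for this example. The sketch also assumes that the field then arrives as a string of hex digits such as "988200007600", which the hypothetical helper hex_field_to_decimal turns into a number.

```python
from decimal import Decimal

# Hypothetical copybook: record and field names are invented for illustration.
copybook = """
       01  RECORD.
           05  ACCOUNT-ID    PIC X(06) COMP.
"""

# Reader options for the workaround suggested in the thread.
cobrix_options = {
    "copybook_contents": copybook,
    "binary_as_hex": "true",
}

# With PySpark and Cobrix 2.6.8 available, the read would look roughly like:
# df = spark.read.format("cobol").options(**cobrix_options).load("/path/to/data")


def hex_field_to_decimal(hex_value: str) -> Decimal:
    """Convert a digits-only hex string (assumed field format) to Decimal."""
    if not hex_value.isdigit():
        # Reject A-F nibbles, which Decimal could misread (e.g. "1E5").
        raise ValueError(f"non-decimal nibble in {hex_value!r}")
    return Decimal(hex_value)


print(hex_field_to_decimal("988200007600"))
```

In Spark itself the same conversion could be done with a cast on the resulting column; the plain Python helper only shows the idea.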