cobrix
Packed-Decimal columns help required in outgoing EBCDIC file
Background [Optional]
Hi,
I have a requirement to create an outgoing fixed-width EBCDIC file with header, detail, and footer records of 155 bytes each. I am aware that Cobrix doesn't support writing EBCDIC files from copybooks, so I am writing the DataFrame as TEXT and encoding it to CP037.
Question
My copybook (file spec) has a few packed-decimal columns. I am trying to understand how numeric data can be converted to packed decimal before encoding to CP037.
Attribute Name = File Date
Format = S9(07) Packed Decimal
Length = 4
Description = This field indicates the file creation date. The date format is Julian (YYYYDDD).
I tried converting the value to binary and writing it to the file, but it does not work when I read the file back with Cobrix (PySpark). Please help with some examples of converting to packed decimal.
For testing, I am creating the outgoing EBCDIC file and reading it back with Cobrix in PySpark. Below is the copybook I created for reading the header:
01 HEADER.
10 HeaderRecordKey PIC X(32).
10 SourceStatus PIC X(01).
10 FileDate PACKED-DECIMAL PIC S9(7).
10 FileTime PACKED-DECIMAL PIC S9(11).
10 Filler PIC X(112).
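As a sanity check on the layout above: a packed-decimal field with n digits stores two digits per byte plus one sign nibble, so it occupies n // 2 + 1 bytes. A minimal sketch of that arithmetic (the helper name `comp3_length` and the dictionary are illustrative only, not part of Cobrix):

```python
def comp3_length(digits: int) -> int:
    """Bytes occupied by a packed-decimal field with the given digit count."""
    # Each digit takes one nibble, plus one trailing nibble for the sign.
    return digits // 2 + 1


# Field sizes for the HEADER copybook above (names mirror the copybook).
header_layout = {
    "HeaderRecordKey": 32,
    "SourceStatus": 1,
    "FileDate": comp3_length(7),    # S9(7)  packed
    "FileTime": comp3_length(11),   # S9(11) packed
    "Filler": 112,
}
record_length = sum(header_layout.values())
```

With these sizes, FileDate takes 4 bytes, FileTime takes 6 bytes, and the fields add up to the 155-byte record length stated in the requirement.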
Packed decimals are supported by Cobrix when such fields have USAGE IS COMP-3.
As you said, Cobrix does not support writing to EBCDIC files, but maybe the source code of the decoder can help you encode them: https://github.com/AbsaOSS/cobrix/blob/6f66bf561ee59b5f26b2ef9698f85a1111bdc906/cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/decoders/BCDNumberDecoders.scala#L30-L30
Example:
10 SOME-FIELD PIC S9(4) USAGE IS COMP-3.
Further documentation: https://www.ibm.com/docs/en/i/7.2?topic=type-packed-decimal-format https://www.geeksforgeeks.org/comp-3-in-cobol/
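To inspect packed bytes by hand before reading the file back, the nibble layout from the IBM documentation can be decoded in a few lines. This is only a sketch: `unpack_comp3` is a hypothetical helper, not Cobrix's decoder, and it assumes only the common sign nibbles 0xC and 0xF (positive or unsigned) and 0xD (negative).

```python
def unpack_comp3(data: bytes) -> int:
    """Decode packed-decimal (COMP-3) bytes into an integer."""
    if not data:
        raise ValueError("empty packed-decimal field")
    hex_text = data.hex()  # one lowercase hex character per nibble
    digits, sign = hex_text[:-1], hex_text[-1]
    # Every nibble except the last must be a decimal digit 0-9.
    if not digits.isdigit():
        raise ValueError(f"non-decimal nibble in {hex_text!r}")
    # Assumption: only the sign nibbles C, D and F are handled here.
    if sign not in "cdf":
        raise ValueError(f"unexpected sign nibble {sign!r}")
    value = int(digits)
    return -value if sign == "d" else value
```

For instance, `unpack_comp3(bytes.fromhex("2023115c"))` walks the nibbles 2, 0, 2, 3, 1, 1, 5 and then reads the sign nibble C as positive.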
Hi,
Thanks for your reply.
Below is the sequence of steps I am following to create the EBCDIC file for the header above:
- For all string (non-COMP-3) columns, I use a withColumn transformation to encode them to 'cp037'.
- I concatenate all columns and write them to a TEXT file (without encoding again, since the first step already encoded column by column).
The issue is with the COMP-3 columns. I am currently using the UDF below to pack COMP-3 values and store them in a BinaryType() column in a DataFrame. Since the data is already converted, do I still need to encode it to 'CP037'? I tried, but with no luck.
If this is not right, please help me with the sequence of steps to convert to COMP-3 and encode to EBCDIC (CP037).
For example:
    value = 2023115
    packed = pack_number(value)
    Result is b' #\x11_'
from struct import pack


def pack_number(n):
    """
    Pack a COMP-3 number. Format: PIC 9(7).
    """
    # COBOL numbers are stored without decimal info. Remove the decimal before
    # calling pack_number().
    n = int(n)

    # Is the number negative? Remember for later.
    negative = False
    if n < 0:
        negative = True
        n *= -1

    # Treat the number as a string. Makes it easier to loop over.
    n_str = str(n)
    b = int(n_str[0])

    # For each digit, shift it onto the result.
    for c in n_str[1:]:
        b = (b << 4) | int(c)

    # Append the sign nibble: 0xD for negative, 0xF otherwise.
    if negative:
        b = (b << 4) | 0xd
    else:
        b = (b << 4) | 0xf

    # Pack the number as a long long and chop off the unused bytes at the
    # beginning. This will need to be changed for varying PICture clauses.
    b_packed = pack('>q', b)
    if len(b_packed) > 4:
        b_packed = b_packed[-4:]
    return b_packed
Regards, Manoj
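Note that keeping only the last 4 bytes fits S9(7) but not FileTime, which is S9(11) and needs 6 bytes. A sketch of a variant that derives the width from the digit count is below. The name `pack_comp3` is hypothetical, and the sign nibbles are an assumption: 0xC for positive and 0xD for negative in signed fields, 0xF for unsigned fields.

```python
def pack_comp3(value: int, digits: int, signed: bool = True) -> bytes:
    """Encode an integer as packed decimal (COMP-3) for PIC [S]9(digits)."""
    value = int(value)
    if value < 0 and not signed:
        raise ValueError("negative value for an unsigned field")
    text = str(abs(value))
    if len(text) > digits:
        raise ValueError(f"{value} does not fit in {digits} digits")
    # Sign nibble (assumed convention): C positive, D negative, F unsigned.
    sign = "F" if not signed else ("D" if value < 0 else "C")
    # Left-pad with zeros to the declared digit count, then append the sign.
    nibbles = text.zfill(digits) + sign
    # An even digit count leaves an odd number of nibbles: pad one more zero.
    if len(nibbles) % 2:
        nibbles = "0" + nibbles
    return bytes.fromhex(nibbles)
```

Under these assumptions, `pack_comp3(2023115, 7)` produces the 4 bytes 20 23 11 5C, and any S9(11) value produces 6 bytes.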
Hi @ManojKolisetty-git, sorry for the late response.
COMP-3 fields should not be encoded in CP037 or any other encoding. Encodings apply only to strings and to numerics in DISPLAY format. COMP-3 defines a binary representation of numbers that is independent of character encoding.
Did the code above work in the end to create COMP-3 fields?
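In practice this means the record is best assembled as bytes: text fields go through the cp037 codec, and the packed fields are appended unchanged. A minimal sketch for the HEADER layout from this thread, where `build_header` and the sample values are hypothetical:

```python
def build_header(key: str, status: str, file_date: bytes, file_time: bytes) -> bytes:
    """Assemble one 155-byte HEADER record from text and pre-packed COMP-3 fields."""
    record = (
        key.ljust(32).encode("cp037")       # PIC X(32): EBCDIC text
        + status.ljust(1).encode("cp037")   # PIC X(01): EBCDIC text
        + file_date                         # S9(7) COMP-3: raw bytes, not encoded
        + file_time                         # S9(11) COMP-3: raw bytes, not encoded
        + "".ljust(112).encode("cp037")     # Filler: EBCDIC spaces (0x40)
    )
    if len(record) != 155:
        raise ValueError(f"expected 155 bytes, got {len(record)}")
    return record


# Sample values only: the packed bytes are written out by hand here.
header = build_header(
    "HDR", "A",
    bytes.fromhex("2023115c"),      # 2023115 as S9(7), 4 bytes
    bytes.fromhex("00000123456c"),  # 123456 as S9(11), 6 bytes
)
```

The resulting bytes would then be written with a binary writer, for example `open(path, "wb")`, since a text writer may re-encode the packed bytes or append line separators.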