cobrix
Variable size file depends on a Copybook field value
Hi guys
I have a file that I receive via FTP, this file is in EBCDIC format and does not have RDW. In Copybook there is a field MODEL-DECISION PIC X(01). that coming S or N (Yes or No)
If it is S, I have to read 60 more bytes of complement fields under MODEL-COMPLEMENT (MODEL-FIELD-18, MODEL-FIELD-19, MODEL-FIELD-20), for a total of 220 bytes. If it is N, I don't have to read anything else and should jump to the next record, for a total of 160 bytes. I wrote the copybook with a REDEFINES (MODEL-NO-DECISION, containing MODEL-FILLER-1 PIC X) to try to make Spark read sometimes 220 bytes and sometimes 160 bytes, but it doesn't work.
I've already tried several possible combinations:
.option("is_record_sequence", "true")
.option("is_rdw_part_of_record_length", "true")
.option("record_length_field", "160") (or "220")
Here's a screenshot of the file, and the copybook:
Can you help me?
01 MODEL-REG.
   03 MODEL-FIELD-1     PIC 9(09) COMP-3.
   03 MODEL-FIELD-2     PIC 9(05) COMP-3.
   03 MODEL-FIELD-3     PIC 9(02).
   03 MODEL-FIELD-4     PIC X(55).
   03 MODEL-FIELD-5     PIC X(01).
   03 MODEL-FIELD-6     PIC X(04).
   03 MODEL-FIELD-7     PIC X(45).
   03 MODEL-FIELD-8     PIC X(06).
   03 MODEL-FIELD-9     PIC X(19).
   03 MODEL-FIELD-10    PIC 9(09) COMP-3.
   03 MODEL-FIELD-11    PIC X(02).
   03 MODEL-FIELD-12    PIC 9(05) COMP-3.
   03 MODEL-FIELD-13    PIC X(01).
   03 MODEL-FIELD-14    PIC 9(05) COMP-3.
   03 MODEL-FIELD-15    PIC 9(05) COMP-3.
   03 MODEL-FIELD-16    PIC 9(04) COMP.
   03 MODEL-DECISION    PIC X(01).
   03 MODEL-COMPLEMENT.
      05 MODEL-FIELD-18 PIC X(55).
      05 MODEL-FIELD-19 PIC X(01).
      05 MODEL-FIELD-20 PIC 9(04).
   03 MODEL-NO-DECISION REDEFINES MODEL-COMPLEMENT.
      05 MODEL-FILLER-1 PIC X.
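As a sanity check, the field sizes in this copybook do add up to the lengths described: COMP-3 (packed decimal) fields occupy ceil((digits + 1) / 2) bytes, PIC 9(04) COMP is a 2-byte binary halfword, and DISPLAY fields take one byte per character. A quick sketch of the arithmetic:

```scala
// Sanity check of the record lengths implied by the copybook above.
// COMP-3 (packed decimal): ceil((digits + 1) / 2) bytes.
def comp3(digits: Int): Int = (digits + 2) / 2

val basePart = Seq(
  comp3(9),             // MODEL-FIELD-1  PIC 9(09) COMP-3 -> 5 bytes
  comp3(5),             // MODEL-FIELD-2  PIC 9(05) COMP-3 -> 3 bytes
  2,                    // MODEL-FIELD-3  PIC 9(02) DISPLAY
  55, 1, 4, 45, 6, 19,  // MODEL-FIELD-4..9 (PIC X fields)
  comp3(9),             // MODEL-FIELD-10 PIC 9(09) COMP-3 -> 5 bytes
  2,                    // MODEL-FIELD-11 PIC X(02)
  comp3(5),             // MODEL-FIELD-12 PIC 9(05) COMP-3 -> 3 bytes
  1,                    // MODEL-FIELD-13 PIC X(01)
  comp3(5),             // MODEL-FIELD-14 PIC 9(05) COMP-3 -> 3 bytes
  comp3(5),             // MODEL-FIELD-15 PIC 9(05) COMP-3 -> 3 bytes
  2,                    // MODEL-FIELD-16 PIC 9(04) COMP (binary halfword)
  1                     // MODEL-DECISION PIC X(01)
).sum

val complement = 55 + 1 + 4  // MODEL-FIELD-18..20

println(basePart)               // 160
println(basePart + complement)  // 220
```

So the short record is 160 bytes with MODEL-DECISION as its last byte, and the long record is 220 bytes.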
Hi, currently the only possible way to load such files is to use a custom record parser or a custom record extractor. An example of a custom record parser can be found here: https://github.com/AbsaOSS/cobrix/blob/master/examples/spark-cobol-app/src/main/scala/com/example/spark/cobol/app/SparkCodecApp.scala
An example of a custom raw record extractor can be found in a unit test: http://github.com/AbsaOSS/cobrix/blob/fb38219fff386cdd51634ac6b72587bf871031fa/spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/source/integration/Test26CustomRecordExtractor.scala#L41-L41
Of these two methods, the custom raw record extractor is preferable and should be easier (although neither is trivial, since both involve creating a class and parsing raw data).
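To illustrate the idea, here is a minimal, self-contained sketch of the splitting logic such an extractor would implement. This is plain Scala, not the actual Cobrix RawRecordExtractor API: it reads the 160-byte fixed part, checks the last byte (MODEL-DECISION) against 'S' in EBCDIC code page 037 (0xE2), and appends the 60-byte complement when present:

```scala
import java.io.{DataInputStream, EOFException, InputStream}

// Hypothetical standalone sketch (not the Cobrix API): splits a stream of
// EBCDIC records whose length depends on the MODEL-DECISION byte.
class DecisionRecordIterator(in: InputStream) extends Iterator[Array[Byte]] {
  private val BaseLen = 160         // fixed part; MODEL-DECISION is its last byte
  private val ComplementLen = 60    // MODEL-COMPLEMENT (55 + 1 + 4 bytes)
  private val EbcdicS = 0xE2.toByte // 'S' in EBCDIC code page 037

  private val data = new DataInputStream(in)
  private var nextRecord: Array[Byte] = fetch()

  // Reads one full record, or returns null at end of stream.
  private def fetch(): Array[Byte] = {
    val base = new Array[Byte](BaseLen)
    try {
      data.readFully(base)
    } catch {
      case _: EOFException => return null
    }
    if (base(BaseLen - 1) == EbcdicS) {
      // Decision byte is 'S': the 60-byte complement follows.
      val full = new Array[Byte](BaseLen + ComplementLen)
      System.arraycopy(base, 0, full, 0, BaseLen)
      data.readFully(full, BaseLen, ComplementLen)
      full
    } else {
      base // 'N' (or anything else): the record ends here
    }
  }

  override def hasNext: Boolean = nextRecord != null

  override def next(): Array[Byte] = {
    val rec = nextRecord
    nextRecord = fetch()
    rec
  }
}
```

In Cobrix itself, the same logic would go into a class implementing the raw record extractor interface shown in the linked test, and be registered via the `record_extractor` option (e.g. `.option("record_extractor", classOf[MyExtractor].getName)` — option name as used in that test). Cobrix then parses each returned byte array against the copybook, so the fields under MODEL-COMPLEMENT simply come out null for the short records.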
Let me know if it works for you.
I'll keep this issue open; adding a feature for defining a mapping from a field value to a record size could be useful.