Ruslan Yushchenko
Hi, you can get field lengths and other parameters from the AST produced by parsing a copybook with `CopybookParser.parseSimple(copyBookContents)`. Example: https://github.com/AbsaOSS/cobrix#spark-sql-schema-extraction When invoking `parseSimple()` you get an AST that you can...
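For what it's worth, a rough sketch of what such a traversal could look like. The class and property names (`Group`, `Primitive`, `binaryProperties.dataSize`) are from my reading of the parser's `ast` package and may differ slightly in your version:

```scala
import za.co.absa.cobrix.cobol.parser.CopybookParser
import za.co.absa.cobrix.cobol.parser.ast.{Group, Primitive, Statement}

// copybookContents is assumed to hold the copybook text as a String
val copybook = CopybookParser.parseSimple(copybookContents)

// Recursively walk the AST, printing each field with its size in bytes
def visit(node: Statement, indent: Int = 0): Unit = node match {
  case g: Group =>
    println(" " * indent + s"GROUP ${g.name}")
    g.children.foreach(child => visit(child, indent + 2))
  case p: Primitive =>
    println(" " * indent + s"FIELD ${p.name}, size = ${p.binaryProperties.dataSize} bytes")
}

copybook.ast.children.foreach(node => visit(node))
```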
I'm also thinking of adding a metadata field to the generated Spark schema that will contain maximum lengths of string fields, so converting this question to a feature request.
The new metadata field (`maxLength`) for each Spark schema column is now available in the `master` branch. Here are details on this: https://github.com/AbsaOSS/cobrix#spark-schema-metadata You can try it out by cloning...
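Once a DataFrame is loaded, the metadata can be read off the schema like this (a minimal sketch; `df` is assumed to come from the `cobol` data source, and only string columns are expected to carry `maxLength`):

```scala
// Print the maxLength metadata of every column that has it
df.schema.fields.foreach { field =>
  if (field.metadata.contains("maxLength")) {
    println(s"${field.name}: maxLength = ${field.metadata.getLong("maxLength")}")
  }
}
```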
Hi, could you attach an example copybook and a link to the documentation for the data type, please?
I see. The data types look parsable at first glance. The only thing is that you need a proper copybook that matches the data in order to parse records like that. And...
Duplicate record_id is getting generated when option set to generate_record_id = "true" for US_ASCII
Hi, thanks for the report. Looks like a very interesting bug. Keen to fix it. Could I ask you to attach - the copybook - the file - the exact...
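In case it helps the repro, a minimal read that enables record id generation could look like this (paths are placeholders; `encoding = "ascii"` selects the ASCII text reader, and `generate_record_id` adds `File_Id` and `Record_Id` columns):

```scala
val df = spark.read
  .format("cobol")
  .option("copybook", "/path/to/copybook.cpy")  // placeholder path
  .option("encoding", "ascii")                  // US-ASCII text files
  .option("generate_record_id", "true")         // adds File_Id and Record_Id columns
  .load("/path/to/data")                        // placeholder path

df.select("File_Id", "Record_Id").show()
```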
I'm going to try to reproduce the issue from your description; it might take a while, since for our use cases record id generation works as expected.
So far I am unable to reproduce it. Whenever I parse various text files I get unique record_id values. Could you also check `File_Id`? `Record_Id` alone is not unique when multiple files are read. But...
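If the duplicates come from reading multiple files, combining the two generated columns gives a key that is unique across the whole read (a sketch; the column names match the ones Cobrix generates, the `unique_id` name is just illustrative):

```scala
import org.apache.spark.sql.functions.{col, concat_ws}

// Build a composite id that stays unique when more than one file is read
val withUniqueId = df.withColumn(
  "unique_id",
  concat_ws("_", col("File_Id"), col("Record_Id"))
)
```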
Hi @Loganhex2021, how is it going? The upcoming version (`0.2.10-SNAPSHOT`, current `master`) has safeguards against partial record parsing caused by overly long ASCII lines. You can try if it...
Yes, it is very helpful! I will try to reproduce it.