
Counted leader elements in marc when encoding to marc

Open TobiasNx opened this issue 10 months ago • 3 comments

@maipet hinted that encode-marc21 or encode-marcxml cannot create the leader correctly since the elements are not counted. Could you elaborate on the problem?

TobiasNx avatar Apr 09 '24 20:04 TobiasNx

Is this related to #454?

blackwinter avatar Apr 10 '24 09:04 blackwinter

I am not sure. We are transforming the OERSI JSON data to MARC, but @maipet told me about invalid results created by the transformation due to the missing leader elements that state, e.g., the length of a record.

But @maipet could clarify.

TobiasNx avatar Apr 10 '24 09:04 TobiasNx

You can set the leader field, but shouldn't leader "Character Positions 00-04 - Record length" and "Pos. 12-16 - Base address of data" actually be generated automatically? We discussed with @dr0i that we should first check whether the MARC records from OERSI are 'valid' even without the correct information in the leader (these positions are currently filled with zeros).
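To illustrate what "generated automatically" would mean for these two positions, here is a minimal sketch of how they can be counted for a binary MARC 21 (ISO 2709) record. It is not the metafacture implementation; the function names `counted_leader_positions` and `fill_leader` are hypothetical, and the fields are assumed to be already encoded as bytes without their terminators.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
LEADER_LENGTH = 24
# One directory entry: 3 (tag) + 4 (field length) + 5 (start position)
DIRECTORY_ENTRY_LENGTH = 12


def counted_leader_positions(fields):
    """Return (record_length, base_address) for encoded field bodies.

    `fields` holds the raw bytes of each field WITHOUT its terminator.
    Hypothetical helper, not part of metafacture.
    """
    # Leader + directory + the terminator that closes the directory
    base_address = (
        LEADER_LENGTH
        + DIRECTORY_ENTRY_LENGTH * len(fields)
        + len(FIELD_TERMINATOR)
    )
    # Every field is followed by a field terminator
    data_length = sum(len(body) + len(FIELD_TERMINATOR) for body in fields)
    record_length = base_address + data_length + len(RECORD_TERMINATOR)
    return record_length, base_address


def fill_leader(leader, fields):
    """Overwrite positions 00-04 and 12-16 of a 24-character leader."""
    if len(leader) != LEADER_LENGTH:
        raise ValueError("a MARC 21 leader must have 24 characters")
    record_length, base_address = counted_leader_positions(fields)
    if record_length > 99999:
        raise ValueError("record too long for a five-digit record length")
    # Positions 05-11 and 17-23 are kept exactly as provided
    return f"{record_length:05d}{leader[5:12]}{base_address:05d}{leader[17:]}"


example_fields = [b"12345", b"  \x1faTitle"]
print(fill_leader("00000naa a2200000uc 4500", example_fields))
```

Only the two counted positions are replaced; everything else in the leader (record status, type, encoding level and so on) passes through unchanged.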

maipet avatar Apr 10 '24 11:04 maipet

While inspecting some workaround for #454, I saw that the marc21-encoder seems to have a mechanism for that:

https://metafacture.org/playground/?flux=%22https%3A//d-nb.info/1106253078/about/marcxml%22%0A%7C+open-http%28accept%3D%22application/xml%22%29%0A%7C+decode-xml%0A%7C+handle-marcxml%0A%7C+fix%28transformationFile%29%0A%7C+encode-marc21%0A%7C+decode-marc21%28emitLeaderAsWhole%3D%22true%22%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=copy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0Acopy_field%28%22leader%22%2C%[email protected]%22%29%0A%0Asubstring%28%[email protected]%22%2C%225%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%226%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%227%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%228%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%229%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%2217%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%2218%22%2C%221%22%29%0Asubstring%28%[email protected]%22%2C%2219%22%2C%221%22%29%0A%0Amove_field%28%22@leader%22%2C%22leader%22%29

Someone more advanced should have a look to confirm. We could probably reuse parts of encode-marc21 for encode-marcxml.

TobiasNx avatar Apr 15 '24 09:04 TobiasNx

The construction of the leader (counting bytes, including indicators etc.) is done by invoking Marc21Decoder.java (which calls Record.java). Code can be reused for encode-marc21, although it's ugly from a performance point of view (the whole record has to be turned into type Record at the end of parsing a record). This will be done in my PR treating https://github.com/metafacture/metafacture-core/issues/454.

dr0i avatar Apr 18 '24 15:04 dr0i

> Code can be reused for encode-marc21

@dr0i: Isn't encode-marc21 already doing this? See: https://github.com/metafacture/metafacture-core/issues/524#issuecomment-2056344931

TobiasNx avatar Apr 19 '24 07:04 TobiasNx

Functional review @TobiasNx and @maipet. Deployed to test-Playground: metafacture-framework feature-454-allowMarc21EncoderToGetLeaderAsOneString-SNAPSHOT.

Note that the generated leader is 02934naa a2200649uc 4500 while the original input was <leader>00000naa a2200000uc 4500</leader>. So the leader seems to be correct (record size and also other parts, while the type etc. is preserved...)
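As a rough plausibility check of such a generated leader, the two counted positions can be read back out and compared with the directory arithmetic of ISO 2709 (24-character leader, 12 characters per directory entry, one terminator after the directory). This is only a sketch; `read_counted_positions` is a hypothetical helper, not a metafacture function.

```python
def read_counted_positions(leader):
    """Return (record_length, base_address) from a 24-character leader."""
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader must have 24 characters")
    # Positions 00-04: record length, positions 12-16: base address of data
    return int(leader[0:5]), int(leader[12:17])


record_length, base_address = read_counted_positions("02934naa a2200649uc 4500")

# The directory sits between the leader and the base address and is closed
# by one field terminator, so its size must be a multiple of 12.
directory_size = base_address - 24 - 1
entry_count, remainder = divmod(directory_size, 12)

print(record_length, base_address, entry_count, remainder)
# prints "2934 649 52 0"
```

A remainder of zero means the base address is consistent with a whole number of directory entries (52 here), which supports the impression that the counted values are correct.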

dr0i avatar Apr 19 '24 14:04 dr0i

Added my review here: https://github.com/metafacture/metafacture-core/pull/526#issuecomment-2068705357

One scenario is still not working; otherwise this seems to work for me. But @maipet has more knowledge about the leader.

TobiasNx avatar Apr 22 '24 07:04 TobiasNx

It seems that this is not solved for encode-marcxml. The leader positions at the beginning and in the middle are still 00000.

TobiasNx avatar Apr 24 '24 11:04 TobiasNx

Ahhh, I now see what the problem is here: encode-marcxml still lacks the ability to generate the counted leader info. I did not review this properly, sorry.

TobiasNx avatar Apr 24 '24 11:04 TobiasNx

We decided with @maipet and @dr0i that marcXML does not need to count. Instead it should either use the provided leader info if the leader is provided as a whole (even if the record itself has changed), or set leader positions 00-04 and 12-16 to zero if the leader is only provided as separate elements, as is done by decode-marc21.

For further info see: https://github.com/metafacture/metafacture-core/issues/527#issuecomment-2076585889
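The decision above can be sketched roughly as follows. This is an illustration of the agreed behaviour only, not the encode-marcxml code; the function `marcxml_leader` and the position-keyed `parts` dictionary are assumptions made for this example. Positions 10-11 and 20-23 are filled with their fixed MARC 21 values (`22` and `4500`).

```python
# Template for a leader built from separate elements:
# 00-04 and 12-16 zeroed, 10-11 and 20-23 fixed, the rest blank.
LEADER_TEMPLATE = "00000" + " " * 5 + "22" + "00000" + " " * 3 + "4500"
# Positions that may be supplied as separate elements
SETTABLE_POSITIONS = {5, 6, 7, 8, 9, 17, 18, 19}


def marcxml_leader(whole=None, parts=None):
    """Build the MARCXML leader without counting anything.

    `whole`: a complete 24-character leader, used unchanged if given.
    `parts`: hypothetical mapping of leader position -> single character.
    """
    if whole is not None:
        if len(whole) != 24:
            raise ValueError("a MARC 21 leader must have 24 characters")
        # Used as provided, even if the record itself has changed
        return whole
    chars = list(LEADER_TEMPLATE)
    for position, value in (parts or {}).items():
        if position not in SETTABLE_POSITIONS or len(value) != 1:
            raise ValueError(f"cannot set leader position {position!r}")
        chars[position] = value
    return "".join(chars)


print(marcxml_leader(whole="02934naa a2200649uc 4500"))
print(marcxml_leader(parts={5: "n", 6: "a", 7: "a", 9: "a", 17: "u", 18: "c"}))
```

With a whole leader the counted values are passed through as they are; with separate elements the result has zeros in positions 00-04 and 12-16.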

TobiasNx avatar Apr 25 '24 08:04 TobiasNx