PRONOM_Research
x-fmt/80 (Mac Pict) and fmt/1427 (Mac Draw 2) signatures overlap
Signatures for these two formats currently overlap.
x-fmt/80: 44525747(4D44|4432){516}1101
fmt/1427: 44525747(0000|4432)
So an x-fmt/80 file with 'D2' (0x4432) following the DRWG header will necessarily also match fmt/1427. This is evident in the file 'DOODLE.PCT' found in the EDRM 1.0 dataset (https://edrm.net/resources/data-sets/#1598455996696-88a3bd82-aedf), which currently gets a dual identification outcome based on signature.
Not yet sure of the best resolution, so just raising an issue for now...
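As a rough illustration of the overlap, here is a minimal Python sketch. The regular expressions are my own hand translation of the two byte sequences quoted above (this is not how DROID evaluates them), and the sample bytes are invented for the example rather than taken from DOODLE.PCT.

```python
import re

# Hand-translated, BOF-anchored approximations of the two PRONOM sequences.
# These regexes are an assumption for illustration, not DROID's own matcher.
X_FMT_80 = re.compile(rb"DRWG(?:MD|D2).{516}\x11\x01", re.DOTALL)  # 44525747(4D44|4432){516}1101
FMT_1427 = re.compile(rb"DRWG(?:\x00\x00|D2)", re.DOTALL)          # 44525747(0000|4432)


def matches(data: bytes) -> list:
    """Return the PUIDs whose approximate signature matches at the start of data."""
    hits = []
    if X_FMT_80.match(data):
        hits.append("x-fmt/80")
    if FMT_1427.match(data):
        hits.append("fmt/1427")
    return hits


# Invented sample: 'DRWG' + 'D2', 516 filler bytes, then 0x1101.
sample = b"DRWGD2" + bytes(516) + b"\x11\x01"
print(matches(sample))
```

Any input that satisfies the 'D2' branch of x-fmt/80 also satisfies fmt/1427, because the fmt/1427 pattern stops at offset 0x05.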
CC @thorsted
Hmm, I went back and looked at my original submission, and there might have been a segment missed for fmt/1427.
<InternalSignature ID="3" Specificity="Specific">
<ByteSequence Reference="BOFoffset">
<SubSequence MinFragLength="0" Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
<Sequence>44525747</Sequence>
<DefaultShift>5</DefaultShift>
<Shift Byte="44">4</Shift>
<Shift Byte="52">3</Shift>
<Shift Byte="57">2</Shift>
<Shift Byte="47">1</Shift>
<RightFragment MaxOffset="0" MinOffset="0" Position="1">0000</RightFragment>
<RightFragment MaxOffset="0" MinOffset="0" Position="1">4432</RightFragment>
<RightFragment MaxOffset="0" MinOffset="0" Position="2">0000</RightFragment>
</SubSequence>
</ByteSequence>
</InternalSignature>
The current signature doesn't have that second position fragment. I believe I added it to protect against similarities with other versions, but this was one of my earlier signatures, so any advice would be appreciated. I remember @jayGattusoNLNZ also used the additional zeros in the signature he developed around the same time. Might warrant a second look.
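To make the effect of that second-position fragment concrete, here is a small Python sketch comparing the current fmt/1427 sequence with the one from the original submission (header, then 0000 or 4432, then 0000). The regexes are a hand translation and the test bytes are invented. In particular, the non-zero bytes at offsets 0x06-07 of the PICT-like sample are an assumption, not something checked against DOODLE.PCT.

```python
import re

# Hand-translated approximations; both are assumptions for illustration only.
FMT_1427_CURRENT = re.compile(rb"DRWG(?:\x00\x00|D2)")
FMT_1427_WITH_FRAGMENT = re.compile(rb"DRWG(?:\x00\x00|D2)\x00\x00")


def describe(data: bytes) -> dict:
    """Report which variant of the fmt/1427 signature matches at the start of data."""
    return {
        "current": bool(FMT_1427_CURRENT.match(data)),
        "with_fragment": bool(FMT_1427_WITH_FRAGMENT.match(data)),
    }


# Invented x-fmt/80-like sample with non-zero bytes right after 'D2'.
pict_like = b"DRWGD2" + b"\x01\x02" + bytes(514) + b"\x11\x01"
print(describe(pict_like))

# Invented sample with zeros right after 'D2'.
zero_padded = b"DRWGD2" + bytes(516) + b"\x11\x01"
print(describe(zero_padded))
```

If this reading is right, the extra fragment would only separate the two formats when an x-fmt/80 file has something other than 0000 at offsets 0x06-07. A file with zeros there would still match both.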
@Dclipsham @thorsted should this be picked up by the skeleton suite? Do you have any insight into why it isn't?
From a quick glance at the v97 suite (not the most recent, but just readily available), the whole fmt/1427 file is '44 52 57 47 00 00', and x-fmt/80 begins '44 52 57 47 4D 44', so neither includes the 0x44 32 at offset 0x04-05.
@Dclipsham @thorsted not an ideal solution, but to be in time for the next release, before a closer look can happen, should I prioritise x-fmt/80 over fmt/1427 for v.108? Unless this issue has already been picked up?
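A quick way to see why the skeleton files miss the clash is the sketch below. It uses only the byte values quoted above. The helper name `exercises_overlap` is hypothetical, and only the first six bytes of the x-fmt/80 skeleton are represented because that is all that was quoted.

```python
# Leading bytes as quoted from the v97 skeleton suite.
SKELETON_FMT_1427 = bytes.fromhex("44 52 57 47 00 00")        # whole file
SKELETON_X_FMT_80_START = bytes.fromhex("44 52 57 47 4D 44")  # first six bytes only


def exercises_overlap(data: bytes) -> bool:
    """True if data takes the shared 'D2' (0x4432) branch after the DRWG header."""
    return data[:4] == b"DRWG" and data[4:6] == b"D2"


for name, data in [
    ("fmt/1427 skeleton", SKELETON_FMT_1427),
    ("x-fmt/80 skeleton", SKELETON_X_FMT_80_START),
]:
    print(name, exercises_overlap(data))
```

Both skeletons take the non-shared alternative at offset 0x04-05, so a skeleton that starts '44 52 57 47 44 32' would be needed to exercise the overlapping branch.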
I'm not hugely au fait with the formats yet so can't make that call right now I'm afraid.