Mapping of files to PUIDs

Open · dd388 opened this issue 4 years ago · 1 comment

Would it be useful for this repository to include something that maps every item in format-corpus to its respective PRONOM PUID? I am currently doing some comparison testing between file identification utilities and have found this corpus helpful, but as it stands there's no standard way of knowing any file's expected PUID. For my own testing, I've created a spreadsheet of items with my best guess at the appropriate PUID for each. I'm not sure it's 100% accurate, but it might be a start.
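To make the idea concrete, here is a rough sketch of how such a mapping might be used in comparison testing. Everything in it is an assumption for illustration: the CSV layout with `path` and `puid` columns, the file names, the PUID values, and the helpers `load_expected` and `compare` are hypothetical and do not exist in the repository.

```python
import csv
import io


def load_expected(csv_text):
    """Parse CSV text with 'path' and 'puid' columns into a dict (hypothetical layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["path"]: row["puid"].strip() for row in reader}


def compare(expected, observed):
    """Compare expected PUIDs with the PUIDs an identification tool reported.

    Returns (matches, mismatches, missing):
      matches    - paths where the tool agreed with the mapping
      mismatches - path -> (expected PUID, reported PUID)
      missing    - paths the tool gave no result for
    """
    matches, mismatches, missing = [], {}, []
    for path, puid in sorted(expected.items()):
        if path not in observed:
            missing.append(path)
        elif observed[path] == puid:
            matches.append(path)
        else:
            mismatches[path] = (puid, observed[path])
    return matches, mismatches, missing


# Illustrative mapping only; these paths and PUIDs are placeholders,
# not entries taken from format-corpus.
MAPPING_CSV = """path,puid
example/a.pdf,fmt/18
example/b.png,fmt/11
example/c.bin,UNKNOWN
"""

expected = load_expected(MAPPING_CSV)

# Stand-in for results collected from one identification utility.
observed = {"example/a.pdf": "fmt/18", "example/b.png": "fmt/12"}

matches, mismatches, missing = compare(expected, observed)
print("matches:", matches)
print("mismatches:", mismatches)
print("missing:", missing)
```

Keeping the mapping as a plain CSV would let the same file be checked against the output of several tools, whatever format each tool reports in, once its results are reduced to a path-to-PUID dictionary.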

dd388 · Jul 23 '21 14:07

I'd love to see this.

There's also been a lot of progress on adding data to Wikidata and enabling Siegfried to identify Wikidata IDs directly. If you had time to try that out and add those IDs as well, that would be wonderful. I'm sure the Siegfried folks would love any feedback you might have, as it's new functionality.

euanc · Jul 23 '21 14:07