Optional list of images to ignore when inferring nonvisual reading

Open HadrienGardeur opened this issue 11 months ago • 6 comments

Our current inference rule for "accessModeSufficient": "textual" only applies to publications that either:

contain no images
or contain only a cover image
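
For reference, the current rule could be sketched roughly as follows. The `Image` type, its `is_cover` flag, and the `infers_textual` function are hypothetical names for illustration, not the toolkit's actual data model, and the sketch assumes "only a cover" means exactly one image that is flagged as the cover.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for an image resource in a publication."""
    href: str
    is_cover: bool = False


def infers_textual(images: list[Image]) -> bool:
    """Return True when "textual" can be inferred under the current rule.

    The rule holds when the publication has no images at all, or when
    the only image present is the cover (assumed: exactly one image).
    """
    if not images:
        return True
    return len(images) == 1 and images[0].is_cover
```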

We'd like to add the ability to also ignore images based on their cryptographic or perceptual hash. This would cover a list of well-known images (such as logos or decorative images shared across a collection) that are automatically treated as decorative when inferring "accessModeSufficient": "textual".
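
One possible shape for the extended rule is sketched below. It assumes cryptographic hashes are SHA-256 hex digests of the raw image bytes (this issue does not name an algorithm), and `Image`, `sha256_hex`, `infers_textual_with_ignores`, and `ignored_hashes` are all hypothetical names. Perceptual hashes are left out of the sketch, since they would likely need a near-match comparison instead of exact set membership.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for an image resource and its raw bytes."""
    data: bytes
    is_cover: bool = False


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest (assumed algorithm, not specified here)."""
    return hashlib.sha256(data).hexdigest()


def infers_textual_with_ignores(images, ignored_hashes):
    """Apply the existing rule after dropping images on the ignore list."""
    # Images whose hash is on the list are treated as decorative.
    remaining = [
        img for img in images if sha256_hex(img.data) not in ignored_hashes
    ]
    if not remaining:
        return True
    # Same assumption as the current rule: a single image that is the cover.
    return len(remaining) == 1 and remaining[0].is_cover
```

With this approach the ignore list only filters the image set, so the existing "no images or cover only" check stays unchanged.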

We need to support two types of inputs:

a directory of images to ignore (for a manual workload)
or a list of hashes (for automated workloads)
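
Both inputs could be normalised into the same set of hashes before inference runs. The sketch below is only an illustration: the function names are hypothetical, SHA-256 is an assumed algorithm, and the "one hex digest per line, `#` for comments" list format is an assumption, not something decided in this issue.

```python
import hashlib
from pathlib import Path


def hashes_from_directory(directory: str) -> set[str]:
    """Hash every regular file directly inside `directory` (manual workload)."""
    return {
        hashlib.sha256(path.read_bytes()).hexdigest()
        for path in Path(directory).iterdir()
        if path.is_file()
    }


def hashes_from_list(text: str) -> set[str]:
    """Parse one hex digest per line (automated workload).

    Blank lines and lines starting with '#' are skipped; digests are
    lowercased so they compare equal to hashlib's output.
    """
    hashes = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hashes.add(line.lower())
    return hashes
```

Since both functions return a plain set of digests, the two sources can be merged with a set union when a caller supplies both.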

Jan 04 '25 14:01 HadrienGardeur