go-toolkit

Optional list of images to ignore when inferring nonvisual reading

Open HadrienGardeur opened this issue 11 months ago • 6 comments

Our current inference rule for "accessModeSufficient": "textual" is limited to publications that either:

  • contain no images
  • or contain a single image that is the cover
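
For reference, the current rule can be sketched roughly as follows. This is only an illustration: the `Image` type, its `IsCover` field, and the `inferTextual` function are hypothetical names and do not reflect the toolkit's actual types or API.

```go
package main

// Image is a hypothetical, simplified view of an image resource in a
// publication. It is an assumption made for this sketch only.
type Image struct {
	Href    string
	IsCover bool
}

// inferTextual reports whether "accessModeSufficient": "textual" could be
// inferred under the current rule: the publication has no images, or its
// only image is the cover.
func inferTextual(images []Image) bool {
	switch len(images) {
	case 0:
		return true
	case 1:
		return images[0].IsCover
	default:
		return false
	}
}
```

Under this sketch, a publication with a cover plus a single shared logo already fails the check, which is the case the proposal is meant to address.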

We'd like to add the ability to ignore images based on their cryptographic or perceptual hash as well. This would be used for a list of well-known images (such as logos or decorative images shared across a collection) that would be automatically considered as decorative when inferring "accessModeSufficient": "textual".

We need to support two types of inputs:

  • a directory of images to ignore (for a manual workload)
  • or a list of hashes (for automated workloads)

HadrienGardeur, Jan 04 '25 14:01