For Python 3.13: A drop-in replacement for `sndhdr.what()` and `sndhdr.whathdr()`

Open cclauss opened this issue 1 year ago • 3 comments

Like #72 but for sndhdr instead of imghdr. Given that puremagic.what() is now mentioned in What's New in Python 3.13, should we do something for the sound file formats in https://docs.python.org/3/library/sndhdr.html, since sndhdr will also be removed in Python 3.13? Does puremagic support the twelve sound file formats that sndhdr supports?

It might be quite helpful to have a drop-in replacement for sndhdr.what(). It would provide a smooth transition to Py3.13 if developers could confidently replace all instances of sndhdr.what() with puremagic.what() -- same args, same results.
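To make "same args, same results" concrete: `sndhdr.what(filename)` returns a `SndHeaders` namedtuple (filetype, framerate, nchannels, nframes, sampwidth) or `None`, so a true drop-in has to report more than a file type. The following is only a rough sketch of that result shape for WAV files, built on the standard-library `wave` module. The helper name `what_wav` is hypothetical and is not part of puremagic or sndhdr.

```python
import wave
from collections import namedtuple

# Same field names and order as the tuple returned by sndhdr.what().
SndHeaders = namedtuple(
    "SndHeaders", "filetype framerate nchannels nframes sampwidth"
)


def what_wav(filename):
    """Hypothetical helper: sndhdr-style result for WAV files only.

    Returns a SndHeaders tuple, or None if the file cannot be read as WAV.
    """
    try:
        with wave.open(filename, "rb") as w:
            return SndHeaders(
                "wav",
                w.getframerate(),
                w.getnchannels(),
                w.getnframes(),
                8 * w.getsampwidth(),  # sndhdr reports sample width in bits
            )
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, or too short to contain a header.
        return None
```

Calling `what_wav("example.wav")` (a placeholder path) would give a tuple that can be compared field by field with what `sndhdr.what()` returns on Python 3.12 and earlier.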

  • python/cpython#120871
  • #72
  • #76
  • #81

@NebularNerd

cclauss avatar Jun 24 '24 14:06 cclauss

I feel a PR coming on 😁

sndhdr.py

Unlike imghdr, sndhdr seems to perform some fancier tests to provide better information. We can look to borrow some of this, especially for #70 and #69. With the current state of PureMagic we can certainly make a good job of identifying these files with decent confidence.

sndhdr supports these formats:

  • aifc Compressed Audio Interchange Files: ✔️ Looks to support it, some tidying/enhancements required
  • aiff Audio Interchange Files: ✔️ Looks to support it, some tidying/enhancements required
  • au Au Files: ✔️ Looks to support it, some tidying/enhancements required
  • hcom HCOM Files: ❌ Not supported
  • sndt Sndtool Sound Files: ❌ Not supported
  • voc Creative Labs Audio Files: ✔️ Supported, some tidying required
  • wav Waveform Audio File Format Files: ✔️ Supported, maybe some enhancements later for V2.0
  • 8svx 8-Bit Sampled Voice Files: ✔️ Supported
  • sb Signed Byte Audio Data Files: ❌ Not supported
  • ub UB Files: ❌ Not supported
  • ul uLAW Audio Files: ❌ Not supported
  • sndr ❌ Not mentioned in the docs, but there is a test for it in sndhdr.py. It's used by Sounder
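For reference, here is a rough sketch of the header signatures behind the formats above that have an identifiable magic number. It is based on a reading of CPython's sndhdr.py and is simplified (for example, sndhdr also checks for the `fmt ` chunk in WAV files and accepts two byte-swapped AU markers). The function name `sniff_sound_type` is hypothetical, and this is not how puremagic implements its matching.

```python
def sniff_sound_type(header: bytes):
    """Hypothetical helper: map the first bytes of a file to a sndhdr-style name.

    Returns None when no known signature matches.
    """
    if header.startswith(b"FORM"):
        # IFF container: the form type sits at offset 8.
        form_type = header[8:12]
        if form_type == b"AIFC":
            return "aifc"
        if form_type == b"AIFF":
            return "aiff"
        if form_type == b"8SVX":
            return "8svx"
        return None
    if header.startswith(b".snd"):
        return "au"
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    if header.startswith(b"Creative Voice File\x1a"):
        return "voc"
    if header[65:69] == b"FSSD" and header[128:132] == b"HCOM":
        return "hcom"
    if header.startswith(b"SOUND"):
        return "sndt"
    if header.startswith(b"\0\0") and len(header) >= 4:
        # Sounder files: two zero bytes, then a little-endian sample rate.
        rate = int.from_bytes(header[2:4], "little")
        if 4000 <= rate <= 25000:
            return "sndr"
    return None
```

The sb, ub and ul types are absent because sndhdr.py has no header test for them.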

Notes:

  • HCOM: a weird, possibly ancient Mac format; there seems to be little to no information about it aside from here
  • SNDT: again, almost no info beyond a mention here and some sample files here
  • SB and UB: again, not much info aside from here, and there is no specific test in sndhdr.py. At a guess, I'll assume these are some sort of raw stream (signed or unsigned). At present this may be beyond the scope of PureMagic, as we would need to explicitly force a raw stream test somehow.
  • UL: uLAW is a label that can be applied to a multitude of audio formats, and there is no specific test in sndhdr.py. I remember this format from later in the Amiga era being mainly for speech and comms, so I'll assume that's what they are aiming for. I'll likely try to match any/all things it could be.
  • SNDR: Sounder is some old DOS/Win program; not much info at present other than here

This is going to be a trickier PR. Realistically, we can at least try to ensure PureMagic can understand the files at a basic level, but the advanced tests will rely on how the V2.0 testing logic pans out.

NebularNerd avatar Jun 24 '24 17:06 NebularNerd

Hi @cclauss, my PR #87 has been merged with updates for most of the missing sound formats from sndhdr. Unlike the imghdr support, it's not a complete replacement, for a few reasons mentioned in the PR. Hopefully once v2 is in progress we can look at ways to handle more esoteric formats 🙂

NebularNerd avatar Jul 14 '24 09:07 NebularNerd

Awesome! I did not see that PR. I agree: when I looked into sndhdr, I knew that making a complete replacement would be much more difficult than for the relatively simple imghdr module. I see two things that should be added.

  1. A section in README.rst like https://github.com/cdgriffith/puremagic?tab=readme-ov-file#imghdr-replacement
    • Or a separate imghdr_and_sndhdr_replacements.rst file that goes into more detail like your commit messages in #87
  2. A pull request like:
    • python/cpython#120871 but focused on sndhdr.

cclauss avatar Jul 14 '24 11:07 cclauss