pitchy
Voice Pitch Detection
I read through the paper that you linked in the README (http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf). In the conclusion, it mentions that the Tartini project works well with a variety of instruments such as strings, woodwinds, brass and voice, and that the algorithm is optimised for vibrato. What are your thoughts on using it for voice?
I tested the app with a GarageBand piano and also with voice. It was off by ±40 cents. I also tried simple frequencies generated from InsTuner (Tone Generator section), and it seems to be off by ±15 cents.
Is this a known issue? What did your tests look like? Or am I missing something?
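For reference, the cent offsets quoted above can be computed from a detected frequency and a reference frequency. This is only a sketch of the standard equal-temperament arithmetic, not code from pitchy; the function names and the A4 = 440 Hz default are assumptions made for illustration.

```typescript
// Signed distance in cents from a reference frequency to a detected one.
// 100 cents = 1 equal-tempered semitone, 1200 cents = 1 octave.
function centsOff(detectedHz: number, referenceHz: number): number {
  return (1200 * Math.log(detectedHz / referenceHz)) / Math.LN2;
}

// Nearest equal-tempered note as a MIDI note number (assumes A4 = 440 Hz by default).
function nearestMidiNote(hz: number, a4Hz: number = 440): number {
  return Math.round(69 + (12 * Math.log(hz / a4Hz)) / Math.LN2);
}

// Frequency of a MIDI note number under the same tuning assumption.
function midiToHz(midi: number, a4Hz: number = 440): number {
  return a4Hz * Math.pow(2, (midi - 69) / 12);
}
```

With these, the offset a tuner would display for a reading `hz` is `centsOff(hz, midiToHz(nearestMidiNote(hz)))`. An A4 that reads 40 cents sharp corresponds to roughly 450 Hz instead of 440 Hz.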
That's a good question; I haven't specifically tried the algorithm with voice input, since my original (and primary) use-case for myself was for tuning a string instrument, and it seems to work well for that, but that doesn't involve any vibrato. All the current unit test cases for the project are using constant waves of various types (e.g. sine wave, square wave), so it's hard to say. It would definitely be something interesting to try out when I have time to work on something for #15 and try tweaking different parameters such as the update frequency or the number of samples collected.
The screenshot of the Tartini software from the paper seems to indicate that it's probably collecting pitch samples at smaller intervals than something like Temperatune or the usage example in this repository; in the case of the example page, it only updates the pitch every 100ms: https://github.com/ianprime0509/pitchy/blob/2899ed4fec4434ee091c81af990918a82c65c246/website/docs/examples/mic-input.raw.html#L32-L35
So it might also be interesting to create an example page more similar to the screenshot from the paper, where instead of showing a constantly changing note and offset, it would show a graph of pitch over time to make it possible to visualize the vibrato. For a tuner application, it might make more sense to average multiple samples, since otherwise it probably depends on when the sample is taken, especially if there's vibrato involved.
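As a rough illustration of the averaging idea, here is a minimal sketch of a sliding-window mean over recent pitch readings. The class name `PitchAverager` and the window size are hypothetical and not part of pitchy; it uses a plain arithmetic mean in Hz, which is only one of several reasonable choices (a median, or a mean in cents, might behave better with wide vibrato).

```typescript
// Keeps the most recent `size` pitch readings and reports their arithmetic mean.
class PitchAverager {
  private samples: number[] = [];

  constructor(private readonly size: number) {
    if (size < 1 || Math.floor(size) !== size) {
      throw new RangeError("size must be a positive integer");
    }
  }

  // Adds one reading (in Hz) and returns the mean of the current window.
  push(pitchHz: number): number {
    this.samples.push(pitchHz);
    if (this.samples.length > this.size) {
      this.samples.shift(); // drop the oldest reading
    }
    const sum = this.samples.reduce((a, b) => a + b, 0);
    return sum / this.samples.length;
  }
}
```

A tuner could call `push` with every new reading and display the returned value, while a pitch-over-time graph would plot the raw readings instead so the vibrato stays visible.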
Hello! I tried to build a simple tuner app using pitchy, based on the basic code available in the example: https://ianjohnson.dev/pitchy/. It works very well, but it has some occasional random peaks. For example, it can show that the frequency is 100-120 Hz, and the next second it shows 3000 Hz. Could you tell me where the problem might be? Or is there a way to somehow filter the pitch we get?
I'm by no means an expert in audio processing, but there are at least a few things you could try to avoid that (which the playground page will let you experiment with, if you want):
- Increase the minimum clarity needed to consider the pitch to be properly detected. The clarity (the second value returned by the pitch detector) is a number between 0 and 1 indicating how "clear" the pitch is: if the clarity is low, then the algorithm isn't very confident about the pitch. If the correct pitches (between 100-200 Hz) are coming in at a clarity of 0.97 or something and the incorrect pitches (3000 Hz) are coming in around 0.9, then you could try ignoring all pitches detected at a clarity of less than 0.95.
- Simply ignore any pitches detected higher than a certain frequency. If you know that the pitches you're detecting will never go up to 3000 Hz, you could set a cutoff value lower than that and just ignore any higher pitches that are detected. This is probably less ideal than the first option, though.
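Both suggestions can be sketched as one small filter applied to each reading. This is an illustrative sketch only: `PitchReading`, `FilterOptions` and `acceptReading` are hypothetical names, and the thresholds are the example values from the list above, which would need tuning for a real input.

```typescript
// [pitch in Hz, clarity between 0 and 1], the pair returned by the pitch detector.
type PitchReading = [number, number];

interface FilterOptions {
  minClarity: number; // readings with lower clarity are ignored
  maxPitchHz: number; // readings above this frequency are ignored
}

// Returns the pitch if the reading passes both checks, otherwise null.
function acceptReading(reading: PitchReading, opts: FilterOptions): number | null {
  const [pitch, clarity] = reading;
  if (clarity < opts.minClarity) return null;
  if (pitch > opts.maxPitchHz) return null;
  return pitch;
}
```

For example, with `{ minClarity: 0.95, maxPitchHz: 1000 }`, a reading of `[110, 0.97]` is kept, while `[3000, 0.9]` is dropped by the clarity check before the frequency cutoff is even consulted.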
Thanks a lot! That's exactly what I did; I found out about the minimum clarity percent the same day. It doesn't have random peaks anymore. I also added a check that compares the previous and current frequencies and rejects changes that exceed some limit. It's hard-coded, but it works. Thank you for such a useful library!
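The previous-versus-current comparison might look something like the sketch below. The actual code was not posted, so everything here is an assumption: the `JumpLimiter` name, measuring the jump in cents, and the resync rule that accepts a far-away pitch after a few consecutive rejections (so that a genuine change of note is not blocked forever).

```typescript
// Rejects readings that jump too far from the last accepted pitch.
class JumpLimiter {
  private last: number | null = null; // last accepted pitch in Hz
  private rejected = 0;               // consecutive rejections so far

  constructor(
    private readonly maxJumpCents: number,
    private readonly maxConsecutiveRejects: number
  ) {}

  // Returns the reading if accepted, or null if it is treated as a spike.
  next(pitchHz: number): number | null {
    if (this.last !== null) {
      const jumpCents = Math.abs((1200 * Math.log(pitchHz / this.last)) / Math.LN2);
      if (jumpCents > this.maxJumpCents && this.rejected < this.maxConsecutiveRejects) {
        this.rejected++;
        return null;
      }
    }
    // Accepted: either close to the last pitch, the first reading,
    // or a large jump that has persisted long enough to be believed.
    this.last = pitchHz;
    this.rejected = 0;
    return pitchHz;
  }
}
```

With `new JumpLimiter(1200, 2)`, a single 3000 Hz reading in a stream around 110 Hz is dropped, but a third consecutive 3000 Hz reading is accepted as a real change.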
Great! I'm glad you found a solution that works for you and that the library has been useful.