Rewrite in HTML5
I've been closely following the development of web technologies, and it appears that the audio processing APIs have finally become good enough that Performous' vocal gameplay could be ported to browsers. Based on WebGL, it could easily run graphics like it already does, but HTML and SVG rendering in browsers is light-years ahead of what our legacy C libs can do (rendering any graphics on the CPU is stupidly slow, so browsers do it on the GPU).
The older Web Audio API has an FFT built in but doesn't expose phase information, so it cannot be used for high-quality pitch detection; this has been a major blocker for this port. AudioWorklets and WebAssembly solve this issue.
- The audio processing could be compiled from C++ to a WebAssembly AudioWorklet
- JavaScript code gets pitch data out in `Float32Array` (see the sketch after this list)
- All navigation/menus/hud should be easily implemented in HTML/CSS.
- Graphics rendering with 2d Canvas or WebGL, both of which are lightning-fast and have `requestAnimationFrame(animate)` to draw at screen refresh rate (vsync on)
- three.js provides an easy-to-use 3d engine on top of WebGL, avoiding all the nasty bits
- Browsers' built in decoding and playback of audio and video formats would also simplify things quite a bit (browsers do FullHD and 4K video far faster than Performous can).
- Note files should probably be exported into JSON or msgpack so that they are easily read in Javascript, rather than implementing all the format parsers again in another language (Javascript is horrible in string processing and even worse for decoding MIDI files).
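To make the AudioWorklet + Float32Array idea above concrete, here is a minimal sketch. The names `pitch-processor` and `pitch-processor.js` are placeholders, and the actual WASM pitch detector compiled from C++ is only indicated as a comment:

```js
// pitch-processor.js -- runs on the audio rendering thread
class PitchProcessor extends AudioWorkletProcessor {
  process(inputs, outputs, parameters) {
    const channel = inputs[0][0]; // Float32Array block of input samples
    if (channel) {
      // A WASM pitch detector compiled from C++ would be called here;
      // for now just forward the raw samples to the main thread.
      this.port.postMessage(channel.slice());
    }
    return true; // keep the processor alive
  }
}
registerProcessor('pitch-processor', PitchProcessor);
```

```js
// main thread side (inside an async setup function)
async function startPitchCapture() {
  const ctx = new AudioContext();
  await ctx.audioWorklet.addModule('pitch-processor.js');
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const node = new AudioWorkletNode(ctx, 'pitch-processor');
  ctx.createMediaStreamSource(stream).connect(node);
  node.port.onmessage = (e) => {
    const samples = e.data; // Float32Array forwarded from the worklet
    // pitch detection results / drawing code would consume the data here
  };
}
```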
In case someone was thinking of using Emscripten to compile the whole of Performous for the browser, let me assure you that this wouldn't work, because too much functionality is unsupported, and because browsers operate in such a different manner that even if it did work after major changes (like getting rid of threads), it wouldn't run well. Individual modules can be compiled into WebAssembly and then used from JavaScript, but this is only practical for computation (like FFT and pitch detection), not for I/O, UI, etc.
What I am saying is that if anyone wants to make the bleeding-edge revolutionary web karaoke game, now is the time. A year or two ago the technology just wasn't there, yet now it can run even on smartphones.
Personally, I think it would be really cool if somebody made it, but it's not gonna be me; among other things because, in my experience as a user, WebGL is a buggy mess that crashes the entire browser more often than not.
Quite true about WebGL crashes and slowing down the entire desktop, although that also is getting much better lately. However, since Performous is based on 2d graphics, a 2d Canvas would probably suffice (and it can do various pseudo 3d effects just fine). Performous used OpenGL instead of 2d SDL mainly because 2d in SDL is super slow, not because we actually needed 3d. 2d Canvas has none of those issues.
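To illustrate the 2d Canvas point, a minimal sketch (nothing Performous-specific, just a requestAnimationFrame loop drawing at the display refresh rate):

```js
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');

function animate(time) {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  // notes/lyrics would be drawn here; a moving bar as a stand-in
  const x = (time / 10) % canvas.width;
  ctx.fillStyle = '#3fa7d6';
  ctx.fillRect(x, canvas.height / 2 - 8, 60, 16);
  requestAnimationFrame(animate);
}
requestAnimationFrame(animate);
```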
I think if we want to proceed with this, it might be best done in another repository instead of completely rewriting everything in the same repo.
Definitely another repository, and obviously it is a huge undertaking, especially if instruments are also to be supported (there are Gamepad and MIDI APIs for web, so guitars and drums could work). In any case it would not be realistic to expect Performous to be superseded any time soon. More realistically, basic vocal gameplay could be implemented with reasonable effort.
Forgive me if this is inappropriate. I'm interested to take this idea and run with it toward a somewhat different goal, but I am a very novice programmer, despite a hobbyist level interest for 35 years...
There is a strong need in the music education community for tying a pitch and rhythm algorithm like this to musical notation for immediate student feedback. Current commercial solutions are inadequate and expensive. The GUI could be incredibly simple; definitely 2D, and audio input would be sufficient for all relevant musical tasks. It would be tangential to the main goals of porting the project, but could be a first step in a proof of concept, while also giving a lifeline to aural skills and theory classes that are being forced online during the pandemic. It's a niche market, but one that needs something desperately.
The existing functionality of Performous seems ready to meet these needs, given custom content in the existing format and a custom GUI. I'm particularly interested in the possibility of a web app for the ability to run on more platforms, especially smartphones, which nearly all my students have.
Would someone be interested to help look at the feasibility of this, including how much time and effort might be involved to make a prototype? Would a bounty be appropriate for a rough proof of concept? I could contribute in various ways, including developing relevant musical and graphical content, and would be willing to fundraise toward such a goal, though I might just be in the way as a programmer, except perhaps in HTML/CSS coding. A proof of concept could also bring in additional interested parties with more programming experience.
There is also the possibility that significant grant money could be found to develop this, including state level grants to provide open source learning materials that replace textbooks.
So before saying this is feasible, I think we have to write down some requirements. For now I'll only be focusing on the singing part of Performous:
- Render a basic 2d canvas (with three.js as this is the most mature one)
- Add screens
- Main
- Songbrowser
- Singing
- Settings
- Create a menu to navigate around above screens
- Play an audio file
- Detect microphones
- Voice pitch algorithm
- Theming
Besides these requirements we'll have to decide on the technology to use. Personally I'm quite a fan of Angular since it's strongly typed and has some nice features: a component-based system, dependency injection, internationalization and globalization. As for the 2d/3d engine, I suppose Three.js is the best choice here since we don't wanna go low level. Three.js can be used to render 2d/3d, play audio and video, and has a built-in audio analyser for microphone input.
As for the text file formats we currently use: it's possible to convert them to JSON. The header of the file could become attributes, while the lyrics themselves would be a dictionary with timestamps as keys and the text to sing as values. To support portability I think we'll also have to come up with a song-converter program which reads UltraStar files and outputs JSON.
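A rough sketch of such a converter in Node.js, under the assumptions above (the file names are placeholders, and the exact JSON layout is just one possible interpretation, keeping pitch and length next to each syllable):

```js
// Sketch of an UltraStar -> JSON converter (Node.js).
// Header lines (#KEY:VALUE) become attributes; note lines become a map
// keyed by their start beat, keeping pitch/length alongside the syllable.
const fs = require('fs');

function parseUltraStar(text) {
  const song = { header: {}, notes: {} };
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith('#')) {
      const [key, ...rest] = line.slice(1).split(':');
      song.header[key.trim()] = rest.join(':').trim();
    } else if (/^[:*FRG] /.test(line)) {
      // "<type> <start> <length> <pitch> <syllable>"
      const [type, start, length, pitch, ...syllable] = line.split(' ');
      song.notes[start] = {
        type,
        length: Number(length),
        pitch: Number(pitch),
        text: syllable.join(' '),
      };
    }
    // phrase breaks ("- <beat>") and the end marker ("E") are ignored here
  }
  return song;
}

fs.writeFileSync(
  'song.json',
  JSON.stringify(parseUltraStar(fs.readFileSync('song.txt', 'utf8')), null, 2)
);
```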
@Baklap4 What you are talking of is essentially a port of Performous. I believe that a rewrite would be far more feasible. Graphics could even be plain HTML+CSS or drawn on 2d canvas. Three.js is a lot harder and much heavier on GPU (consumes laptop/phone battery quickly) and OpenGL (GLES, WebGL) is not actually needed for anything that Performous currently does, except for GuitarGraph. Playing files is really simple with <audio> (should suffice for vocals, even if not for guitar mode). And for starters, skip all the menu/song browser stuff. Mic selection can be done via browser/OS GUI, no need to implement that at first.
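For reference, playback with the built-in element really is that simple; something on this order should suffice for vocal gameplay (the file name is a placeholder):

```js
const music = new Audio('song.ogg');  // browser handles decoding and playback
music.play();                         // usually must be triggered by a user gesture
// music.currentTime gives the playback position for syncing notes and lyrics
```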
The proof of concept should, however, be able to capture and analyse sound, and output the results graphically in near real time. @Newmanda perhaps you could hack something together for audio capture, like a realtime VU meter using the raw waveform data, to get started?
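For anyone picking this up, a realtime VU meter from the raw waveform only needs getUserMedia and an AnalyserNode; a rough sketch, where the `<meter>` element is just a stand-in for whatever the UI ends up being:

```js
async function startVuMeter() {
  const ctx = new AudioContext();
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const analyser = ctx.createAnalyser();
  ctx.createMediaStreamSource(stream).connect(analyser);

  const buf = new Float32Array(analyser.fftSize);
  const meter = document.querySelector('meter'); // stand-in output element

  function update() {
    analyser.getFloatTimeDomainData(buf);         // raw waveform samples
    const rms = Math.sqrt(buf.reduce((s, x) => s + x * x, 0) / buf.length);
    meter.value = rms;                            // level in the 0..~1 range
    requestAnimationFrame(update);
  }
  requestAnimationFrame(update);
}
```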
I will do some research and see what I'm able to do with my limited skill set. Thank you for the support and advice!
I've also been in touch with the developer of https://pitchy.ninja/ who has a similar engine running a web based app. He has developed several open source projects, though he didn't say he would open the source of pitchy ninja. He did say he put the initial prototype together in a weekend. https://twitter.com/11111110b
Okay I can do this. I've found many examples to learn from, and what looks like a good tutorial on the basics of WebAudio. Thank you for your encouragement! I may be slow to hack this together, since I also need to hack together my syllabi for the semester, but I feel empowered.
Any progress?
This might be interesting: a live demo of pitch detection. There is a good article about how to implement it: https://www.toptal.com/webassembly/webassembly-rust-tutorial-web-audio
Out of curiosity, what is the difference between McLeod pitch detector and the one that Performous uses?
I am not familiar with that algorithm, but it seems there is quite little in common. I'd be interested in how it performs if there is a demo or something that can be easily tried.
You can try the McLeod demo from the link I posted. :)
I started a new open source project Vocalous (source) based on that demo code. It honors Performous with its name but it's not intended to be an HTML5 port of Performous. It's a light-weight web app that tracks the singer's pitch and shows the melody notes without lyrics. There is no database and the data is provided by URL query parameters (melody notes, links to music and lyrics etc.). This prevents the app from containing any copyrighted content.
The pitch detection engine should be pluggable. Currently the app supports only McLeod's algorithm but it should be possible to use other engines. It would be interesting to try Performous' pitch detection engine.
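This is not how Vocalous is actually structured, but one possible shape for such a pluggable engine in JavaScript, with the McLeod/WASM detector and a hypothetical Performous port implementing the same interface:

```js
// Hypothetical plug-in shape for pitch engines (not the actual Vocalous API).
// Each engine takes a Float32Array block of samples and returns a frequency
// in Hz, or null when no pitch is detected.
class McLeodEngine {
  constructor(sampleRate) { this.sampleRate = sampleRate; }
  detect(samples) {
    // ...call the WASM McLeod detector here...
    return null;
  }
}

class PerformousEngine {
  constructor(sampleRate) { this.sampleRate = sampleRate; }
  detect(samples) {
    // ...call a WASM build of Performous' pitch detection here...
    return null;
  }
}

// The rest of the app only talks to the common detect() interface:
function trackPitch(engine, samples) {
  const hz = engine.detect(samples);
  if (hz !== null) {
    console.log(`Detected ${hz.toFixed(1)} Hz`);
  }
}
```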
Quick testing with a laptop mic (sorry, I haven't been able to test the previous demo because my karaoke set is not installed).
Responds fairly quickly (not all too much latency) and has plenty of precision so that vibrato and such are seen on the wave. But it also is far less reliable than Performous, so the detection jumps around quite a bit. Not sure whether this is because of the algorithm or because of the extra temporal filtering done in Performous. And quite possibly the problem would be gone if a proper microphone was used instead.
I have good experience with this pitch detection algorithm: https://github.com/antoineschmitt/dywapitchtrack I was trying several algorithms for UltraStar Play and this one performed best (just playing UltraStar, no representative tests).
It can best be described as a dynamic wavelet algorithm (dywa): The heart of the algorithm is a very powerful wavelet algorithm, described in a paper by Eric Larson and Ross Maddox : “Real-Time Time-Domain Pitch Tracking Using Wavelets” of UIUC Physics.
You can find the C# implementation for UltraStar Play here: https://github.com/UltraStar-Deluxe/Play/blob/master/UltraStar%20Play/Assets/Common/Audio/Recording/DywaPitchTracker.cs This C# implementation runs in a managed runtime (garbage collection etc.) and the algorithm still performs in real time at 60 FPS on mobile. Thus, I think it would be a good fit for JavaScript and the browser environment as well.
Hi, I have been working on a web-based singing game inspired by Performous for about a year, and I have just made it open source! You can find it at https://github.com/Singularity-Game/Singularity
This game supports multiple user accounts, allowing you to easily share your hosted instance. Additionally, you can install it as a progressive web app to use it offline.
I am utilizing the same WebAssembly McLeod algorithm that @iqqmuT used in Vocalous.