Wordcast
Want to help?
Hi there! Thanks for your interest. You're already helping. As you may already know, I want to try to build an HTML5 captioning app using Chrome's webkitSpeechRecognition API. I'm thinking it could be a tiny Node/Express app that sets up a Socket.IO connection to broadcast the transcription results.
I've never deployed with any of these so I don't know anything, but it seems super doable. Your help is invaluable. Thanks.
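Here's a rough sketch of the server piece I have in mind (totally untested, and the `transcript` event name is just a placeholder):

```js
// server.js: minimal Express + Socket.IO relay (untested sketch)
var express = require('express');
var app = express();
var http = require('http').createServer(app);
var io = require('socket.io')(http);

// Serve the broadcaster and viewer pages out of /public
app.use(express.static('public'));

io.on('connection', function (socket) {
  // The broadcaster emits 'transcript' events; relay each one to every other client
  socket.on('transcript', function (data) {
    socket.broadcast.emit('transcript', data);
  });
});

http.listen(3000, function () {
  console.log('Wordcast listening on http://localhost:3000');
});
```

The broadcaster page would feed webkitSpeechRecognition results into that `transcript` event, and viewer pages would just append whatever comes out.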
Dave, My wife is deaf and 8 times out of 10 she's the frustrated person in the back row you mentioned in your article. I'm a relative noob in the world of FrEDs, but I have the passion and desire for this app, and can pass your article on to people that know more than me. I'd love to see Wordcast happen, thanks for even thinking of this. Matt Bivins
@matthewbivins Thanks, Matthew! Hopefully we can get something up and going soon. Don't worry about n00bness, every little bit helps. Also, I don't want to volunteer your wife or anything, but she sounds like she'd be really helpful in giving us a :+1: / :-1: as we progress. Thanks for helping!
Hey Dave! Thanks for doing this. I'd like to echo Matt: I'm fairly new to the world of HTML APIs / Node, but would love to help. My fiancée is an audiologist, so on the flip side I'll be sharing this with her and see if she can provide any input. But please let me know how I can help (I can write the hell out of documentation at the very least) and I'll be keeping track to see if I can dig into the code.
Thanks, Jason. Hopefully have something up soon. Started playing around seriously with Socket.io last night. :pray:
If you want to give Meteor* a try for this project, I'd be happy to donate a copy of our book as well as help out however I can if you have any Meteor questions :)
*I think it'd actually be a pretty good fit since it does all the real-time stuff out of the box, and is pretty fast to learn.
This can definitely be done with Node/Express, I can jump in, if you tell me more... [email protected]
Wow Dave! This is just the kind of thing I've been looking for for some time! I worked for seven months in Dublin, Ireland, and my English is not very fluent. I often thought that if I could read conversations, meetings, etc. with subtitles, it would help me improve my English much more quickly. I've dreamed about the possibility of creating a kind of iOS/Android app to solve this problem. I'm a UI designer and front-end developer. Can I help you with the project? [email protected]
@SachaG Sweet! Thanks! Re-looking at the demo, that seems like it could get up and running fairly quickly. I assume there's a cost that will one day be associated with deploying Meteor apps. Is there any information/plans about that yet?
@parkerproject I started playing around a bit with Express/Socket.io. I can push that or jot down a roadmap, if that would help you.
@aaromnido Thanks. Hopefully it can help you. We may need some UI/Frontend work, but it might actually be very little (just text on a page basically). Will let you know if we need anything.
Well, there's a couple options. Deploying on *.meteor.com is free and very easy, so it's a good way to get up and running.
Another easy option is Modulus, but it's not free (probably something like $15/month for most apps).
Finally, you can also set up your own instance on Digital Ocean, should be about $5-$10/month.
Oh and if you want to host your database on MongoHQ (not required, but it's sometimes good to separate the DB from the app), that can add another $15/month.
@SachaG Cool. Meteor looks great, I just didn't want to back the open project too far into a paid service. Will try it out in a branch and see if it works. No idea about a MongoDB instance yet. Possibly tho, if we have "rooms" that need to be persistent.
I would definitely like to get involved with this. I've got a little bit of experience with both express and meteor, so let me know what I can do to help! I'd love to be involved in a project like this.
Same here, I've made webapps with Express/MongoDB and Express/CouchDB. Not a pro though, still learning :)
Hey all, I feel like I might be holding this up. It could be next week before I'm able to take another stab. But if someone wants to take the original CodePen and port it into Express or Meteor, by all means, do it and make a pull request.
It doesn't have to be super good right now. We could even test it by reading technical blog posts to each other. Here's the mini-roadmap in my mind. I can make it more official in the readme:
- [ ] Create an alpha node where Computer One can broadcast captions to Computer Two and Computer Three.
After that some user settings might be nice:
- [ ] Add broadcaster setting for `event.results[i][0].confidence` level (rough sketch below).
- [ ] Add broadcaster setting for user language input.
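To make that concrete, the broadcaster-side code the confidence setting would hook into looks roughly like this (`minConfidence` and `socket` here are placeholders, not real code from the repo):

```js
// Broadcaster side: only forward results above the confidence setting
var recognition = new webkitSpeechRecognition();
recognition.continuous = true;
recognition.interimResults = true;

var minConfidence = 0.5; // would come from the broadcaster setting

recognition.onresult = function (event) {
  for (var i = event.resultIndex; i < event.results.length; i++) {
    var result = event.results[i][0];
    if (result.confidence >= minConfidence) {
      socket.emit('transcript', { text: result.transcript });
    }
  }
};

recognition.start();
```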
Then, once that is off the ground, we can test it and start styling it a bit.
- [ ] Beta test with deaf/HH users in a small setting.
- [ ] Beta test with deaf/HH users in meetup environment.
- [ ] Beta test with deaf/HH users in conference environment.
- [ ] Design: Make it look good and be useful.
Super future (optional) stuff I've been mulling around:
- [ ] Broadcast the audio feed over WebRTC to create an audio induction loop system, e.g., iPod + headphones == induction loop.
- [ ] Allow stenographers to broadcast captions instead of auto-captioner.
- [ ] Allow remote sign interpreters to share video over WebRTC.
- [ ] Add settings for User language output (requires Google Translate API)
Pretty slick. I was able to use this (codepen version) in Chrome on my phone.
Should the transcripts be stored? And if so, in what format?
For example, I was thinking it could be useful to store each new transcribed sentence fragment as a separate document with a timestamp. This way, it would be possible to then sync up the transcript with a video afterwards. Any thoughts?
http://www.html5rocks.com/en/tutorials/track/basics/
VTT would probably be best. Or just timestamps and text inside a JSON file and we could generate VTTs somehow.
I think this would be a great feature tho not essential to getting it tested.
My thinking was that we'd need to store that data somehow to transmit it from one client to the other. So we might as well use a good format from the start?
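Something like this is what I had in mind: one document per fragment, plus a rough pass at turning stored fragments into VTT later (the field names are just a guess):

```js
// One stored document per recognized fragment (shape is just a guess)
var fragment = {
  room: 'demo',
  text: 'and each result comes back with a confidence score',
  start: 12.4, // seconds since the broadcast started
  end: 15.1
};

// Later: generate WebVTT from the stored fragments
function toTimestamp(seconds) {
  var d = new Date(0);
  d.setMilliseconds(seconds * 1000);
  return d.toISOString().substr(11, 12); // e.g. "00:00:12.400"
}

function toVTT(fragments) {
  return 'WEBVTT\n\n' + fragments.map(function (f) {
    return toTimestamp(f.start) + ' --> ' + toTimestamp(f.end) + '\n' + f.text;
  }).join('\n\n');
}
```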
α big update:
We have a working alpha version thanks to @JordanForeman. I've included some installation instructions on the readme. Requires a little bit of Node-fu to get going. To try out:
- Broadcast browser goes to http://localhost:3000/listener
- Subtitle viewer browser goes to http://localhost:3000/viewer
URLs will definitely change in the future, but give it a try. I've started a Feedback thread at https://github.com/a11yproject/Wordcast/issues/4
Very cool! Looks like I wasn't fast enough with my own Meteor version :) I might still hack on it as a fun exercise or material for a tutorial though, I'll let you know if I ever get something working.
@davatron5000 gonna have to give this a go. Just wanted to let you know I talked to my fiancee re: induction loops and this is what I got:
Induction loops are present in a lot of theaters, so conferences might have a good chance of having them equipped. Hearing aids use something called a telecoil (T-coil) to pick up the signal from an induction loop. It's not much more than a wrapped copper wire, but it picks up the loop's signal and will cut out background sounds way better than a shotgun mic or portable microphone could ever do. Finding telecoils that can hook into a computer, however, might be a bit of a challenge. I haven't found anything yet but maybe someone else has some info on that.
The other option would be to feed straight from the source: the mixer that the speaker's microphone is hooked into. But I don't know how that could be done without some Bluetooth module, so it may not be super plausible. Anyway, my research continues and I'll keep you posted. Thanks for putting this up!
@JasonHoffmann
> The other option would be to feed straight from the source: the mixer that the speaker's microphone is hooked into.
I think that's our ideal scenario. We could easily broadcast audio from a laptop with an audio interface into the soundboard. In theory, in the near future, it could be done by plugging a phone or iPod into the soundboard. Good to know about that induction loop stuff. Will investigate it more.
@SachaG: No worries. Thanks for thinking about the project and if you get a post up, let me know and I'll be sure to link stuff.
Cool beans, y'all! My node-fu is not powerful but I know folk that do have fighting tiger coding dragon skills, and will see what it's all about. And start using my wife as a guinea pig!
Speaking of, @JasonHoffmann, Lindsay does use the t-coil every once in a while, and it could be something interesting, for sure. It really depends on the quality of mic'ing that the theatre does, though... I can imagine that in some cases it might actually make things worse, but I don't really know. It's not like she gets a nice mix of t-coil and hearing aid; it's all or nothing.
However! She is a new recipient of a cochlear implant, the most recent one from Advanced Bionics, and it has a wireless system (http://www.advancedbionics.com/us/en/products/accessories/connectivity.html) that lets her stream sound directly to the CI via Bluetooth from her iPhone or an external microphone (that we haven't bought yet). And I think this technology was originally developed by Phonak, which is a major hearing aid company... I could be wrong about that, but still, the bluetoothy-ness is out there on these devices. We haven't played with the ComPilot much yet, but this project is even more incentive to experiment with it and I'll report back.
@davatron5000 I'm very interested in becoming a part of this project! This is exactly what I've been looking to work on. Please e-mail me some more info and I can start doing some research. [email protected]
@tystrong @davatron5000 I'm currently (slowly) working to get rooms working using Socket.IO. I've got a branch on my fork that I did some work on a few days ago. I've been pretty swamped with work and the holidays so I haven't had much time. If you wanna take a look at that branch, feel free. Otherwise I think Dave's got a pretty good roadmap written up above if you want to start working on a different feature?
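For anyone curious, the rooms piece is basically Socket.IO's built-in rooms, something like this (event names may differ in my actual branch):

```js
// Server side: scope each broadcast to its own Socket.IO room
io.on('connection', function (socket) {
  // Broadcasters and viewers both join a room, e.g. taken from the URL
  socket.on('join', function (room) {
    socket.join(room);
  });

  // Relay captions only to clients in the same room
  socket.on('transcript', function (data) {
    socket.broadcast.to(data.room).emit('transcript', data);
  });
});
```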
@davatron5000 I've been Deaf since birth. I use subtitles everywhere and this project gets me excited. Experienced Front End Dev.
I'd love to help. I have a lot of experience with MVC frameworks and I can provide a lot of UI assistance. I was thinking about a similar project some time ago but didn't know who I could work with, and here you are with this project. Perfect. Please let me know how I can contribute. The roadmap looks good. I'm not sure if it's mentioned yet, but it would be cool if I could type in stuff and have the text spoken aloud, so it's two-way and I can use it at bars or the doctor's office. I love the idea of video interpretation integration.
Really looking forward to being able to contribute.
@sethjgore Thanks for the help and input, Seth!
- Good news! The Speech Synthesis API is already available in Chrome (also partial support in iOS 7?), so it is a feature that we could potentially add (quick sketch at the end of this comment): http://html5-examples.craic.com/google_chrome_text_to_speech.html
- The sign language interpretation thing is very interesting. Remote sign language interpretation could almost be a whole new business for interpreters. I have a friend who is a sign interpreter, but she's also a mother of three. I'll get some feedback from her on how useful this would be. Do you feel like it's something you'd use and/or maybe pay for?
Hopefully we'll have an alpha up online soon. It's coming along nicely.
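For the curious: you can try speech synthesis right now by pasting something like this into Chrome's DevTools console (the utterance text is just an example):

```js
// Paste into Chrome's console to hear it speak
var utterance = new SpeechSynthesisUtterance('Testing Wordcast text-to-speech');
utterance.lang = 'en-US';
window.speechSynthesis.speak(utterance);
```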
@davatron5000 — glad to be of help.
What areas need contributions? I can jump into the code, but was wondering if there are any specific areas where you'd need the help most. Looking forward to the alpha.
Remote sign language interpretation already exists (video relay — check out convorelay.com and there are several other companies doing the same). It's just that we could provide more open-source options and provide automatic transcriptions so both parties of the conversation can keep a record for themselves.
Sometimes when the conversation being interpreted is serious, I prefer to do it via text for some reason. Perhaps because there’s no chance of misinterpretation when I use text. Currently, there is no way for me to switch to that from video interpretation or even do both. This could be something wordcast can do.
I’m excited!
Is this still being worked on? I'd love to contribute.
@tystrong Stalled out a bit due to work life. There's a working alpha if you clone the project. Check it out. Start new issues or whatever.