tevr-asr-tool
Startup time, recognition and realtime mode
I tested it with the phrase "telefon, rufe 017876534 an", with some background noise to make the scenario realistic (I can share the WAV file on request over a separate channel). I'm using a Jabra Speak 750, which is actually quite good at echo and background-noise cancellation. Issues I ran into:
- it took several minutes on an Intel i7-4810MQ @ 2.8 GHz processor with 32 GB RAM
- the result was: "telefon für null siebenüsechsfünfstreivpierantababnabnabnabnabnabnsnsnsnsnsnsnsnsnsts"
Questions:
- Is real-time recognition on standard CPUs planned?
- Is this startup time expected?
- Could you add a listen-while-recording mode if real-time recognition is supported?
I also tried a second time with very good audio quality, using "telefon, wähle 0176385402", and got: "telefon wählen null ein siebensechsdreiachtfünfviernullzweigderabnabntsttert". The end of the output always seems to be scrambled with garbage.