tevr-asr-tool

Startup time, recognition and realtime mode

Open · kaoh opened this issue 3 years ago · 2 comments

I tested it with the German utterance "telefon, rufe 017876534 an" ("phone, call 017876534") with some background noise to create a realistic scenario (I can share the WAV file on request over a separate channel). I'm using a Jabra Speak 750, which is actually quite good at echo and background noise cancellation. The issues I ran into:

  1. Recognition took several minutes on an Intel i7-4810MQ @ 2.8 GHz processor with 32 GB RAM.
  2. The result was: "telefon für null siebenüsechsfünfstreivpierantababnabnabnabnabnabnsnsnsnsnsnsnsnsnsts"

Questions:

  1. Is real-time recognition on standard CPUs planned?
  2. Is this startup time expected?
  3. Could you add a listen-while-recording mode if item 1 is supported?
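
To make the real-time question concrete, one common yardstick is the real-time factor: processing time divided by audio length, where a value at or below 1.0 means recognition keeps up with live audio. The following is only a minimal sketch of that calculation; the helper names `wav_duration_seconds` and `real_time_factor` are made up for illustration and are not part of tevr-asr-tool.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio length.

    Values <= 1.0 mean the recognizer keeps up with live audio.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds
```

With purely illustrative numbers (not measurements from this report), `real_time_factor(120.0, 4.0)` gives 30.0, meaning recognition would run 30 times slower than real time.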

I also tried a second time with very good audio quality, using "telefon, wähle 0176385402" ("phone, dial 0176385402"), and got: "telefon wählen null ein siebensechsdreiachtfünfviernullzweigderabnabntsttert". The end of the output always seems to be scrambled with garbage.

kaoh · Aug 15 '22 21:08