trafficstars

NTSpeechRecognition

NTSpeechRecognition is a iOS/macOS framework, written in Objective-c, providing speech recognition functionality. For decoding PocketSphinx is used.

Features

Wrapper for the PocketSphinx decoder
Recognizer based on the PocketSphinx decoder
- Switch between searches immediatly
- Partial hypotheses (Before end of utterance is detected)
- Keyword Spotting, Grammar, NGram
Fake Recognizer
- Receive hypetheses from UDP connection
- Can be used to test apps (Define exactly which hypothesis should show up)

Installation

Carthage

You can use carthage to install NTSpeechRecognition by adding to following to your Cartfile:

github "ynop/NTSpeechRecognition"

Manual

You also can add this project as subproject.

Documentation

Checkout out API Reference.

Basic Usage

Setup Recognizer

At first the recognizer needs to be setup. For this purpose create an audio source, where the recognizer gets data from. Then we need to create the pronunciation dictionary and one or more searches (Check out NTSpeechTools.

  // Create an audio source
  NTMicrophoneAudioSource source = [NTMicrophoneAudioSource new];

  // CREATE RECOGNIZER WITH AUDIO SOURCE
  NTPocketSphinxRecognizer *recognizer = [[NTPocketSphinxRecognizer alloc] initWithAudioSource:source];
  [recognizer addDelegate:self];

  // CREATE SEARCHES
  NTSpeechSearch *numbersSearch = [NTJsgfFileSearch searchWithName:@"Numbers" path:@"path/to/numbergrammar"];
  NTSpeechSearch *dateSearch = [NTJsgfFileSearch searchWithName:@"Date" path:@"path/to/dategrammar"];
  
  // CREATE DICTIONARY
  NTPronunciationDictionary *dictionary = [[NTPronunciationDictionary alloc] initWithName:@"Default"];
  [dictionary loadWordsFromFileAtPath:@"path/to/number/dict"];
  [dictionary loadWordsFromFileAtPath:@"path/to/date/dict"];

  // ADD DICTIONARY AND SEARCH
  [recognizer loadPronunciationDictioanry:dictionary];
  [recognizer addSearch:numbersSearch];
  [recognizer addSearch:dateSearch];

Handle start/suspend/resume/stop

Now we can control the recognizer.

// Start recognizer and audiosource
[recognizer start];
[source start];

// Activate Searches
[recognizer setActiveSearchByName:@"Numbers"];

// Use suspend/resume for pausing
[recognizer suspend];
[recognizer resume];

// Stop
[recognizer stop];
[source stop];

Listen for hypotheses

To get informed about hypotheses and state changes we implement the NTSpeechRecognizerDelegate methods.

// Receive Hypotheses
- (void)speechRecognizer:(id<NTSpeechRecognizer>)speechRecognizer didReceiveHypothesis:(NTHypothesis*)hypothesis forSearch:(NTSpeechSearch*)search
{
}

// Receive partial hypotheses (End of utterance wasn't detected yet)
// First you need to set returnPartialHypotheses = YES; 
- (void)speechRecognizer:(id<NTSpeechRecognizer>)speechRecognizer didReceivePartialHypothesis:(NTHypothesis*)hypothesis forSearch:(NTSpeechSearch*)search
{
}

// Receive information about state changes of the recognizer (listening/not listening)
- (void)speechRecognizer:(id<NTSpeechRecognizer>)speechRecognizer didChangeListeningState:(BOOL)isListening
{
}

NTSpeechRecognition
NTSpeechRecognition copied to clipboard

Metadata

NTSpeechRecognition

Features

Installation

Carthage

Manual

Documentation

Basic Usage

Setup Recognizer

Handle start/suspend/resume/stop

Listen for hypotheses

← Metadata

Owner

Metadata

NTSpeechRecognition NTSpeechRecognition copied to clipboard

Metadata

NTSpeechRecognition

Features

Installation

Carthage

Manual

Documentation

Basic Usage

Setup Recognizer

Handle start/suspend/resume/stop

Listen for hypotheses

← Metadata

Owner

Metadata

NTSpeechRecognition
NTSpeechRecognition copied to clipboard