first pass at getting logits to the client code

Open ftyers opened this issue 2 years ago • 5 comments

First pass at maintaining the output of the acoustic model and returning it to the client. Example output is here. Comments welcome.

ftyers avatar Jan 20 '22 17:01 ftyers

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

CLAassistant avatar Jan 20 '22 17:01 CLAassistant

Also, please sign the CLA!

reuben avatar Jan 25 '22 16:01 reuben

Comments from Matrix: [image]

ftyers avatar Jan 31 '22 13:01 ftyers

@reuben so in this case should I collect them in processBatch and store them in StreamingState? The problem there is that the Metadata is returned from:

Metadata*
ModelState::decode_metadata(const DecoderState& state,
                            size_t num_results)

which has access to only ModelState and DecoderState.

So then I would edit:

Metadata*
StreamingState::intermediateDecodeWithMetadata(unsigned int num_results) const

And have it fill in the logits there?
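
To make the idea concrete, here is a rough sketch of accumulating the acoustic model output per batch so that a const method could later read it. StreamingStateSketch, appendBatch, numTimesteps, emissions_ and num_classes_ are hypothetical names for illustration only; the real StreamingState in stt.cc has other members and may need a different layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for StreamingState; NOT the real class in stt.cc.
struct StreamingStateSketch {
  // Logits from every batch seen so far, flattened as [timestep][class].
  std::vector<float> emissions_;
  std::size_t num_classes_ = 0;

  // Assumed to be called from something like processBatch, once per batch,
  // with that batch's acoustic model output.
  void appendBatch(const std::vector<float>& logits, std::size_t num_classes) {
    num_classes_ = num_classes;
    emissions_.insert(emissions_.end(), logits.begin(), logits.end());
  }

  // Const, so a const method such as intermediateDecodeWithMetadata
  // could read the accumulated values without modifying the state.
  std::size_t numTimesteps() const {
    return num_classes_ == 0 ? 0 : emissions_.size() / num_classes_;
  }
};
```

With something like this, the decode-with-metadata methods would only need to copy emissions_ into the returned structure, and decode_metadata itself would not need access to the logits.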

ftyers avatar Jan 31 '22 13:01 ftyers

Ok, I moved it to StreamingState. There is something I don't like, which is having to reallocate the memory for the metadata, because apparently we can't just update the struct in place:

native_client/stt.cc: In member function 'Metadata* StreamingState::finishStreamWithMetadata(unsigned int)':
native_client/stt.cc:175:18: error: assignment of read-only member 'Metadata::emissions'
  175 |     m->emissions = emissions;
      |     ~~~~~~~~~~~~~^~~~~~~~~~~
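
The error indicates that emissions is const-qualified, so it can only be set when the struct is initialized, not assigned afterwards. Below is a minimal sketch of one way around that: build the whole struct in a single initialization inside freshly allocated memory. MetadataSketch, its two fields, and makeMetadata are hypothetical and only mirror the const-ness the compiler message implies; the real Metadata layout is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical struct; only the const members matter for this illustration.
struct MetadataSketch {
  const float* const emissions;     // read-only after construction
  const std::size_t num_emissions;  // read-only after construction
};

// Allocate with malloc (so a C-style free routine could release it) and
// construct the struct in place with every member set at once.
// Writing `m->emissions = emissions;` afterwards would not compile.
MetadataSketch* makeMetadata(const float* emissions, std::size_t n) {
  void* mem = std::malloc(sizeof(MetadataSketch));
  if (mem == nullptr) {
    return nullptr;
  }
  return new (mem) MetadataSketch{emissions, n};
}
```

This still allocates a new struct each time, so it does not remove the reallocation, but it keeps the members const for client code.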

Comments welcome!

ftyers avatar Jan 31 '22 14:01 ftyers

Someone else will work on this who can sign the CLA.

ftyers avatar Feb 01 '23 21:02 ftyers