SynapseML
How to add Phrase List to SpeechToTextSDK to improve transcription?
SynapseML version
synapseml_2.12:0.10.0
System information
- Language version : Python 3.8.10
- Spark Version (e.g. 3.2.2): 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)
- Spark Platform : Databricks
Describe the problem
Hello all,
I'm using SpeechToTextSDK from SynapseML's cognitive module in Databricks to transcribe audio files to text. The code below works, but only without a Phrase List.
I found a reference for creating a regular phrase list here: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/improve-accuracy-phrase-list?tabs=terminal&pivots=programming-language-python#implement-phrase-list
But how can I add a Phrase List to the SpeechToTextSDK in SynapseML, please? Thank you,
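For context, here is a minimal sketch of the regular Speech SDK approach from the linked page, which is what I would like SpeechToTextSDK to do for me. The helper name `apply_phrase_list` and its `grammar_factory` parameter are my own (hypothetical, not part of any library); only `PhraseListGrammar.from_recognizer` and `addPhrase` come from the Azure Speech SDK for Python.

```python
from typing import Callable, Iterable, List, Optional


def apply_phrase_list(recognizer, phrases: Iterable[str],
                      grammar_factory: Optional[Callable] = None) -> List[str]:
    """Add each non-empty phrase to the recognizer's phrase-list grammar.

    `grammar_factory` is a hypothetical hook (not an SDK parameter) so the
    helper can be exercised without the Speech SDK installed.
    """
    if grammar_factory is None:
        # Imported lazily: only needed when talking to the real SDK.
        import azure.cognitiveservices.speech as speechsdk
        grammar_factory = speechsdk.PhraseListGrammar.from_recognizer

    grammar = grammar_factory(recognizer)
    added = []
    for phrase in phrases:
        phrase = phrase.strip()
        if phrase:  # skip blank entries
            grammar.addPhrase(phrase)
            added.append(phrase)
    return added
```

With the real SDK, I would build a `speechsdk.SpeechRecognizer` from a speech config and an audio config, call `apply_phrase_list(recognizer, ["Contoso"])`, and then start recognition as usual. This works per recognizer, which is why I don't see how to pass it through the Spark transformer.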
Code to reproduce issue
import synapse.ml
from synapse.ml.cognitive import *

stt = (SpeechToTextSDK()
    .setSubscriptionKey(YOUR_API_KEY)
    .setLocation(REGION)
    .setOutputCol("text")
    .setAudioDataCol("content")
    .setFormat("detailed")
    .setFileTypeCol("format")
    .setLanguageCol("lang")
    .setStreamIntermediateResults(False)
)

results = stt.transform(audio_w_lang_format)
display(results)
Other info / logs
No response
What component(s) does this bug affect?
- [X] area/cognitive: Cognitive project
- [ ] area/core: Core project
- [ ] area/deep-learning: DeepLearning project
- [ ] area/lightgbm: Lightgbm project
- [ ] area/opencv: Opencv project
- [ ] area/vw: VW project
- [ ] area/website: Website
- [ ] area/build: Project build system
- [ ] area/notebooks: Samples under notebooks folder
- [ ] area/docker: Docker usage
- [ ] area/models: models related issue
What language(s) does this bug affect?
- [ ] language/scala: Scala source code
- [X] language/python: Pyspark APIs
- [ ] language/r: R APIs
- [ ] language/csharp: .NET APIs
- [ ] language/new: Proposals for new client languages
What integration(s) does this bug affect?
- [ ] integrations/synapse: Azure Synapse integrations
- [ ] integrations/azureml: Azure ML integrations
- [X] integrations/databricks: Databricks integrations
AB#1956013
Hey @dhhailinh :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.
Hi @dhhailinh - It looks like we don't currently support the PhraseList functionality. Thank you for bringing this to our attention. It seems like something we should add. I've added this request to our list of potential work items. Whether it gets picked up will depend on where it lands in relation to other already scheduled items. Note that we do accept PRs from the public, should you be interested in contributing. Thanks again.
Hello @niehaus59 , Thanks for your answer.
PhraseList is indeed a very important functionality. Without it, I will need to go back to the regular way of working with speechsdk in Python and give up SynapseML and the power of Spark in Databricks. I guess many Spark or Databricks users of SpeechToTextSDK will be in the same situation, or will have to write a custom Transformer.
How can I contribute to accelerate the process? Should I create a new pull request? Thanks for your advice,
@dhhailinh - Yes a PR would be the way to go. See https://github.com/microsoft/SynapseML/blob/master/website/docs/reference/contributing_guide.md and https://github.com/microsoft/SynapseML/blob/master/website/docs/reference/developer-readme.md
SpeechToTextSDK is at https://github.com/microsoft/SynapseML/blob/master/cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/SpeechToTextSDK.scala
Hey @dhhailinh happy to hop on a call to help you get started. Thanks for your interest, it should be a fairly local fix!
Here's the main arch of this work:
- Add an extra `ServiceParam` on SpeechSDKBase with type Array[String]
- Add the calls to add the phrases somewhere around these locations: https://github.com/microsoft/SynapseML/blob/4115d4f0f2ea5210b9eafd777ff7dc6f4567a7fb/cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/SpeechToTextSDK.scala#L445 and https://github.com/microsoft/SynapseML/blob/4115d4f0f2ea5210b9eafd777ff7dc6f4567a7fb/cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/SpeechToTextSDK.scala#L535
- Make sure the necessary data flows through by adding args to those functions, compiling, and seeing where upstream arguments need to be plumbed in
- Write a test to demonstrate the functionality works as expected
Thanks @mhamilton723 and @niehaus59 ,
I will have a look at the source code and get back to you.