mimic-recording-studio

changes in trim_silence function regarding the aggressive trimming issue

Open amoljagadambe opened this issue 3 years ago • 4 comments

How to use this template

Under each heading below is a short outline of the information required. When submitting a PR, please delete this text and replace it with your own.

The CLA section can be deleted entirely.

Description

Update the trimming function to use rolling windows to create a mask around the signal. Fixes #35.

If needed follow up with as much detail as required.

Type of PR

  • [ ] Bugfix
  • [ ] Feature implementation
  • [x] Refactor of code (without functional changes)
  • [ ] Documentation improvements
  • [ ] Test improvements

Documentation

The most important and most tweakable parameter is threshold_value. Experiment with its value until you get the trimming you want.
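
To illustrate how threshold_value affects the result, here is a minimal sketch of rolling-window mask trimming. It is not the code from this PR: it uses only the standard library, and the names rolling_mask, trim_silence_sketch, window, and keep_inner_silence are assumptions made for the example. Samples are assumed to be floats in the range -1.0 to 1.0.

```python
from typing import List, Sequence


def rolling_mask(samples: Sequence[float], threshold_value: float, window: int) -> List[bool]:
    """Return True for each sample whose centered window has a peak above the threshold."""
    half = window // 2
    mask = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        # Peak absolute amplitude inside the window around sample i.
        peak = max(abs(s) for s in samples[lo:hi])
        mask.append(peak > threshold_value)
    return mask


def trim_silence_sketch(
    samples: Sequence[float],
    threshold_value: float = 0.02,  # hypothetical default, tune for your recordings
    window: int = 5,                # hypothetical window size in samples
    keep_inner_silence: bool = True,
) -> List[float]:
    """Trim silence using the rolling mask.

    keep_inner_silence=True  -> trim only the start and the end.
    keep_inner_silence=False -> drop every sample the mask marks as silent.
    """
    mask = rolling_mask(samples, threshold_value, window)
    if not any(mask):
        return []  # the whole clip is below the threshold
    if keep_inner_silence:
        first = mask.index(True)
        last = len(mask) - 1 - mask[::-1].index(True)
        return list(samples[first:last + 1])
    return [s for s, keep in zip(samples, mask) if keep]
```

A higher threshold_value treats more of the signal as silence and trims more aggressively; a lower value keeps more of the quiet lead-in and tail.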

CLA

To protect you, the project, and those who choose to use Mycroft technologies in systems they build, we ask all contributors to sign a Contributor License Agreement.

This agreement clarifies that you are granting a license to the Mycroft Project to freely use your work. Additionally, it establishes that you retain the ownership of your contributed code and intellectual property. As the owner, you are free to use your code in other work, obtain patents, or do anything else you choose with it.

If you haven't already signed the agreement and been added to our public Contributors repo then please head to https://mycroft.ai/cla to initiate the signing process.

amoljagadambe avatar May 18 '21 10:05 amoljagadambe

I think your statement above is right: NumPy and pandas put extra strain on the Alpine-based Python image. Meanwhile, I am also working on a different approach that is lighter and more computationally efficient.

amoljagadambe avatar May 19 '21 07:05 amoljagadambe

The above approach also needs some tweaks to the Dockerfile.

amoljagadambe avatar May 19 '21 07:05 amoljagadambe

'ffmpeg -i {} -ab 160k -ac 2 -ar 44100 -vn {}.wav -y'.format(webm_file_name, path)

Why are we using 2 channels when saving the file?

amoljagadambe avatar May 19 '21 08:05 amoljagadambe

Hey, glad that you're finding the project helpful and able to modify it to fit your use case.

The removal of all silence makes more sense now as it seems you're recording single words rather than whole sentences.

It feels like these changes would be well suited to being configuration options set in the docker-compose.yaml, e.g.:

  • stereo vs mono
  • remove all silence vs trim start and finish
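
As a rough sketch of how the backend could pick up such options, assuming they are passed to the container as environment variables: the variable names AUDIO_CHANNELS and TRIM_MODE, their defaults, and both helper functions are hypothetical and not part of the project. The ffmpeg flags are the ones from the command quoted earlier in this thread.

```python
import os
from typing import List, Mapping


def load_audio_options(env: Mapping[str, str] = os.environ) -> dict:
    """Read the two options from the environment (names are assumptions)."""
    # "stereo" as the default mirrors the existing "-ac 2" in the ffmpeg command.
    channels = 1 if env.get("AUDIO_CHANNELS", "stereo").lower() == "mono" else 2
    # "edges" = trim start and finish only, "all" = remove all silence.
    trim_mode = env.get("TRIM_MODE", "edges").lower()
    if trim_mode not in ("edges", "all"):
        raise ValueError("TRIM_MODE must be 'edges' or 'all', got %r" % trim_mode)
    return {"channels": channels, "trim_mode": trim_mode}


def build_ffmpeg_args(webm_file_name: str, path: str, channels: int) -> List[str]:
    """Build the conversion command as an argument list with a configurable channel count."""
    return [
        "ffmpeg", "-i", webm_file_name,
        "-ab", "160k",
        "-ac", str(channels),
        "-ar", "44100",
        "-vn", path + ".wav",
        "-y",
    ]
```

Building the command as a list means it can be handed to subprocess.run without shell quoting, which sidesteps problems with spaces in file names.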

krisgesling avatar May 26 '21 01:05 krisgesling