nvda icon indicating copy to clipboard operation
nvda copied to clipboard

Sound split

Open mltony opened this issue 1 year ago • 42 comments

Link to issue number:

Fixes #12985

Summary of the issue:

Feature request: sound split. Splits system sound into two channels: NVDA speaks in one channel (e.g. left), while all other applications play their sound in the other channel (e.g. right).

Description of user facing changes

  • Added global command NVDA+alt+s that toggles sound split between off, NVDA on the left and NVDA on the right (default behavior).
  • Added combo box on Audio panel in NVDA settings that also allows to switch between Sound split modes.
  • Added list of checkboxes in Audio panel, that allows to change behavior of NVDA+alt+s command: it allows to select all modes that the global command will cycle through.

Description of development approach

  1. Added pycaw library as a dependency.
  2. Created file source\audio\soundSplit.py where I implemented all logic.
  3. Contrary to what I said before, I managed to implement sound split without creating an extra monitor thread. It works like this:
    • When sound split is toggled, it uses IAudioSessionEnumerator to set volume in all currently active audio sessions.
    • Then it does sessionManager.RegisterSessionNotification() to create a callback that listens for any new audio sessions being created, an it executes the the same volume updating function upon creation.
    • On the next call or on shutdown we unregister the previous notification callback.

Testing strategy:

  • Performed thorough manual testing on Windows 10 and Windows 11.

Known issues with pull request:

  • Sound split only works in wasapi mode. The reason for that is that when NVDA is using WinMM, adjusting its volume via core audio API doesn't work. In theory this can be worked around by still applying volume inside NVWave player; however since NVDA is migrating to wasapi, I thought this doesn't make much sense at this point.
  • During shotdown we try to restore audio volume of all applications. However if an app is currently not playing any sounds, it doesn't have an active audio session and therefore there is no way to update its volume. As a result some applications might be playing sounds only in 1 channel long after NVDA is shut down. This is a known issue and there appears to be no work around.
  • There is a very unlikely race condition that might occur when an application starts playing sound between the call sessionManager.GetSessionEnumerator() and sessionManager.RegisterSessionNotification(). In this case sound split volume will not be applied to that application. I don't see a way to fix this in code since this is rather a problem of COM API. However turning sound split off and on again is an easy and intuitive workaround in this case.
  • Sound split won't work with applications playing sounds in mono mode or outputting to more than 2 channels (e.g. Dolby Surround). In practice I never encountered such applications.
  • Please note, that sound split doesn't work as a mixer. For example, if an application is playing a stereo sound track while sound split is set to "NVDA on the left and applications on the right", then you will only hear the right channel of the sound track, while the left channel of the sound track will be muted. in order to allow NVDA to use the left channel.
  • Sound split won't work with applications using legacy winMM API to play audio. However during the life span of sound split feature in Tony's enhancements add-on no users have reported of such applications.

Code Review Checklist:

  • [x] Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • [x] Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • [x] UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • [x] API is compatible with existing add-ons.
  • [x] Security precautions taken.

mltony avatar Jan 20 '24 00:01 mltony

  • FAIL: Translation comments check. Translation comments missing or unexpectedly included. See build log for more information.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • FAIL: System tests (tags: installer NVDA). See test results for more information.
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/gunmito0320ss5ca/artifacts/output/nvda_snapshot_pr16071-30730,6b4c666a.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 0.8, INSTALL_END 0.9, BUILD_START 0.0, BUILD_END 25.1, TESTSETUP_START 0.0, TESTSETUP_END 0.2, TEST_START 0.0, TEST_END 0.7, FINISH_END 0.2

See test results for failed build of commit 6b4c666ae8

AppVeyorBot avatar Jan 20 '24 01:01 AppVeyorBot

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • FAIL: System tests (tags: installer NVDA). See test results for more information.
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/po6i33hf3tks2vg1/artifacts/output/nvda_snapshot_pr16071-30731,9df913ae.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.0, INSTALL_END 1.0, BUILD_START 0.0, BUILD_END 26.3, TESTSETUP_START 0.0, TESTSETUP_END 0.2, TEST_START 0.0, TEST_END 0.8, FINISH_END 0.1

See test results for failed build of commit 9df913ae92

AppVeyorBot avatar Jan 20 '24 01:01 AppVeyorBot

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • FAIL: System tests (tags: installer NVDA). See test results for more information.
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/214xh13lbkesttkf/artifacts/output/nvda_snapshot_pr16071-30732,e5238840.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 0.9, INSTALL_END 0.9, BUILD_START 0.0, BUILD_END 27.8, TESTSETUP_START 0.0, TESTSETUP_END 0.2, TEST_START 0.0, TEST_END 0.8, FINISH_END 0.2

See test results for failed build of commit e523884049

AppVeyorBot avatar Jan 20 '24 02:01 AppVeyorBot

Glad to see this PR has been drafted. But I sincerely hope you will consider the following request.

Sometimes, when we are watching a movie, we often want to appreciate the sound effects of this movie, but the sound of NVDA is relatively not so important. I want to reduce the presence of NVDA sounds as much as possible, At this point it's a good idea to direct it to the left or right channel. Obviously, I don't want to direct the sound of the film to the left or right, because that would definitely destroy the sound of the movie. It would be nice to have an option that allows adjusting only the output channel of NVDA. But I'm not denying the validity of that pull request, just pointing out another dilemma I faced.

cary-rowen avatar Jan 20 '24 18:01 cary-rowen

@mltony, a big thank you for taking the time to integrate this feature in core! I think that it will be a real plus in NVDA's features.

I'll take time tomorrow and next week to test it, and will provide review comments (if needed), probably more on the UX point of view.

CyrilleB79 avatar Jan 20 '24 22:01 CyrilleB79

Also one small issue: For applications playing a stereo signal, sound split keeps only one channel and puts NVDA in the other one. This can be heard for example listening to Bohemian Rhapsody (Youtube) at time 3:15. With sound split enabled, we can hear well only one "Galelileo" in two, the other one being heard very far. A better behaviour would be to have a mix (a sum) of left and right channel of the song on one side and NVDA on the other side.

If it is not possible for technical reasons or other, this request should not block this PR though, since the feature as developed in this PR is already more than useful, e.g. in meetings where stereo is of no use. Though, this limitation should be documented, at least in this PR, and maybe also in the User Guide.

CyrilleB79 avatar Jan 20 '24 23:01 CyrilleB79

@lukaszgo1,

  1. Updated PR description.
  2. I am looking at tests\lint\flake8.ini and I see source/comInterfaces/*, appears to be already excluded. So not sure why linter is still complaining.
  3. In order to register a notification we'll need to keep NVDA process running. Since applications may play sound any time, this would mean we'd need to keep NVDA (or any other surrogate processs of it) to keep running indefinitely after we quit NVDA. That can be done in theory, but I believe this is not worth it. Moreover it might create some confusion, as we'll be running a process quietly that adjusts audio volumes of all applications - that just doesn't sound like a good idea.

mltony avatar Jan 21 '24 18:01 mltony

Also one small issue: For applications playing a stereo signal, sound split keeps only one channel and puts NVDA in the other one. This can be heard for example listening to Bohemian Rhapsody (Youtube) at time 3:15. With sound split enabled, we can hear well only one "Galelileo" in two, the other one being heard very far. A better behaviour would be to have a mix (a sum) of left and right channel of the song on one side and NVDA on the other side.

If it is not possible for technical reasons or other, this request should not block this PR though, since the feature as developed in this PR is already more than useful, e.g. in meetings where stereo is of no use. Though, this limitation should be documented, at least in this PR, and maybe also in the User Guide.

I believe this is not possible - in coreAudio API I can only set volumes for left and right channels. What you are talking about would be mixer functionality I guess. I mentioned this both in this PR and documentation.

mltony avatar Jan 22 '24 19:01 mltony

@lukaszgo1 , @CyrilleB79, I addressed all your comments. Also @cary-rowen, I implemented your idea as well.

mltony avatar Jan 22 '24 19:01 mltony

Great, thanks @mltony for the great work. I'll try this out once the CI build is complete.

cary-rowen avatar Jan 22 '24 20:01 cary-rowen

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • PASS: System tests (tags: installer NVDA).
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/mxt1rp0ajpq3p3a9/artifacts/output/nvda_snapshot_pr16071-30762,db34b63c.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.0, INSTALL_END 1.0, BUILD_START 0.0, BUILD_END 28.8, TESTSETUP_START 0.0, TESTSETUP_END 0.4, TEST_START 0.0, TEST_END 16.7, FINISH_END 0.1

See test results for failed build of commit db34b63cd0

AppVeyorBot avatar Jan 22 '24 20:01 AppVeyorBot

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • PASS: System tests (tags: installer NVDA).
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/nyb88os09mrohrtf/artifacts/output/nvda_snapshot_pr16071-30766,b4f60f47.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.0, INSTALL_END 1.0, BUILD_START 0.0, BUILD_END 28.9, TESTSETUP_START 0.0, TESTSETUP_END 0.3, TEST_START 0.0, TEST_END 16.9, FINISH_END 0.2

See test results for failed build of commit b4f60f4746

AppVeyorBot avatar Jan 23 '24 02:01 AppVeyorBot

You can add comInterfaces/coreAudio to the Glob(exclude) parameter of the Clean function in comInterfaces_sconscript. to exclude it during SCons cleanup.

hwf1324 avatar Jan 23 '24 05:01 hwf1324

  • Sound split only works in wasapi mode. The reason for that is that when NVDA is using WinMM, adjusting its volume via core audio API doesn't work. In theory this can be worked around by still applying volume inside NVWave player; however since NVDA is migrating to wasapi, I thought this doesn't make much sense at this point.

Thanks for the additional explanation @mltony. I'm slightly worried that this may mean that sound split will not work for arbitrary applications using WinMM. @jcsteh Is this something you could comment on?

  • multiple lint errors are coming from pycaw interface definitions. I hope an exemption can be made here, since we don't want to make potential updating of these files to require manual reformatting every time just to make linter happy.

I guess Flake8 relies on Python's glob, which does not recurse to directories by default. I'd try changing the exclusion to source/comInterfaces/**.

  • During shotdown we try to restore audio volume of all applications. However if an app is currently not playing any sounds, it doesn't have an active audio session and therefore there is no way to update its volume. As a result some applications might be playing sounds only in 1 channel long after NVDA is shut down. This is a known issue and there appears to be no work around.

This does not need to be done in this PR, however if users will complain we can probably store all sessions for which we changed channels and restore them to the original setting when exiting NVDA.

lukaszgo1 avatar Jan 23 '24 16:01 lukaszgo1

Thanks for the additional explanation @mltony. I'm slightly worried that this may mean that sound split will not work for arbitrary applications using WinMM. @jcsteh Is this something you could comment on?

In my understanding that's correct, this won't work for winmm applications. However, sound split functionality has been around for many years and nobody complained about that which makes me think that winmm applications are exceptionally rare these days.

I guess Flake8 relies on Python's glob, which does not recurse to directories by default. I'd try changing the exclusion to source/comInterfaces/**.

That doesn't work - at least with runlint command ran locally.

This does not need to be done in this PR, however if users will complain we can probably store all sessions for which we changed channels and restore them to the original setting when exiting NVDA.

SG as in we can try to preserve COM object. But I suspect that won't work since session enumerator only enumerates active sessions and so I suspect all the calls on saved com objects might fail. But again happy to try if we receive complaints.

mltony avatar Jan 23 '24 18:01 mltony

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • PASS: System tests (tags: installer NVDA).
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/w8npa1u4k0ac5kst/artifacts/output/nvda_snapshot_pr16071-30781,c6b5e39c.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.0, INSTALL_END 1.0, BUILD_START 0.0, BUILD_END 28.0, TESTSETUP_START 0.0, TESTSETUP_END 0.3, TEST_START 0.0, TEST_END 16.8, FINISH_END 0.1

See test results for failed build of commit c6b5e39cf5

AppVeyorBot avatar Jan 23 '24 19:01 AppVeyorBot

That doesn't work - at least with runlint command ran locally.

It turns out our exclusions weren't working for some time now. I have created #16087 to address this, so after that PR is merged and master is merged to this branch Linter should no longer complain.

lukaszgo1 avatar Jan 23 '24 21:01 lukaszgo1

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • PASS: System tests (tags: installer NVDA).
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/xgs164lgvcxw3n0j/artifacts/output/nvda_snapshot_pr16071-30783,ba99d426.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.0, INSTALL_END 0.9, BUILD_START 0.0, BUILD_END 29.4, TESTSETUP_START 0.0, TESTSETUP_END 0.3, TEST_START 0.0, TEST_END 17.2, FINISH_END 0.1

See test results for failed build of commit ba99d426cb

AppVeyorBot avatar Jan 23 '24 23:01 AppVeyorBot

Hi.

I am using an audio streaming software (AudioRelay) to stream audio to other devices for use. When I apply this PR, all other modes work fine, but there is no sound return from NVDA in "NVDA on the left and applications on the right" and "NVDA on the right and applications on the left" modes.

This audio streaming software creates a virtual audio device through which it gets audio from the rest of the applications. In both modes NVDA's visual audio bar is jumping, but the virtual audio device backend has no audio output.

Sorry, I'm not sure if this is a known issue with this PR.

hwf1324 avatar Jan 24 '24 13:01 hwf1324

@hwf1324, could you explain more?\

  1. Which modes do work? Paste the whole list.
  2. Which device you're streaming from and which device streaming to? In general I have the feeling that sound split just won't be compatible with that application you're using. Target audience of sound split would be people who use either speakers or headphones connected directly.

mltony avatar Jan 24 '24 19:01 mltony

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • PASS: System tests (tags: installer NVDA).
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/r3ko6mtivkjy13jn/artifacts/output/nvda_snapshot_pr16071-30794,fdd31185.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.1, INSTALL_END 1.1, BUILD_START 0.0, BUILD_END 29.8, TESTSETUP_START 0.0, TESTSETUP_END 0.4, TEST_START 0.0, TEST_END 17.4, FINISH_END 0.2

See test results for failed build of commit fdd3118520

AppVeyorBot avatar Jan 24 '24 20:01 AppVeyorBot

1. Which modes do work? Paste the whole list.
  • "Disabled sound split": normalcy
  • "NVDA on the left and applications on the right": does not transmit NVDA sound
  • "NVDA on the left and applications in both channels": supporting
  • "NVDA on the right and applications on the left": does not transmit NVDA sound
  • "NVDA on the right and applications in both channels": supporting
  • "NVDA in both channels and applications on the left": supporting
  • "NVDA in both channels and applications on the right": supporting
2. Which device you're streaming from and which device streaming to?

Transfer sound from your laptop to your cell phone and listen to multiple devices with one headset.

   In general I have the feeling that sound split just won't be compatible with that application you're using. Target audience of sound split would be people who use either speakers or headphones connected directly.

I think it should only be partially incompatible, if you can set which audio streams can follow the NVDA sound, or set the mode of certain audio streams, or set to exclude certain audio streams, you can solve this problem. But I think it goes beyond the scope of this PR, and even beyond NVDA.

hwf1324 avatar Jan 24 '24 23:01 hwf1324

@hwf1324 , based on your response it appears that your application doesn't work when the set of NVDA channels and the set of channels for other apps are disjoints - e.g. no common elements. We can thus deduce that the app takes NVDA sound, but also the sound goes through setting applied to all other apps and then it transmits sound to mobile phone, which ends up being zero in both channels. I don't see a way to fix that except by not using sound split in conjunction with that ap.

mltony avatar Jan 25 '24 19:01 mltony

global command NVDA+alt+s that toggles sound split between off, NVDA on the left and NVDA on the right (default behavior).

Although this is configurable, I was wondering, if there is a need to improve this default behavior: If we confirm that we only need three options to cycle, Then one of the options can be removed to include NVDA on the left(or right) and applications in both channels".

To me, if there are too many options to cycle, then we might need a reverse cycle gesture. The current default behavior "NVDA on the left and NVDA on the right" doesn't make much difference to me between the two cases.

cary-rowen avatar Jan 26 '24 03:01 cary-rowen

global command NVDA+alt+s that toggles sound split between off, NVDA on the left and NVDA on the right (default behavior).

Although this is configurable, I was wondering, if there is a need to improve this default behavior: If we confirm that we only need three options to cycle, Then one of the options can be removed to include NVDA on the left(or right) and applications in both channels".

I fully agree. Given it is configurable, there is no point in including symmetrical modes in the default available modes. Either we include only 2 modes (off and NVDALeft+AppsRight) or 3 modes as described by @cary-rowen. Having 2 modes would satisfy the needs of the majority of the people, so it would make sense to set it as a default. On the opposite, a third mode would help promote the movie/music use case (NVDALeft+AppsBoth). I have no strong preference among these 2 options.

To me, if there are too many options to cycle, then we might need a reverse cycle gesture. The current default behavior "NVDA on the left and NVDA on the right" doesn't make much difference to me between the two cases.

I agree here too. There are actually too many options to cycle if all the modes are included in the cycling command. Though, that's not a common use case. Thus, I'd leave the backward cycling command unassigned.

CyrilleB79 avatar Jan 26 '24 07:01 CyrilleB79

I updated default setting as suggested by #cary-rowen. I think it is premature to introduce reverse cycle command, as I would think most of the users will only cycle between off and their favorite sound split mode.

mltony avatar Jan 26 '24 19:01 mltony

This sets the volume of the corresponding process by its PID.

Is it possible to add a way to ignore volume settings by process name?

First get the PID by process name, then skip the process with that PID.

hwf1324 avatar Jan 26 '24 19:01 hwf1324

  • PASS: Translation comments check.
  • PASS: Unit tests.
  • FAIL: Lint check. See test results for more information.
  • PASS: System tests (tags: installer NVDA).
  • Build (for testing PR): https://ci.appveyor.com/api/buildjobs/l44dfhxh5ry4s8rg/artifacts/output/nvda_snapshot_pr16071-30834,3dc60d74.exe
  • CI timing (mins): INIT 0.0, INSTALL_START 1.0, INSTALL_END 1.0, BUILD_START 0.0, BUILD_END 27.2, TESTSETUP_START 0.0, TESTSETUP_END 0.3, TEST_START 0.0, TEST_END 16.7, FINISH_END 0.1

See test results for failed build of commit 3dc60d74d9

AppVeyorBot avatar Jan 26 '24 20:01 AppVeyorBot

@hwf1324 , You can implement this in an add-on. You can call applyToAllAudioSessions function with a custom lambda function that performs any logic you can think of. I'm not sure there is a concept of process name, but you can lookup information given pid by calling appModuleHandler.AppModule(16064)

mltony avatar Jan 31 '24 00:01 mltony

@seanbudd, that would be a significant rewrite. I can try to send pr to pycaw maintainer. ~~Also if I switch to pycaw API, then we'll automatically bring a new useless dependency psUtil package that we don't really need. Most likely it won't be stripped away by py2exe as looking at pycaw code querying audioSessions also resolves process names through psUtil even though we don't need that. Are you going to be ok with that?~~ Update: actually if py2exe smart enough it will probably drop psutil - upon closer inspection psUtil is only activated in specific methods that I won't call.

mltony avatar Feb 08 '24 02:02 mltony