wezterm
Accessibility for the visually impaired
My intent with this issue is to understand better what first-class support for the visually impaired looks like in a terminal emulator, and then figure out how to implement it.
My reading so far has found:
- TDSR, a console-based screen reader, where I posted https://github.com/tspivey/tdsr/issues/19
- emacspeak, which has some interesting concepts
- Slint, an accessible Linux distro
A discussion with Didier of the Slint project gave me some leads:
- mlterm has support for brltty: https://github.com/arakiken/mlterm/blob/master/doc/en/README.brltty
- The #a11y IRC channel on irc.linux-a11y.org is a good place to ask questions
Required Features
- For low-vision users, it may be sufficient to have a quick and easy way to activate a color scheme with high contrast and large text
- Some kind of screen reader and/or braille display support
Screen Reader
Based on my reading of the TDSR and emacspeak documentation, it seems like a key feature of a good terminal screen reading experience is being able to manage what is being read. I think of the terminal as having a few distinct interface elements:
- The scrollback, which is the entirety of the text that is navigable/viewable
- The viewport, which is the section of the scrollback that is visible in the window. This defines a relatively coarse visual "cursor" representing what the user is looking at.
- The mouse cursor, which is used for selecting and interacting with text
- The text cursor, which is used primarily by the application
WezTerm has CopyMode and Quick Select Mode for mouse-less selection capabilities.
For a screen reader, I think it may make sense to introduce an explicit read cursor that is conceptually similar to the viewport but much finer grained; based on the TDSR docs, it seems desirable to be able to specify the position based on character, word or line. To me, controlling this feels similar to the navigation functions available in mouseless copy mode, except that the data is read aloud instead of being copied.
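To make the read cursor idea a little more concrete, here is a minimal sketch of how it might be modeled. It is not wezterm code: `ReadCursor`, `Unit`, `text_at` and `advance` are hypothetical names, the scrollback is assumed to be a plain slice of strings, and words are assumed to be whitespace-delimited.

```rust
/// Granularity at which the read cursor moves and reads (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Unit {
    Char,
    Word,
    Line,
}

/// A read cursor that is independent of the viewport and the text cursor.
/// `col` counts characters, not bytes.
#[derive(Debug, Default)]
pub struct ReadCursor {
    pub line: usize,
    pub col: usize,
}

impl ReadCursor {
    /// The text under the cursor for the given unit, i.e. what would be spoken.
    pub fn text_at(&self, lines: &[String], unit: Unit) -> String {
        let line = match lines.get(self.line) {
            Some(line) => line,
            None => return String::new(),
        };
        let chars: Vec<char> = line.chars().collect();
        match unit {
            Unit::Line => line.clone(),
            Unit::Char => chars.get(self.col).map(|c| c.to_string()).unwrap_or_default(),
            Unit::Word => {
                if self.col >= chars.len() || chars[self.col].is_whitespace() {
                    return String::new();
                }
                // Expand left and right to the surrounding whitespace.
                let mut start = self.col;
                while start > 0 && !chars[start - 1].is_whitespace() {
                    start -= 1;
                }
                let mut end = self.col;
                while end < chars.len() && !chars[end].is_whitespace() {
                    end += 1;
                }
                chars[start..end].iter().collect()
            }
        }
    }

    /// Move forward by one unit. Returns false at the end of the scrollback.
    pub fn advance(&mut self, lines: &[String], unit: Unit) -> bool {
        let chars: Vec<char> = lines
            .get(self.line)
            .map(|l| l.chars().collect())
            .unwrap_or_default();
        match unit {
            Unit::Line => {
                if self.line + 1 < lines.len() {
                    self.line += 1;
                    self.col = 0;
                    true
                } else {
                    false
                }
            }
            Unit::Char => {
                if self.col + 1 < chars.len() {
                    self.col += 1;
                    true
                } else {
                    self.advance(lines, Unit::Line)
                }
            }
            Unit::Word => {
                // Skip the rest of the current word, then the whitespace after it.
                let mut i = self.col;
                while i < chars.len() && !chars[i].is_whitespace() {
                    i += 1;
                }
                while i < chars.len() && chars[i].is_whitespace() {
                    i += 1;
                }
                if i < chars.len() {
                    self.col = i;
                    true
                } else {
                    // No further word on this line: fall through to the next line.
                    self.advance(lines, Unit::Line)
                }
            }
        }
    }
}
```

With something like this, key bindings similar to the copy mode ones could call `advance` and hand the result of `text_at` to whatever does the speaking.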
In a situation where the rate of output is high, it seems like it might be useful to have the read cursor move independently from the viewport so that it can lag behind the output. Having a notification to inform the user that the output rate is high seems like it might be valuable; eg: "NOTE: one megabyte of text has been output in the past ten seconds", or "The next line is no longer in the scrollback" if it got scrolled away. Notifications will need to trigger some audio cue (perhaps an adjustment to the speed/tone parameters) to disambiguate the notification from what was being read.
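As a rough illustration of the high-output notification, the sketch below keeps a sliding window of recent output sizes and produces a message once the total crosses a threshold. `OutputRateMonitor` and its parameters are hypothetical, and the one megabyte per ten seconds figure is just the example from above rather than a tuned value.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Tracks how many bytes were output within a sliding time window.
pub struct OutputRateMonitor {
    window: Duration,
    threshold_bytes: usize,
    samples: VecDeque<(Instant, usize)>,
    total: usize,
}

impl OutputRateMonitor {
    pub fn new(window: Duration, threshold_bytes: usize) -> Self {
        Self {
            window,
            threshold_bytes,
            samples: VecDeque::new(),
            total: 0,
        }
    }

    /// Record `len` bytes of output observed at `now`. Returns a notification
    /// message while the windowed total is at or above the threshold.
    /// A real implementation would probably debounce repeated notifications.
    pub fn record(&mut self, now: Instant, len: usize) -> Option<String> {
        self.samples.push_back((now, len));
        self.total += len;
        // Drop samples that have aged out of the window.
        while let Some(&(seen, n)) = self.samples.front() {
            if now.saturating_duration_since(seen) > self.window {
                self.total -= n;
                self.samples.pop_front();
            } else {
                break;
            }
        }
        if self.total >= self.threshold_bytes {
            Some(format!(
                "NOTE: {} bytes of text have been output in the past {} seconds",
                self.total,
                self.window.as_secs()
            ))
        } else {
            None
        }
    }
}
```

The returned message would then be spoken with different speed/tone parameters so that it is distinguishable from the text being read.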
Part of supporting this in wezterm is understanding how to model and interact with the read cursor, but another part is actually having the text get read out. WezTerm runs on multiple platforms so there are a number of possibilities.
Targeting Speech Dispatcher seems like a good, portable first step, and there seems to be a pretty straightforward way to address and communicate with the speech server.
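For a sense of what talking to the speech server involves, here is a sketch of the message framing as I understand it from the SSIP (Speech Synthesis Interface Protocol) documentation: commands are CRLF-terminated lines, `SPEAK` is followed by the text, the text ends with a line containing a single dot, and text lines that start with a dot get an extra dot prepended. Treat those details as assumptions to verify against the Speech Dispatcher docs. The helper names are made up, and nothing here opens a socket.

```rust
/// Build the `SET self CLIENT_NAME` command that identifies the client.
/// The user:client:component form is my reading of the SSIP docs.
pub fn ssip_set_client_name(user: &str, client: &str, component: &str) -> String {
    format!("SET self CLIENT_NAME {}:{}:{}\r\n", user, client, component)
}

/// Build the data block that is sent after the `SPEAK` command has been
/// acknowledged by the server. Lines beginning with a dot are escaped by
/// doubling the dot, and the block ends with a lone dot line.
pub fn ssip_message_body(text: &str) -> String {
    let mut out = String::new();
    for line in text.lines() {
        if line.starts_with('.') {
            out.push('.');
        }
        out.push_str(line);
        out.push_str("\r\n");
    }
    out.push_str(".\r\n");
    out
}
```

A session would then send the client name, send `SPEAK` followed by CRLF, wait for the server's reply, and write the body. The crates mentioned elsewhere in this thread may well make hand-rolling this unnecessary.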
Braille Displays
I don't fully understand these, but am working so far under the assumption that they can be modeled similarly to a screen reader, except that instead of (or perhaps in addition to) speaking the output, the output is sent to the braille display. It seems likely that the read cursor concept touched on above can map to both screen readers and braille displays.
Appreciate seeing a desire to increase the accessibility of wezterm.
Some quick thoughts:
- The project you linked to as "Speed Dispatcher" seems to be a typo for "Speech Dispatcher". :)
- There is a Rust Text-To-Speech crate with cross-platform support: https://crates.io/crates/tts
- I first interacted with @/ndarilek (the developer of tts-rs) within the Godot game engine community where he created a plugin to add TTS support to the editor & games. He has been very generous with his time & experience in terms of both technical issues & sharing the personal impact of such technology.
- I'd suggest checking out this Accessibility-related issue on egui (Rust immediate mode GUI system) as he also contributed TTS support for egui/Bevy (also https://github.com/emilk/egui/pull/412) & I suspect the thread will be useful in terms of both how to integrate the crate & what integration/UX issues need to be considered. In the same thread there's also a comment where I collated a bunch of developer-focussed Accessibility-related resources which you might find informative.
Hope this helps to move the accessibility process forward a little. :)
This is just a drive-by message, but a blind friend of mine says their best experience comes from using Apple's built in accessibility with the standard Apple terminal.
@wez You mentioned in #912 that Slint is looking for a more accessible terminal for the installation process. I'm following up here since this issue is more relevant to that discussion. Can you point to a page or post where they've talked about this? I wonder what they find lacking in the current solutions, Speakup and brltty.
There are basically two ways you could approach accessibility in this project: implement the platform accessibility APIs (UI Automation on Windows, NSAccessibility on Mac, and AT-SPI on Unix desktops) so screen readers can find out what's in the window and present it as they see fit, or add text-to-speech output directly in wezterm. Your interest in running wezterm directly on the framebuffer, without using X or Wayland, never mind something like GNOME, suggests to me that you're interested in the latter approach. Is that correct? I doubt that that's something we really need, though as I said, I'd like to know what the Slint project is looking for in this area.
@ChrisJefferson I'm surprised that your blind friend had such a positive experience using macOS's Terminal app with VoiceOver. If I remember correctly, tdsr was written specifically to improve access to the terminal on macOS.
@mwcampbell per https://github.com/tspivey/tdsr/issues/19#issuecomment-868096868, I reached out to Didier from the Slint project on libera.chat. It's possible they have an archive from that channel that you can search for the full context, but summarizing it here: the scenario they described to me was for an accessible terminal to be usable during installation of the distribution itself, running against the linux framebuffer console. That conversation was a bit more about low-vision users than it was about screen readers: large fonts, high contrast.
In terms of what I'd like to implement: I think it would be good to integrate with the various platform accessibility APIs, but the limited set of people I've interacted with about this were generally negative about the platform provided features. If the sort of people that are in the intersection of being terminal users and having accessibility requirements are not well-served by the platform provided features, then I'm not excited to implement support for each platform and would rather target something that is more impactful for that group of users. The impression I had was that targeting Speech Dispatcher might be something of a sweet spot.
@wez Thanks for clarifying what the Slint team wants.
It's possible that once my AccessKit project matures some more, it could help you implement both approaches to accessibility, without duplicating work on your end. The central concept of AccessKit is an accessibility tree. For each distinct frame, the application (or its GUI toolkit) pushes either a full tree snapshot or an incremental tree update to an AccessKit consumer. Usually the consumer is a platform adapter, which implements an accessibility API such as UI Automation or AT-SPI using the accessibility tree. The Windows platform adapter is the most mature at this point, though work has started on Mac and AT-SPI adapters. None of these adapters support multi-line text widgets yet, and I haven't finalized the representation of these widgets in the tree, so AccessKit isn't ready yet for wezterm to use. But another thing I've been thinking about, in addition to these platform adapters, is implementing a screen reader as an AccessKit consumer, probably using the tts crate. One could just as well output to a braille display using BrlAPI. So when AccessKit is ready, you could implement support for it in wezterm, then offer a choice of accessibility solutions.
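To illustrate the push model described above, here is a toy sketch of an application pushing tree updates to a consumer. This is not AccessKit's actual API: `Node`, `TreeUpdate` and `Consumer` are simplified, hypothetical types, and the "screen reader" here just returns the text of nodes that changed.

```rust
use std::collections::HashMap;

pub type NodeId = u64;

/// One node in the accessibility tree (heavily simplified).
#[derive(Clone, Debug, PartialEq)]
pub struct Node {
    pub role: &'static str,
    pub text: String,
    pub children: Vec<NodeId>,
}

/// Either a full snapshot (every node) or an incremental update
/// (only the nodes that changed since the previous frame).
pub struct TreeUpdate {
    pub nodes: Vec<(NodeId, Node)>,
    pub focus: Option<NodeId>,
}

/// A consumer that keeps its own copy of the tree. A platform adapter or a
/// built-in screen reader would sit in this position.
#[derive(Default)]
pub struct Consumer {
    nodes: HashMap<NodeId, Node>,
    focus: Option<NodeId>,
}

impl Consumer {
    /// Apply an update and return the text of nodes that are new or changed,
    /// which is roughly what a screen reader would want to announce.
    pub fn apply(&mut self, update: TreeUpdate) -> Vec<String> {
        let mut changed = Vec::new();
        for (id, node) in update.nodes {
            if self.nodes.get(&id) != Some(&node) {
                if !node.text.is_empty() {
                    changed.push(node.text.clone());
                }
                self.nodes.insert(id, node);
            }
        }
        if update.focus.is_some() {
            self.focus = update.focus;
        }
        changed
    }

    pub fn focus(&self) -> Option<NodeId> {
        self.focus
    }
}
```

In a terminal, each visible line might be a node, so a frame in which one line changed would only need to push that one node.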
AccessKit looks interesting! I'd definitely be interested in integrating it in wezterm once we're both a little further along; I'm considering some internal changes in wezterm that will make it easier to translate its content to the accessibility tree.
@wez, Any updates on this? I definitely want to try out wezterm :)
@TheQuinbox sorry, no progress yet. There's still some work needed to the internals to get data into the right shape; that work is needed to help with some rendering performance/caching changes, so it will happen, but hasn't yet.
Then: figure out how to plumb it into something that is actually useful for users such as yourself.
Would you mind sharing details about which platform(s) you use?
And of course, everyone's waiting for me to get back to work on AccessKit. Hopefully that will start in the next few weeks.
@wez, I use Windows 10 with the NVDA screen reader, and macOS Ventura with TDSR/VoiceOver, and I can tell you, TDSR does it exactly how I want it to work