
Add support for fn key and Escape to cancel on macOS

Open dannysmith opened this issue 1 month ago • 10 comments

  1. fn key as trigger (macOS) - Use the fn/Globe key to start/stop recording
  2. Escape to cancel recording - Cancel mid-recording without transcribing (most useful when PTT is off)

Feature 1: fn Key Support (macOS)

How it works

The fn/Globe key is a modifier key that generates NSEventType::FlagsChanged events with NSEventModifierFlags::Function. Standard shortcut libraries (like tauri-plugin-global-shortcut) cannot capture modifier-only keys.
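
To illustrate the press/release detection, here is a minimal sketch in plain Rust. It is not the code from this PR. It assumes the monitor hands us the raw modifier flags of each FlagsChanged event as a `u64`. `FnTracker` and `FnEdge` are hypothetical names; the only AppKit fact used is that the Function flag is bit 23.

```rust
/// Bit for the fn/Globe key in AppKit's modifier flags
/// (NSEventModifierFlagFunction, 1 << 23).
const FN_FLAG: u64 = 1 << 23;

/// Transition of the fn key between two consecutive FlagsChanged events.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum FnEdge {
    Pressed,
    Released,
}

/// Hypothetical helper: remembers the last known fn state so each
/// FlagsChanged event reduces to "pressed", "released", or "no change".
#[derive(Default)]
pub struct FnTracker {
    fn_down: bool,
}

impl FnTracker {
    /// Feed the raw modifier flags from one FlagsChanged event.
    pub fn on_flags_changed(&mut self, flags: u64) -> Option<FnEdge> {
        let now_down = (flags & FN_FLAG) != 0;
        let edge = match (self.fn_down, now_down) {
            (false, true) => Some(FnEdge::Pressed),
            (true, false) => Some(FnEdge::Released),
            // Some other modifier changed; the fn state is unchanged.
            _ => None,
        };
        self.fn_down = now_down;
        edge
    }
}
```

Tracking the previous state matters because FlagsChanged fires for every modifier, so a Shift press while fn is held must not be reported as a second fn press.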

Implementation

Parallel input system for fn key:

  1. src-tauri/src/shortcut/fn_monitor.rs (macOS-only)

    • Uses NSEvent::addGlobalMonitorForEventsMatchingMask_handler from objc2
    • Monitors for NSEventMask::FlagsChanged events
    • Checks NSEventModifierFlags::Function to detect fn press/release
    • Calls the same dispatch_binding_event() function as regular shortcuts
  2. src-tauri/src/shortcut/mod.rs

    • Routes "fn" bindings to fn_monitor instead of tauri-plugin-global-shortcut
    • is_fn_binding() helper checks for fn-only bindings
    • validate_shortcut_string() allows "fn" as valid on macOS
  3. Dependencies (macOS only in Cargo.toml):

    [target.'cfg(target_os = "macos")'.dependencies]
    objc2 = "0.6"
    objc2-app-kit = { version = "0.3", features = ["NSEvent"] }
    objc2-foundation = "0.3"
    block2 = "0.6"
    
  4. Frontend (optional): "Use fn" button in HandyShortcut.tsx - see Reviewer Notes
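
The routing described in item 2 could look roughly like the sketch below. The function names `is_fn_binding` and `validate_shortcut_string` come from the list above, but their signatures here are guesses. The `Backend` enum, the `route` function, the `is_macos` parameter (used instead of a compile-time check so the logic can be exercised on any platform), and the case-insensitive match are all assumptions made for illustration.

```rust
/// Which input backend should own a binding (illustrative names).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    /// The macOS-only fn monitor.
    FnMonitor,
    /// The regular tauri-plugin-global-shortcut path.
    GlobalShortcut,
}

/// True when the binding is the fn key on its own, with no other keys.
/// Trimming and ignoring case are assumptions of this sketch.
pub fn is_fn_binding(binding: &str) -> bool {
    binding.trim().eq_ignore_ascii_case("fn")
}

/// Accept "fn" only on macOS. Validation of regular shortcuts is reduced
/// to a non-empty check here; the real function presumably does more.
pub fn validate_shortcut_string(binding: &str, is_macos: bool) -> Result<(), String> {
    if binding.trim().is_empty() {
        return Err("shortcut must not be empty".to_string());
    }
    if is_fn_binding(binding) && !is_macos {
        return Err("the fn key is only supported on macOS".to_string());
    }
    Ok(())
}

/// Pick the backend for a binding that has already been validated.
pub fn route(binding: &str) -> Backend {
    if is_fn_binding(binding) {
        Backend::FnMonitor
    } else {
        Backend::GlobalShortcut
    }
}
```

With this shape, `route("fn")` selects the fn monitor while `route("option+space")` stays on the global shortcut plugin.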

Permissions

Requires Accessibility permission (same as already needed for enigo pasting). No additional permission prompts for users.


Feature 2: Escape to Cancel Recording

How it works

The cancel shortcut is a dynamic binding - only registered while recording is active.

Implementation

  1. CancelAction in actions.rs:

    • Calls cancel_current_operation() to discard recording
    • Does NOT unregister itself (would deadlock inside callback)
  2. Cancel binding in settings.rs:

    • dynamic: true - not registered at startup
    • Default binding: "Escape"
  3. Dynamic registration in shortcut/mod.rs:

    • register_dynamic_binding() - idempotent (unregisters first if already registered)
    • unregister_dynamic_binding() - removes binding at runtime
    • init_shortcuts() skips dynamic bindings
  4. Lifecycle:

    • TranscribeAction::start() registers cancel via run_on_main_thread()
    • TranscribeAction::stop() unregisters cancel via run_on_main_thread()
    • CancelAction::start() does NOT unregister (next registration handles cleanup)

Key design decisions

Why idempotent registration?

Unregistering from inside the shortcut's own callback causes a deadlock (global_shortcut holds internal locks). Instead, register_dynamic_binding() unregisters first, so CancelAction doesn't need to unregister itself.
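
A small model of the idempotent pattern, assuming the registered shortcuts can be represented as a set of strings. `ShortcutRegistry` and its `backend_calls` log are hypothetical stand-ins for the real plugin calls, which this sketch does not make.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the shortcut plugin's registration state.
#[derive(Default)]
pub struct ShortcutRegistry {
    registered: HashSet<String>,
    /// Log of calls that would reach the underlying plugin (illustration only).
    pub backend_calls: Vec<String>,
}

impl ShortcutRegistry {
    /// Idempotent: a stale registration left behind by a cancel is
    /// cleaned up here, outside of any shortcut callback.
    pub fn register_dynamic_binding(&mut self, binding: &str) {
        if self.registered.contains(binding) {
            self.unregister_dynamic_binding(binding);
        }
        self.registered.insert(binding.to_string());
        self.backend_calls.push(format!("register {}", binding));
    }

    /// Remove the binding at runtime; a no-op if it is not registered.
    pub fn unregister_dynamic_binding(&mut self, binding: &str) {
        if self.registered.remove(binding) {
            self.backend_calls.push(format!("unregister {}", binding));
        }
    }

    pub fn is_registered(&self, binding: &str) -> bool {
        self.registered.contains(binding)
    }
}
```

In this model, a cancel leaves "Escape" registered. The next recording calls `register_dynamic_binding("Escape")` again, which unregisters the stale entry first and then registers it afresh.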

Why release toggle lock before calling action?

dispatch_binding_event() releases the toggle state lock BEFORE calling action.start()/stop(). This prevents deadlock when CancelAction calls cancel_current_operation() which also needs the lock.
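
The lock ordering can be sketched with a plain `std::sync::Mutex`. This models the toggle state as a single `bool` and passes the action in as a closure, both of which are simplifications assumed for illustration; `Dispatcher` is a hypothetical type, not the PR's actual structure.

```rust
use std::sync::Mutex;

/// Hypothetical holder of the toggle state shared by all bindings.
pub struct Dispatcher {
    /// True while the toggled operation (e.g. recording) is active.
    active: Mutex<bool>,
}

impl Dispatcher {
    pub fn new() -> Self {
        Dispatcher { active: Mutex::new(false) }
    }

    /// Something an action may call back into; it needs the same lock.
    pub fn cancel_current_operation(&self) {
        *self.active.lock().unwrap() = false;
    }

    pub fn is_active(&self) -> bool {
        *self.active.lock().unwrap()
    }

    /// Flip the toggle, release the lock, and only then run the action.
    pub fn dispatch_binding_event<F>(&self, action: F)
    where
        F: FnOnce(&Dispatcher, bool),
    {
        let should_start = {
            let mut active = self.active.lock().unwrap();
            *active = !*active;
            *active
        }; // The guard is dropped here, before the action runs.
        action(self, should_start);
    }
}
```

If the guard were still held while `action` ran, an action that calls `cancel_current_operation()` would try to lock the same mutex again on the same thread, which `std::sync::Mutex` does not support (it may deadlock or panic).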

Linux Notes

Dynamic shortcut registration (used for the cancel shortcut) is disabled on Linux due to instability with the tauri-plugin-global-shortcut plugin. See PR #392.

This means the Escape-to-cancel feature is not available on Linux. The cancel shortcut will silently be a no-op.

Potential future improvement: The dynamic binding architecture in this branch provides a cleaner foundation than the original async-spawn approach. If the underlying Linux shortcut issues are resolved upstream, enabling dynamic bindings on Linux would only require removing the #[cfg(target_os = "linux")] guards in register_dynamic_binding() and unregister_dynamic_binding() in shortcut/mod.rs.
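
As a rough picture of what such a guard does, the sketch below uses the `cfg!` macro rather than a `#[cfg(...)]` attribute so that a single function body compiles everywhere. The `bool` return value and the placeholder body are assumptions; the real function talks to the shortcut plugin.

```rust
/// Returns true if the binding was registered, false if registration
/// was skipped. On Linux this is always a no-op, as described above.
pub fn register_dynamic_binding(binding: &str) -> bool {
    if cfg!(target_os = "linux") {
        // Dynamic registration is disabled on Linux; do nothing, silently.
        return false;
    }
    // Placeholder for the real plugin call on other platforms.
    !binding.is_empty()
}
```

Removing the early return is the whole change needed in this model, which mirrors the point made above about re-enabling Linux later.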

Reviewer Notes: fn Key UI

The fn key backend works regardless of UI changes. Users can always configure the fn key manually by editing settings_store.json. For example:

    "bindings": {
      "transcribe": {
        "current_binding": "fn",
        "default_binding": "option+space",
        "description": "Converts your speech into text.",
        "dynamic": false,
        "id": "transcribe",
        "name": "Transcribe"
      }
    }

UI Visibility Option

The commit "Remove fn key UI (manual config only)" removes the "Use fn" button from the settings UI. This commit is structured to be easily included or excluded:

| Decision | Action |
| --- | --- |
| Keep fn key as manual-config only | Keep the commit as-is |
| Add "Use fn" button to UI | Revert or drop the commit |

The UI additions that commit removes:

  • isFnBinding() helper function
  • "Use fn" button (shown on macOS when not already using fn)
  • "fn (Globe)" display text when fn is the current binding
  • Tooltip text explaining fn key usage

Reference PRs

  • PR #136 - Original fn key implementation (tekacs)
  • PR #224 - Cancel shortcut approach (jacksongoode)
  • PR #392 - Disable cancel on Linux (stability fix)

dannysmith avatar Dec 01 '25 17:12 dannysmith

@cjpais This is my attempt at #136 and #163.

Currently it's settings-only (no UI changes) but if you revert the last commit there's a UI too.

Shout if you'd like me to change anything, and if you'd rather not merge this feel free to close - I need both fn and Esc working for Handy to be useful to me, and thought I might as well open an upstream PR 🙂

dannysmith avatar Dec 01 '25 17:12 dannysmith

So I haven't had a ton of time to review, but just my overall comment is I really don't want any special case handling for the function key in an ideal world. I really would like the keyboard handling on Mac OS to be all in one package.

Even if it's not using global shortcut to do that, I'm fine with that. I just want there not to be a tiny weird branch in the code path. I'd rather that branch be a major branch at the top level: if you're on MacOS, use MacOS specific keyboard handling.

Part of the reason for this is I don't really like the global shortcut plugin as it is, it seems to have some limitations which don't fit Handy's use case perfectly. In the early versions (not sure if they were published or not) the keyboard handling was my own version I wrote that supported input from any keys cross platform. I would like to eventually move back to something like that, the main blocker on MacOS for that particular code was that it needed additional permissions when built which I do not want to give.

The code at this commit is what I am talking about https://github.com/cjpais/Handy/tree/83d845284d9f7dcfe5561524925cf59e9a5155e6. It went through some improvements after this specific commit, but generally it did work, though I think it probably could be reworked a fair amount, to be much better.

But just giving an overall idea that we don't necessarily need to rely on the global shortcut plug-in and I eventually would like to deprecate that support entirely in favor of something better, something that works really, really perfectly as a cross-platform library that can be distributed independently of the Handy application as well as being used within Handy itself, if that makes sense. It definitely will start in handy and then move outwards as a library from there, kind of in the spirit of open sourcing the core of this application and making it even easier to develop things like this for everyone.

And I think eventually I would like to extract a library that works better cross platform generally and maybe is not Tauri specific

cjpais avatar Dec 04 '25 04:12 cjpais

Gotcha - thanks!

I just want there not to be a tiny weird branch in the code path. I'd rather that branch be a major branch at the top level: if you're on MacOS, use MacOS specific keyboard handling.

Yeah I totally get this. When I've got some time I'll have a look at what a "minimum viable" macOS keyboard implementation could look like in Handy. (Notes to self: 1. Make it modular for other OSs in the future. 2. Consider a similar API to global-shortcut-plugin.)

something that works really, really perfectly as a cross-platform library that can be distributed independently of the Handy application as well as being used within Handy itself, if that makes sense. It definitely will start in handy and then move outwards as a library from there, kind of in the spirit of open sourcing the core of this application and making it even easier to develop things like this for everyone.

And I think eventually I would like to extract a library that works better cross platform generally and maybe is not Tauri specific

Yeah this makes tons of sense. I'm kinda amazed there isn't something out there already.

dannysmith avatar Dec 04 '25 04:12 dannysmith

Thank you for understanding haha, I know you put time and effort into this

Yeah this makes tons of sense. I'm kinda amazed there isn't something out there already.

There might be! I just probably didn't look too deeply, but the underlying libraries are there (or close enough, enigo, obj-c bindings, etc)

If there's a better dev experience to be had by diverging from the global-shortcut api structure that is okay too, but honestly from what I recall global shortcut api is decent

cjpais avatar Dec 04 '25 04:12 cjpais

Looks like rdev might support the fn key? Tho I'm sure there was a reason I used objc2 directly here and not rdev - damned if I can remember tho 😂

Will have a play.

dannysmith avatar Dec 04 '25 04:12 dannysmith

Maybe the same reason why we are using a fork of rdev? I forget exactly why I did this, but there was something broken in the main repo at the time I created handy hahah

rdev = { git = "https://github.com/rustdesk-org/rdev" }

I'm almost certain it was because this PR hadn't been merged at the time I brought rdev in, but can't say for certain: https://github.com/Narsil/rdev/pull/147

more context: https://github.com/orgs/tauri-apps/discussions/7839

As far as I can tell they also use: objc2 lol

cjpais avatar Dec 04 '25 05:12 cjpais

So https://github.com/cjpais/Handy/pull/224 allows the Escape key to be used to cancel recording. This PR includes a very similar (though more generic) implementation, written before 224 was merged. I've just extracted the only bit of 2915aea92f8b3000f263ef1beb64dba978d16e93 that's worth keeping into #408.

When that's merged, this branch can be simplified to remove the Escape-related stuff (worth doing because I'm using it in my fork until we work out how to handle fn elegantly).

dannysmith avatar Dec 04 '25 15:12 dannysmith

This would be a highly desired feature if you could incorporate it, because in the past I've used tools like Aqua Voice and it's really easy to use the Globe/fn key and Escape.

meh256 avatar Dec 05 '25 18:12 meh256

@meh256 Yeah that's my preferred setup too (the dannys-build branch in my fork currently includes this feature, plus #391) if you wanna build from that while we work out how to sensibly incorporate the fn key thing into Handy.

dannysmith avatar Dec 05 '25 21:12 dannysmith