
[Linux, AWM] Can't OCR From All Monitors

Open Lonniebiz opened this issue 8 months ago • 4 comments

When running NormCap-0.4.4-x86_64.AppImage on Debian 12 with AWM, it will only OCR from my laptop's built-in monitor.

If I attempt to capture from either of my other two monitors, the behavior is not what you'd expect. I'll try to explain it.

When I try to OCR text appearing on monitors 2 or 3, NormCap does indeed surround the desired monitor with a highlighted border (just like normal), but what's not normal is that it moves whatever is on monitor 1 to the other monitor I'm trying to OCR on.

Explained differently, it seems like NormCap takes a screenshot of Monitor 1 and displays it full screen in front of the content on the monitor I'm really trying to OCR something from. I hope that makes sense.

I've only noticed this behavior when using AWM. I work around it by moving the window I want to OCR to my laptop's built-in monitor. As long as I OCR from Monitor 1, NormCap is normal.

Lonniebiz avatar Nov 28 '23 12:11 Lonniebiz

Hi @Lonniebiz, thanks for opening this as a separate issue :+1:

It will take some time for me to set up a VM with AWM and reproduce this issue, but I'll try to take a look at it soon!

In the meanwhile, allow me to give some background on how NormCap's UI algorithm works in a multi-monitor setup:

  1. Take a screenshot of every display
  2. For each screenshot, create a window that...
    • ... has the size of the corresponding display
    • ... has no border or window decoration
    • ... shows the screenshot (with pink border) as its only content (the primary monitor also gets the menu cog-wheel)
  3. Move each window to its corresponding display by setting the coordinates of the window's top left corner to the absolute position of the top left corner, which the corresponding display has on the virtual desktop (which spans all displays).
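
The three steps above can be sketched in plain Python. This is only an illustration of the placement logic, not NormCap's actual code: the `Display` and `OverlayWindow` classes, the function names, and the example geometry are all hypothetical, and the real implementation goes through Qt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Display:
    """Hypothetical description of one monitor on the virtual desktop."""
    name: str
    x: int       # absolute left edge on the virtual desktop
    y: int       # absolute top edge on the virtual desktop
    width: int
    height: int


@dataclass(frozen=True)
class OverlayWindow:
    """Hypothetical borderless window that shows one display's screenshot."""
    display_name: str
    x: int
    y: int
    width: int
    height: int
    frameless: bool
    shows_menu: bool


def plan_overlay_windows(displays, primary_name):
    """Return one overlay window per display, placed at that display's origin."""
    windows = []
    for display in displays:
        windows.append(
            OverlayWindow(
                display_name=display.name,
                # Step 3: the window's top-left corner equals the display's
                # absolute top-left corner on the virtual desktop.
                x=display.x,
                y=display.y,
                # Step 2: the window has exactly the size of the display.
                width=display.width,
                height=display.height,
                frameless=True,
                # Only the primary display gets the menu cog-wheel.
                shows_menu=(display.name == primary_name),
            )
        )
    return windows


def to_virtual_desktop(window, local_x, local_y):
    """Translate a point inside an overlay window to virtual-desktop coordinates."""
    return (window.x + local_x, window.y + local_y)


if __name__ == "__main__":
    # Assumed example layout: two 1920x1080 displays side by side.
    layout = [Display("left", 0, 0, 1920, 1080), Display("right", 1920, 0, 1920, 1080)]
    for window in plan_overlay_windows(layout, primary_name="left"):
        print(window)
```

If the window manager ignores the requested `x`/`y` or resizes the window, every overlay can end up on the same display, which would match the symptom described in this issue.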

While this is far from perfect, it is so far the most reliable way I have found that works across platforms and setups.

The most common issues with this method are:

  1. The dimensions and/or scaling information that NormCap receives through its UI framework (Qt) are off. This can lead to a) wrong window dimensions, b) a wrong position to move to, c) a wrong translation of the size/position of the selected area. Unfortunately, all of this depends a lot on the system setup, specifically on the window manager and the scaling type (for Wayland: "fractional scaling" or not).
  2. The window manager interferes with window placement. Tiling window managers in particular (like AWM) are built to force a certain position and size onto applications, but this obviously conflicts with NormCap relying on sizing/positioning its windows by itself.

What you could try yourself: Most tiling window managers allow you to create exceptions for certain applications to exclude them from tiling and instead make their windows "float" on top of the tiled windows. You could try to configure such an exception for NormCap in AWM. I don't know AWM, and it is entirely possible that this alone doesn't improve anything, but it should be quite easy to try.

dynobo avatar Dec 01 '23 22:12 dynobo

What solved the same problem for me in XMonad: add a manageHook that ignores windows with

stringProperty "WM_NAME" =? "NormCap" --> doIgnore

markus1189 avatar Jan 03 '24 08:01 markus1189

Hi @Lonniebiz, the new NormCap 0.5.4 release includes a fix that should improve the robustness of window positioning. I'm not 100% sure whether this also works on AWM, but do you want to give it a try?

dynobo avatar Jan 16 '24 20:01 dynobo

@dynobo : When running NormCap-0.5.4-x86_64.AppImage on Debian 12 with AWM, it will only attempt to OCR capture on my top-right external monitor:

2024-01-17_08-36

Awesome has a configuration script that loads each time you log in. That script is located at: ~/.config/awesome/rc.lua

At the bottom of that script, this command is what sets the monitor configuration shown in the GUI above: awful.spawn('nvidia-settings --assign CurrentMetaMode="DP-0: nvidia-auto-select +1921+2160, DP-1: nvidia-auto-select +0+0, DP-3: nvidia-auto-select +3840+0"')

Before I used Nvidia video card drivers, that command was instead: awful.spawn.with_shell("xrandr --output eDP-1 --mode 3840x2160 --pos 3840x0 --primary --output DP-1 --mode 3840x2160 --pos 0x0 --output DP-2 --mode 3840x2160 --pos 7680x0")

In this monitor layout, the bottom monitor is my laptop's built-in monitor, and if I want to OCR on it, or on the top-left monitor, I have to move that window to the top-right external monitor before NormCap can OCR any contents of the target window.
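
To make the layout easier to compare with the coordinates NormCap would need, the `+X+Y` offsets in the CurrentMetaMode string above can be read as each output's top-left corner on the virtual desktop. Below is a small Python sketch that extracts them. The function name `parse_meta_mode` is made up for illustration, and the pattern only handles the simple `OUTPUT: MODE +X+Y` form with non-negative offsets seen in this thread, not the full MetaMode syntax.

```python
import re

# The MetaMode string quoted from rc.lua earlier in this comment.
META_MODE = (
    "DP-0: nvidia-auto-select +1921+2160, "
    "DP-1: nvidia-auto-select +0+0, "
    "DP-3: nvidia-auto-select +3840+0"
)


def parse_meta_mode(meta_mode):
    """Map each output name to its (x, y) top-left offset on the virtual desktop."""
    offsets = {}
    for entry in meta_mode.split(","):
        # Expected shape of one entry: "<output>: <mode> +<x>+<y>"
        match = re.match(r"\s*([\w-]+):\s*\S+\s*\+(\d+)\+(\d+)", entry)
        if match:
            name, x, y = match.groups()
            offsets[name] = (int(x), int(y))
    return offsets


if __name__ == "__main__":
    for name, (x, y) in parse_meta_mode(META_MODE).items():
        print(f"{name}: top-left corner at x={x}, y={y}")
```

An overlay window meant for a given output would have to land exactly at these coordinates.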

Here's a video that showcases doing a NormCap OCR capture on each monitor (in both 0.4.4 and 0.5.4): https://youtu.be/SYqhpu1T4Wo

Note 1: I've just uploaded that video, so it may take a while to process to full quality. When processing is complete, you can watch the video in 4K quality.

Note 2: 0.5.4 exhibits similar issues to 0.4.4, but 0.5.4 will not capture for me at all.

Lonniebiz avatar Jan 17 '24 19:01 Lonniebiz