nut.js
OCR is Inconsistent in Windows Context Menus
Short summary
I want to use OCR successfully and consistently in the context menus of my application.
Desired execution environment / tested on Microsoft Windows
- [ ] Virtual machine
- [ ] Docker container
- [x] Dev/Host system
node version: 16.20.2
OS type and version: Windows 10 Home v.22H2
Full code sample related to question
// get menubar region
let activeWin = await nut.getActiveWindow();
let pro_reg = await activeWin.region;
await nut.screen.highlight(pro_reg);
let menubar_reg = new nut.Region(pro_reg.left + 8, pro_reg.top + 8, 500, 30);
await nut.screen.highlight(menubar_reg);
// get menu bar item
let menuBarItem = await nut.screen.find(nut.singleWord(itemName), {
    confidence: 0.40,
    searchRegion: menubar_reg
});
// click menubar item
await nut.mouse.move(nut.centerOf(menuBarItem));
await nut.mouse.click(nut.Button.LEFT);
await nut.sleep(100);
// make menu region
let menu_reg = new nut.Region(menuBarItem.left + 25, (menuBarItem.top + menuBarItem.height + 5), 125, 470);
await nut.screen.captureRegion('just-region-cap', menu_reg, nut.FileType.PNG, './test-imagery/');
await nut.screen.highlight(menu_reg);
await nut.sleep(2000);
// find menu item (ocr)
let menuitem = await nut.screen.find(nut.textLine('Presentation Editor'), {
    confidence: 0.50,
    searchRegion: menu_reg,
    providerData: {
        preprocessConfig: {
            binarize: false
        }
    }
});
await nut.mouse.move(nut.centerOf(menuitem));
Detailed question
In another thread, this issue was discussed, and some solutions were proposed. Essentially, this is what I'm trying to do with this code.
- Get the region of my application.
- Locate the app menubar and select the item 'view'
- Manually constrain a search region around the text of the menu
  - getActiveWindow() doesn't work for menus on Windows
- Search in said region for "Presentation Editor"
  - disable preprocessing for easier locating
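The "manually constrain a search region" step could be sketched as a small pure function, which makes the offsets easier to adjust than hard-coded numbers. This is only a sketch: `menuRegionBelow` and its option names (`offsetX`, `gapY`, `width`, `height`) are hypothetical and not part of nut.js, and the default sizes are placeholders. It works on plain `{left, top, width, height}` objects and clamps the result to the given bounds so the region never extends past the screen.

```javascript
// Hypothetical helper, not part of nut.js: derive a search region directly
// below a menubar item and clamp it to the given bounds (e.g. the screen).
function menuRegionBelow(item, bounds, opts = {}) {
  // All option names and defaults are assumptions chosen for illustration.
  const { offsetX = 0, gapY = 5, width = 300, height = 470 } = opts;

  const left = Math.max(bounds.left, item.left + offsetX);
  const top = Math.max(bounds.top, item.top + item.height + gapY);
  const right = Math.min(bounds.left + bounds.width, left + width);
  const bottom = Math.min(bounds.top + bounds.height, top + height);

  return {
    left,
    top,
    width: Math.max(0, right - left),
    height: Math.max(0, bottom - top),
  };
}

// Example with made-up coordinates, using the offsets from the code sample above.
const exampleBounds = { left: 0, top: 0, width: 800, height: 600 };
const exampleItem = { left: 60, top: 10, width: 40, height: 20 };
console.log(menuRegionBelow(exampleItem, exampleBounds, { offsetX: 25, width: 125 }));
```

With nut.js, the returned values could then be passed to `new nut.Region(r.left, r.top, r.width, r.height)`. Since the bounds are a parameter, it is easy to experiment with a wider region in case longer menu entries are being cut off at the edge.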
This is the search region I narrowed it down to:
Many of the above menu items do work. ('Show', 'Action Palette', etc.)
However, this one, and possibly others, do not. So, my question is this: is there anything else I can try in order to make consistent OCR text matching a reality for my application?
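One thing that might be worth trying is to retry the same search with several option sets instead of a single fixed one. The following is a minimal sketch under stated assumptions: `buildSearchAttempts` and `findWithFallbacks` are hypothetical helpers, not nut.js APIs, and the confidence values are arbitrary. The option shape (`confidence`, `searchRegion`, `providerData.preprocessConfig.binarize`) is copied from the code sample above. The `find` function is passed in, so the sketch runs without nut.js installed.

```javascript
// Hypothetical helper: list the option objects to try, in order.
// Unbinarized attempts come first, each with descending confidence.
function buildSearchAttempts(searchRegion, confidences = [0.5, 0.4, 0.3]) {
  const attempts = [];
  for (const binarize of [false, true]) {
    for (const confidence of confidences) {
      attempts.push({
        confidence,
        searchRegion,
        providerData: { preprocessConfig: { binarize } },
      });
    }
  }
  return attempts;
}

// Hypothetical helper: call `find` with each option set until one resolves.
// `find` is injected by the caller, e.g. (q, o) => nut.screen.find(q, o).
async function findWithFallbacks(find, query, attempts) {
  let lastError;
  for (const options of attempts) {
    try {
      return await find(query, options);
    } catch (err) {
      lastError = err; // no match with these options, try the next set
    }
  }
  throw lastError ?? new Error("no search attempts were provided");
}
```

In the script above, this could be used as `await findWithFallbacks((q, o) => nut.screen.find(q, o), nut.textLine('Presentation Editor'), buildSearchAttempts(menu_reg))`. Lower confidence values raise the chance of a false match, so it is probably worth checking the matched region before clicking.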
Hi @joel-duffie 👋
I’ll do another iteration on preprocessing to improve OCR results. Alternatively, have you tried the Azure OCR plugin?
@joel-duffie Since the Azure OCR plugin seems to be working well, may I close this issue?
@s1hofmann that's fine with me.