Feature request: Add a settings page for managing installed OCR languages
Motivation
Some users might get confused when they see so few available OCR languages and have no idea how to install a new language.
The official way to install a new OCR language is to go to the Language page in Windows Settings and add the language to the Preferred languages list.
However, this method has several downsides.
- Not every installable language is a valid OCR language. In other words, not every language supported by Windows is supported by the Windows OCR engine. On my system there are only 35 valid OCR languages, but "Preferred languages" offers many more installable languages than that.
- Changing OCR languages can require a reboot to take effect, but Windows Settings doesn't tell you this. So a newly installed language may not show up in the available OCR languages until you happen to reboot your machine.
- Each installed "preferred language" can add a new item to your input language/IME list, the list that comes up when you press Win+Space. If you switch input languages often, having many extra entries just to be able to OCR those languages is annoying. However, you cannot remove a "preferred language" without also removing its corresponding OCR language.
So, I think that having a separate settings page for managing only OCR languages would be helpful, but unfortunately Windows settings doesn't have such a page. As Text Grab is a tool that utilizes this maybe-not-so-known feature of Windows, including such a settings page would be appreciated.
How to manage installed OCR languages using PowerShell
I'm not sure how to do this in C# for now, but here's some PowerShell code. Note that all of the following operations require elevated (administrator) privileges.
Get a list of all valid OCR languages on your system
Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~")}
It will return a list of OCR-related Windows capabilities, which are just the installable OCR languages.
PowerShell output on my system
Name : Language.OCR~~~ar-SA~0.0.1.0
State : Installed
Name : Language.OCR~~~bg-BG~0.0.1.0
State : NotPresent
Name : Language.OCR~~~bs-LATN-BA~0.0.1.0
State : NotPresent
Name : Language.OCR~~~cs-CZ~0.0.1.0
State : NotPresent
Name : Language.OCR~~~da-DK~0.0.1.0
State : NotPresent
Name : Language.OCR~~~de-DE~0.0.1.0
State : NotPresent
Name : Language.OCR~~~el-GR~0.0.1.0
State : NotPresent
Name : Language.OCR~~~en-GB~0.0.1.0
State : NotPresent
Name : Language.OCR~~~en-US~0.0.1.0
State : Installed
Name : Language.OCR~~~es-ES~0.0.1.0
State : NotPresent
Name : Language.OCR~~~es-MX~0.0.1.0
State : NotPresent
Name : Language.OCR~~~fi-FI~0.0.1.0
State : NotPresent
Name : Language.OCR~~~fr-CA~0.0.1.0
State : NotPresent
Name : Language.OCR~~~fr-FR~0.0.1.0
State : NotPresent
Name : Language.OCR~~~hr-HR~0.0.1.0
State : NotPresent
Name : Language.OCR~~~hu-HU~0.0.1.0
State : NotPresent
Name : Language.OCR~~~it-IT~0.0.1.0
State : NotPresent
Name : Language.OCR~~~ja-JP~0.0.1.0
State : Installed
Name : Language.OCR~~~ko-KR~0.0.1.0
State : Installed
Name : Language.OCR~~~nb-NO~0.0.1.0
State : NotPresent
Name : Language.OCR~~~nl-NL~0.0.1.0
State : NotPresent
Name : Language.OCR~~~pl-PL~0.0.1.0
State : NotPresent
Name : Language.OCR~~~pt-BR~0.0.1.0
State : NotPresent
Name : Language.OCR~~~pt-PT~0.0.1.0
State : NotPresent
Name : Language.OCR~~~ro-RO~0.0.1.0
State : NotPresent
Name : Language.OCR~~~ru-RU~0.0.1.0
State : NotPresent
Name : Language.OCR~~~sk-SK~0.0.1.0
State : NotPresent
Name : Language.OCR~~~sl-SI~0.0.1.0
State : NotPresent
Name : Language.OCR~~~sr-CYRL-RS~0.0.1.0
State : NotPresent
Name : Language.OCR~~~sr-LATN-RS~0.0.1.0
State : NotPresent
Name : Language.OCR~~~sv-SE~0.0.1.0
State : NotPresent
Name : Language.OCR~~~tr-TR~0.0.1.0
State : NotPresent
Name : Language.OCR~~~zh-CN~0.0.1.0
State : Installed
Name : Language.OCR~~~zh-HK~0.0.1.0
State : NotPresent
Name : Language.OCR~~~zh-TW~0.0.1.0
State : NotPresent
Get a list of only installed OCR languages on your system
Just filter the items by their State property value.
Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~") -and $_.State -eq [Microsoft.Dism.Commands.PackageFeatureState]::Installed}
Get the corresponding capability of a specific OCR language
All OCR-related capabilities have names like Language.OCR~~~<language>~<version>, so you can change the filter criterion to just match a single capability. An example would be:
Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~zh-CN")}
Then you can pass it to Add-WindowsCapability or Remove-WindowsCapability to install/uninstall the OCR language.
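If Text Grab ends up listing these capabilities itself, it would probably need to pull the language tag back out of each name. Here is a minimal C# sketch of that parsing, assuming only the Language.OCR~~~<language>~<version> pattern described above. The class and method names (OcrCapabilityName, TryParse) are made up for illustration and are not part of any existing API.

```csharp
using System;

static class OcrCapabilityName
{
    const string Prefix = "Language.OCR~~~";

    // Splits "Language.OCR~~~zh-CN~0.0.1.0" into language "zh-CN" and version "0.0.1.0".
    // Returns false (with empty outputs) for names that do not follow that pattern.
    public static bool TryParse(string name, out string language, out string version)
    {
        language = "";
        version = "";
        if (name is null || !name.StartsWith(Prefix, StringComparison.Ordinal))
            return false;

        // The last '~' separates the language tag from the version.
        int lastTilde = name.LastIndexOf('~');
        if (lastTilde <= Prefix.Length || lastTilde == name.Length - 1)
            return false; // missing language tag or missing version

        language = name.Substring(Prefix.Length, lastTilde - Prefix.Length);
        version = name.Substring(lastTilde + 1);
        return true;
    }
}
```

The extracted language tag (for example zh-CN or sr-CYRL-RS) can then be handed to CultureInfo to get a display name for the UI.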
Install an OCR language
For example zh-CN:
Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~zh-CN")} | Add-WindowsCapability -Online
Its result will tell you whether a restart is needed or not.
Uninstall an OCR language
For example zh-CN:
Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~zh-CN")} | Remove-WindowsCapability -Online
How those can be used in Text Grab
Of course you can try to invoke PowerShell in the program.
As those PowerShell scripts use the DISM APIs under the hood, you can invoke those APIs directly as well.
Other notes
- You can add an OCR language independently of the "preferred languages" setting.
- However, it seems that if you modify the "preferred languages" setting, the system can install/uninstall OCR languages to match your "preferred languages" list, so "additional" OCR languages may get removed.
@gexgd0419 a few years later I'm finally looking into this issue, and I don't know if it is possible to do it well. Here are the challenges; let me know if you have any solutions or other examples that would address them.
Issue 1: PowerShell as admin
- We need to run PowerShell code
- As an administrator
- Because of this we cannot use CliWrap
- We have to use Process.Start
- Process.Start is not recognizing consecutive PowerShell commands
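One possible way around the consecutive-commands problem, sketched here rather than tested inside Text Grab: join the statements into one script and pass it to powershell.exe through -EncodedCommand, which takes the script as Base64-encoded UTF-16LE text and so avoids quoting issues on the command line. The ElevatedPowerShell class and its method names are hypothetical. The "runas" verb requires UseShellExecute = true, which also means output cannot be redirected, so only the exit code comes back.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class ElevatedPowerShell
{
    // Joins several PowerShell statements into one script and encodes it
    // for the -EncodedCommand switch (Base64 of the UTF-16LE script text).
    public static string BuildArguments(params string[] commands)
    {
        string script = string.Join("; ", commands);
        string encoded = Convert.ToBase64String(Encoding.Unicode.GetBytes(script));
        return "-NoProfile -EncodedCommand " + encoded;
    }

    // Starts powershell.exe elevated. Windows shows a UAC prompt; if the user
    // declines it, Process.Start throws a Win32Exception.
    public static Process? Start(params string[] commands)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "powershell.exe",
            Arguments = BuildArguments(commands),
            UseShellExecute = true,   // required for the "runas" verb
            Verb = "runas",           // request elevation
            WindowStyle = ProcessWindowStyle.Hidden,
        };
        return Process.Start(startInfo);
    }
}
```

A caller could pass each statement as a separate string, for example the Get-WindowsCapability pipeline for one language, then wait for the process with WaitForExit and inspect ExitCode.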
Issue 2
- It takes a long time to install the languages
- Process.Start does not give feedback
- There will be little to no feedback if the installation stalls or fails
- Locking the Text Grab UI for a couple of minutes does not seem like a good user experience
I'm hoping there are fixes for these issues, but for now I have not found anything.
You can try the DISM API directly, which I think is what those PowerShell commands use under the hood.
Here's some code that uses the Microsoft.Dism NuGet package to use the DISM APIs.
using System;
using System.Globalization;
using System.Linq;
using Microsoft.Dism;

namespace TestDismCSharp
{
    internal class Program
    {
        static void ProgressCallback(DismProgress progress)
        {
            // Set progress.Cancel to true when you want to cancel the installation.
            Console.WriteLine($"Installation progress {progress.Current} of {progress.Total} ...");
        }

        static void Main(string[] args)
        {
            DismApi.Initialize(DismLogLevel.LogErrorsWarnings); // once per process

            using (var sess = DismApi.OpenOnlineSession())
            {
                Console.WriteLine("Currently installed OCR languages:");
                foreach (var cap in DismApi.GetCapabilities(sess))
                {
                    string capName = cap.Name;
                    if (!capName.StartsWith("Language.OCR~~~"))
                        continue;
                    if (cap.State != DismPackageFeatureState.Installed)
                        continue;

                    // The locale name sits between the prefix and the last '~' (before the version).
                    string localeName = capName["Language.OCR~~~".Length..capName.LastIndexOf('~')];
                    CultureInfo culture = new(localeName);
                    Console.WriteLine(culture.DisplayName);
                }

                Console.WriteLine();
                Console.WriteLine("Installing OCR language zh-CN...");
                var langToInstall = DismApi.GetCapabilities(sess).First(cap => cap.Name.StartsWith("Language.OCR~~~zh-CN"));
                DismApi.AddCapability(sess, langToInstall.Name, false, null, ProgressCallback, null);
                Console.WriteLine("Installation complete.");
            }

            DismApi.Shutdown();
        }
    }
}
If the main process doesn't have administrator privileges, you have to launch a separate process as an administrator to perform the installation.
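As a rough sketch of that step, assuming a Windows-only app: the code below checks whether the current process is already elevated and, if not, describes an elevated launch of a helper executable. The Elevation class, its method names, and any command-line arguments passed to the helper are hypothetical and only illustrate the idea.

```csharp
using System;
using System.Diagnostics;
using System.Security.Principal;

static class Elevation
{
    // True when the current process runs with the Administrators role (Windows only).
    public static bool IsElevated()
    {
        using var identity = WindowsIdentity.GetCurrent();
        return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
    }

    // Describes an elevated launch of a helper executable.
    // The caller passes the result to Process.Start, which shows the UAC prompt.
    public static ProcessStartInfo CreateElevatedStartInfo(string exePath, string arguments) =>
        new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = true, // required for the "runas" verb
            Verb = "runas",         // request elevation
        };
}
```

The main app could call Elevation.IsElevated() first and only start the helper process when it returns false, so users who already run elevated never see an extra prompt.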
The DISM API allows you to provide a progress callback function, which periodically tells your program the installation progress and allows you to cancel it. You can display a progress dialog with a Cancel button during that period. The DismApi.AddCapability function won't return until the installation either succeeds or fails (in which case a DismException will be thrown), so you may have to use a background thread to avoid blocking the UI.
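Here is one way the background-thread idea could look, as a sketch that deliberately does not reference the DISM package so that it stays self-contained. The blockingInstall delegate stands in for a blocking call such as DismApi.AddCapability; it is handed a callback that reports (current, total) and returns true when cancellation has been requested. The BackgroundInstall class and RunAsync method are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class BackgroundInstall
{
    // Runs a blocking installation on a thread-pool thread so the UI stays responsive.
    // 'blockingInstall' receives a callback; calling it reports progress and returns
    // true if the user has asked to cancel.
    public static Task RunAsync(
        Action<Func<int, int, bool>> blockingInstall,
        IProgress<(int Current, int Total)> progress,
        CancellationToken cancellationToken)
    {
        return Task.Run(() =>
            blockingInstall((current, total) =>
            {
                progress.Report((current, total));
                return cancellationToken.IsCancellationRequested;
            }));
    }
}
```

With the real API, the body of the DISM progress callback would presumably call the reporting function with progress.Current and progress.Total and assign its result to progress.Cancel. A Progress<T> instance created on the UI thread delivers its reports back on that thread, and any exception thrown by the installation surfaces when the returned task is awaited.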
Note that by default, it will throw a DismRebootRequiredException when the operation succeeds but requires a reboot to take effect.
You can also replace:
using (var sess = DismApi.OpenOnlineSession())
with:
using (var sess = DismApi.OpenOnlineSessionEx(new DismSessionOptions { ThrowExceptionOnRebootRequired = false }))
so that a DismRebootRequiredException won't be thrown on success, and then check sess.RebootRequired after installing.
This is great, thank you for all of the information!