Normcap sometimes crashes trying to recognize Kanji/Hanzi

Open CookieDoodle opened this issue 1 year ago • 0 comments

What happened?

I enabled Normcap to stay open in the system tray. But sometimes Normcap crashes trying to parse Kanji/Hanzi characters.

Initially, I thought it was just a problem with Chinese. But after trying to recognize both Japanese and Chinese text, then converting the Japanese text to Hiragana (no Kanji characters) and comparing the crash rates for Japanese with and without Kanji, I realized the problem was with Kanji characters in general.

The crashing happens around 40% of the time. It happens pretty consistently. (Chinese crashes more than Japanese, obviously.) And the crashing is not dependent on length of the text being parsed. Normcap can crash parsing one word or multiple paragraphs. It doesn't matter.

I don't know if it's certain characters that Normcap has issues with, as in, it fails to recognize a certain character and that's why it crashes. I'm not sure that is the case, though, since if Normcap crashes the first time it tries to recognize a word, it doesn't crash when you try to recognize the same word again.

Expected behavior: Normcap is supposed to stay open after parsing Japanese/Chinese text instead of the program closing.

How did you install NormCap?

FlatPak (Linux)

Operating System + Version?

SteamOS 3.5.17

[Linux only] Display Server (DS) + Desktop environment (DE)?

Wayland/KDE Plasma

Debug log output?*

11:22:16 - INFO    - normcap:49 - Start NormCap v0.5.4
11:22:16 - DEBUG   - normcap.gui.tray:77 - System info:
{'normcap_version': '0.5.4', 'python_version': '3.11.9', 'cli_args': '/app/bin/normcap -v debug', 'is_briefcase_package': False, 'is_flatpak_package': True, 'is_appimage_package': False, 'platform': 'linux', 'desktop_environment': <DesktopEnvironment.KDE: 2>, 'display_manager_is_wayland': False, 'pyside6_version': '6.6.1', 'qt_version': '6.6.1', 'qt_library_path': '/usr/share/runtime/lib/plugins, /app/lib/python3.11/site-packages/PySide6/Qt/plugins, /usr/bin', 'locale': 'DEFAULT', 'config_directory': PosixPath('/home/deck/.var/app/com.github.dynobo.normcap/config/normcap'), 'resources_path': PosixPath('/app/lib/python3.11/site-packages/normcap/resources'), 'tesseract_path': PosixPath('/app/bin/tesseract'), 'tessdata_path': PosixPath('/home/deck/.var/app/com.github.dynobo.normcap/config/normcap/tessdata'), 'envs': {'TESSDATA_PREFIX': '/app/share', 'LD_LIBRARY_PATH': ''}, 'screens': [Screen(left=0, top=0, right=1279, bottom=799, device_pixel_ratio=1.0, index=0, screenshot=None)]}
11:22:16 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (show_introduction: None)
11:22:16 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (cli_mode: False)
11:22:16 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (background_mode: False)
11:22:16 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (clipboard_handler: None)
11:22:16 - DEBUG   - normcap.gui.tray:394 - Another instance is already running. Sending capture signal.
11:22:16 - INFO    - normcap.gui.tray:610 - Exit normcap
11:22:16 - DEBUG   - normcap.gui.tray:611 - Debug images saved in /tmp/normcap
(deck@steamdeck ~)$ 

CookieDoodle avatar Apr 22 '24 09:04 CookieDoodle

Strawberry might be able to use "setlocal" at the top of any compiler batch file. That would allow PATH and other environment variables to be set as normal, with the changes all going away after the batch file exits.
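
For readers who don't write batch files: the point of "setlocal" is that environment changes are scoped and get undone on exit. A rough Python analogue of that scoping, as a sketch only (`scoped_path` and the directory used in the demo are made-up names for illustration, not anything Strawberry ships):

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_path(extra_dir):
    """Prepend extra_dir to PATH inside the with-block, then restore the old value."""
    saved = os.environ.get("PATH")
    os.environ["PATH"] = extra_dir if not saved else extra_dir + os.pathsep + saved
    try:
        yield
    finally:
        # Put the caller's PATH back, much as leaving a setlocal scope would.
        if saved is None:
            os.environ.pop("PATH", None)
        else:
            os.environ["PATH"] = saved


if __name__ == "__main__":
    before = os.environ.get("PATH")
    with scoped_path("example-toolchain-bin"):  # hypothetical directory
        print(os.environ["PATH"].startswith("example-toolchain-bin"))
    print(os.environ.get("PATH") == before)
```

Anything started inside the `with` block inherits the extended PATH; code running after it sees the original value again.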

LeeThompson avatar Jan 08 '20 03:01 LeeThompson

IMO removing c:\strawberry\c\bin from PATH will break a lot of things.

kmx avatar Feb 07 '20 08:02 kmx

The idea is to have it in the PATH for things that need it, started from e.g. some wrapper like EUMM, and not in the user's PATH.

rkitover avatar Feb 07 '20 16:02 rkitover

In that case it seems to be rather a feature request for EUMM

kmx avatar Feb 07 '20 16:02 kmx

Ok, but do you understand what the problem is? Try running e.g. cmake for a project from powershell with the default strawberry PATH entries, it will find all the mingw crap instead of the user's intended toolchain.

rkitover avatar Feb 07 '20 17:02 rkitover

An update on this. The problem is actually not as bad as I originally thought.

With Strawberry and MinGW installed from chocolatey, the chocolatey MinGW takes precedence in the PATH, with anything missing like gmake being picked up from strawberry/c/bin.

I built and installed an XS module with the chocolatey MinGW in the PATH with no issues.

As for cmake, while it will default to using the MinGW toolchain, simply specifying -DCMAKE_C_COMPILER=cl will make it select the Visual Studio toolchain instead. I will follow up with the cmake project about this default.

Likewise, specifying -DCMAKE_C_COMPILER=gcc will make it select the mingw toolchain, the chocolatey one in this case.

I still think this is bad, but at least not bad enough for me to avoid installing Strawberry entirely.

rkitover avatar Aug 14 '20 21:08 rkitover

If you're worried about having Strawberry always in your path, install the portable version to the location of your choice. That location is prepended to the PATH only when portableshell.bat (which is located in the "location of your choice") is executed.

I really don't see any need for Strawberry to do anything differently.

The practice of having the msi install append the relevant directories to the PATH is a long-established one. If there's already an identical item (e.g. make, or pkg-config, or gcc) in the PATH, then someone must have put it there - and the intent therefore is that the pre-existing item be used in preference to the one provided by Strawberry.

This contrasts with the portable version's portableshell.bat which puts everything at the beginning of the PATH. In this case, the view is that the user is accepting everything that Strawberry provides.
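
The append-versus-prepend difference comes down to "first match wins" when a command name is looked up along the PATH. A minimal sketch of that rule, using throwaway directories in place of real installs (the directory names are invented, and this ignores Windows details such as PATHEXT and how system and user PATH are combined):

```python
import os
import tempfile


def first_match(name, path_entries):
    """Return the first directory in path_entries that contains a file called name."""
    for entry in path_entries:
        if os.path.isfile(os.path.join(entry, name)):
            return entry
    return None


def demo():
    with tempfile.TemporaryDirectory() as root:
        existing = os.path.join(root, "existing")      # stands in for a tool the user already had
        strawberry = os.path.join(root, "strawberry")  # stands in for Strawberry's c\bin
        for d in (existing, strawberry):
            os.mkdir(d)
            open(os.path.join(d, "pkg-config"), "w").close()

        appended = [existing, strawberry]   # msi-style: Strawberry goes last
        prepended = [strawberry, existing]  # portableshell.bat-style: Strawberry goes first
        return (
            os.path.basename(first_match("pkg-config", appended)),
            os.path.basename(first_match("pkg-config", prepended)),
        )


if __name__ == "__main__":
    print(demo())
```

With the appended order the pre-existing copy is found; with the prepended order Strawberry's copy is found.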

I think the user should accept responsibility for taking care of these issues when they arise. It's not always pretty - I can recall renaming Strawberry's pkg-config to pkg-config-hide just to get it out of the way, so the MSYS2 utility of the same name would be used.

Cheers, Rob

sisyphus avatar Aug 15 '20 03:08 sisyphus

The issue is having a whole MinGW toolchain in your global PATH, rather than just perl and perl tools like perldoc, which is what people would expect. This can cause all kinds of problems with other tools, and is only done so that perl modules can build themselves. Perl modules should be able to build themselves without doing this.

rkitover avatar Aug 15 '20 14:08 rkitover

This is not an issue, but a requirement for Perl on Windows. In the unfortunate event that you have an alternate MinGW installed, you will have to be careful about your environment. There's no other way for Perl to function on Windows.

Please close this as there's simply no other way. Your work-around is to use the Portable installs and handle your own PATH how you like, but for the masses, this is how things have to be.

Perl use requires XS modules and third party libraries for developers to be able to write code. XS modules and those third-party libraries require MinGW on Windows as that's what Strawberry is built on. They require the versions of the compiler used to build Perl and the ones building the XS/libraries to match up enough that things behave well.

genio avatar Aug 15 '20 18:08 genio

You can yank the C compiler out of your own PATH unless you're installing things if that's your preference. However, some FFI modules and others will misbehave if there's no compiler available. https://metacpan.org/pod/FFI::Platypus::Bundle for example

genio avatar Aug 15 '20 18:08 genio

> This is not an issue, but a requirement for Perl on Windows. In the unfortunate event that you have an alternate MinGW installed, you will have to be careful about your environment. There's no other way for Perl to function on Windows.

I respectfully disagree. Perl functions just fine without a compiler, for things involving perl and not installing modules. For the 99% of cases anyway, not including the use case you mentioned.

Quite a lot of people also use Perl as just a tool and don't even care about any non-core modules, many of which come pre-installed with Strawberry anyway.

When I was testing this, I actually had to spend some time searching for an XS module to build, because everything I tried was already pre-installed.

> Please close this as there's simply no other way. Your work-around is to use the Portable installs and handle your own PATH how you like, but for the masses, this is how things have to be.

"This is how things have to be" why? There are many other simple alternatives to sticking a non-standard MinGW toolchain in the system global PATH by default. Which is actually quite rude. That is the actual issue.

For example, a shortcut could be installed for using cpan tools that does this. This is what Visual Studio does, I modify my own PowerShell profile to add Visual Studio tools to my PATH.

Or it could be an option that defaults to off, the user could easily modify their own PATH.

Or it could be a separate package entirely, with its own shortcut, etc.

> Perl use requires XS modules and third party libraries for developers to be able to write code. XS modules and those third-party libraries require MinGW on Windows as that's what Strawberry is built on. They require the versions of the compiler used to build Perl and the ones building the XS/libraries to match up enough that things behave well.

Python on Windows is in a similar situation, it is built with Visual Studio and requires Visual Studio installed to build extensions. But they do not do this for you automatically and stick it in the PATH, though they easily could.

For that matter, why is Strawberry incompatible with the standard MinGW toolchain? I should be able to run something like:

choco install -y StrawberryPerl mingw

and have everything work.

And when I tried, everything did in fact work, at least with the standard MinGW gcc; it does not come with some other things, like gmake.

Feel free to close this, I don't care about this that much, and I have a way to deal with it for myself which I described above, I just think it's wrong, from the perspective of Strawberry being a well-behaved Windows tool and development tool.

rkitover avatar Aug 15 '20 23:08 rkitover

It matters what version of MinGW the version of Strawberry you're using was built with. There are many versions of Strawberry around that are installable on your version of Windows, and they're not all built with the same MinGW. That version of MinGW from chocolatey would not work with, say, Strawberry Perl 5.16.

This isn't Python and wasn't built with Visual Studio, so that's not a great comparison. The installer for Strawberry installs a binary of the language as it was built at the time of release. In order to help you with all of the things I mentioned, it provides the packaged up MinGW it used as well.

Again, the fix here is to remove things out of the PATH that you don't want or to use the Portables and only put on the PATH what you want.

genio avatar Aug 15 '20 23:08 genio

It's true that the path is easily removable in your own environment, for example putting this line into your PowerShell $profile will do it:

$env:PATH = ($env:PATH -split ';' | ?{ $_ -notmatch '\\Strawberry\\c\\bin$' }) -join ';'

rkitover avatar Aug 16 '20 02:08 rkitover

Putting strawberry perl in the path is also a problem for me, as it adds its own pkg-config binary, and I have my own version of pkg-config, so it messes up my GStreamer build. It should at least be an option during install, with default being to NOT add it to the path.

boxerab avatar Oct 02 '20 17:10 boxerab

Unfortunately, we can't account for every non-vanilla machine build out there. We need these paths in the environment to allow Perl to function as expected. Knowing how the install works and the paths it puts in your $env:PATH you can alter that path yourself to account for your machine setup after install.

Do note that you may run into issues with installing Perl modules that rely on those things being in the path and being what came with the install.

genio avatar Oct 02 '20 19:10 genio

Fair enough - how about a large warning during install, stating that the PATH will be modified which may affect the functioning of other programs. And put this in the Release notes too. In my case, after installing Strawberry Perl, my GStreamer build suddenly started failing, and I had no idea why.

boxerab avatar Oct 02 '20 19:10 boxerab

I second this issue. Here is a little story why:

We have an internal toolchain for unit testing based on MinGW. We use lcov to get the code coverage, which requires Perl. When I upgraded the Perl dependency of our package manager to StrawberryPerl, gcov spit out a bunch of errors. It took me awhile to figure out what happened, but when I realized that StrawberryPerl ships with MinGW, it all made sense, because lcov happened to call the wrong version of gcov.

For somebody like me who merely upgrades a dependency, it is somewhat unexpected that StrawberryPerl is not just Perl, but a collection of tools including Perl. As there are probably many users who migrate from perl-with-an-incompatible-license version to StrawberryPerl, I think it would be great to provide an upgrade path.

I would like to see a StrawberryPerl light version that includes just the core elements of Perl. My gut feeling also tells me that the inclusion of an entire compiler package is going to bite back sooner or later. At the very least, we should get the MinGW folder out of the PATH variable. That would limit the problem of some component accidentally picking up an executable (or worse, just a DLL).

eur2fe avatar Apr 22 '21 13:04 eur2fe

The problem is that for Windows, MinGW is a core part of Perl and it has to be the same version of MinGW that Perl was built with. There is no fix to this other than managing your own PATH as we discussed above.

genio avatar Apr 22 '21 14:04 genio

On Thu, Apr 22, 2021 at 11:33 PM eur2fe @.***> wrote:

> I would like to see a StrawberryPerl light version that includes just the core elements of Perl.

You can, of course, just remove or hide (by renaming) Strawberry's "c" folder, and perl will still be functional. That will work because copies of the MinGW dlls that perl needs are also located in Strawberry's perl/bin folder.

One of those dlls is libwinpthread-1.dll, which gcov also needs ... so even if you do get rid of Strawberry's MinGW installation, you're still faced with having 2 different (possibly incompatible ?) libwinpthread-1.dll files in your PATH .... unless you instead keep Strawberry's MinGW installation and get rid of that other MinGW (which is how I would be trying to arrange things).

Having 2 different dlls with the same name in one's PATH is best avoided, if possible. It can sometimes be difficult (even impossible, as a worst case scenario) to have your app load the one that needs to be loaded. What was the actual error you were getting before you worked out what was going wrong?

The other two MinGW dlls in perl/bin are libstdc++-6.dll and libgcc_s_seh-1.dll, though, depending upon which build of Strawberry Perl you have, the second of those 2 files might instead be named libgcc_s_dw2-1.dll or libgcc_s_sjlj-1.dll
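
To check whether a machine has this kind of clash, one option is to scan the PATH for dll names that show up in more than one directory. A small sketch (the function name is made up for illustration; it only lists candidates and says nothing about which copy a given program will actually load):

```python
import os
from collections import defaultdict


def find_duplicates(path_entries, suffix=".dll"):
    """Map each file name ending in suffix to the directories containing it,
    keeping only names that occur in more than one directory."""
    seen = defaultdict(list)
    visited = set()
    for entry in path_entries:
        key = os.path.normcase(os.path.abspath(entry))
        if key in visited:
            continue  # the same directory listed twice is not a clash
        visited.add(key)
        try:
            names = os.listdir(entry)
        except OSError:
            continue  # skip missing or unreadable PATH entries
        for name in names:
            if name.lower().endswith(suffix):
                seen[name.lower()].append(entry)
    return {name: dirs for name, dirs in seen.items() if len(dirs) > 1}


if __name__ == "__main__":
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    for name, dirs in sorted(find_duplicates(entries).items()):
        print(name, "->", dirs)
```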

Cheers, Rob

sisyphus avatar Apr 22 '21 15:04 sisyphus

I have read the discussion, and I understand that you want to keep MinGW in the distribution. On the other hand, we have users who have a requirement to choose an arbitrary compiler, which must not conflict with the MinGW distribution used by StrawberryPerl.

I think a solution that serves these two conflicting requirements is to not rely on modifying the system or user PATH variable (BTW, I like the installer modification of the system/user PATH variable to be optional).

Am I correct to assume that MinGW is included to compile stuff on the fly, such as XS modules? That would mean that only the child processes of perl.exe must be able to do so. A quick and dirty solution could be to modify the PATH at runtime, i.e. only in the environment of the child process to be launched for compilation.
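
As a sketch of what "only in the environment of the child process" could look like, here is the idea in Python rather than Perl. `run_with_toolchain` and the directory name are hypothetical; nothing here is an existing Strawberry or EUMM feature:

```python
import os
import subprocess
import sys


def run_with_toolchain(cmd, toolchain_dir):
    """Run cmd with toolchain_dir prepended to PATH in the child's environment only."""
    env = os.environ.copy()
    current = env.get("PATH")
    env["PATH"] = toolchain_dir if not current else toolchain_dir + os.pathsep + current
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # The child simply reports the first PATH entry it sees.
    child = [sys.executable, "-c",
             "import os; print(os.environ['PATH'].split(os.pathsep)[0])"]
    result = run_with_toolchain(child, "toolchain-bin")  # hypothetical directory
    print(result.stdout.strip())
```

The parent's own PATH is never touched, so other tools launched by the user keep resolving their own compiler.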

eur2fe avatar Apr 22 '21 15:04 eur2fe

@sisyphus: Thanks for the reply. I actually did not need any help, I just wanted to provide feedback from a corporate point of view. The thing is that someone at our company maintains perl packages, and that we combine many packages in an environment (including MinGW compilers) to do stuff. It just sucks when certain combinations break. Fiddling with the path may work at home, but it is not a good solution for us.

eur2fe avatar Apr 26 '21 09:04 eur2fe

The Meson build system looks up dependencies using pkg-config, if pkg-config is available. If not, it will find some things using specialized system probe rules, or fallback to fetching the dependency as a wrapped subproject.

Strawberry Perl breaks this, because the non-functional pkg-config it installs claims that stuff like zlib is successfully found, but then that cannot actually be used during building, and unfortunately that means it didn't try to find a working version instead.

Why is this a problem, you say? Just remove it from your PATH temporarily.

Well, it is a problem because I never heard of Strawberry Perl in my life before, I do not use it, I do not want to use it, and I certainly don't want its pkg-config implementation. So why do I have it anyway? Because... a github actions CI image uses runs-on: windows-latest, and apparently, through the grace of Github, the "windows-latest" CI runner helpfully includes a variety of tools out of the box which includes... perl. Apparently Strawberry Perl at that.

I don't even need or use perl in this CI, but hey, thanks anyway. I guess now I know what Strawberry Perl is, but given the precise manner of my introduction to it, I suspect I'll try to avoid it in future.

This was annoying to debug and it really feels like I should not have to hack my CI because Github is popularizing your bad design decisions.

Please stop adding broken tools to the global environment, which are only intended to be used for a private local environment.

eli-schwartz avatar Oct 10 '21 21:10 eli-schwartz

> your bad design decisions

The history of this thread contains ample explanation as to why the things that are installed are required.

I would suggest you not let Github Actions CI install things to your main environment; as you noted it installs quite a lot of things, and these may get in the way of your normal preferences.

karenetheridge avatar Oct 10 '21 23:10 karenetheridge

I'm not sure how the pkg-config.bat can be useful to anyone:

PS C:\> pkg-config --libs zlib
-lz
PS C:\> pkg-config --cflags zlib

PS C:\> cat C:\Strawberry\c\lib\pkgconfig\zlib.pc
prefix=/usr/local
exec_prefix=/usr/local
libdir=${pcfiledir}/../../lib
sharedlibdir=${pcfiledir}/../../lib
includedir=${pcfiledir}/../../include

Name: zlib
Description: zlib compression library
Version: 1.2.11

Requires:
Libs: -L${libdir} -lz
Cflags: -I${includedir}

You'll never be able to find headers with that...
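
For reference, this is how the variables in that zlib.pc would resolve under ordinary .pc substitution, where `${pcfiledir}` stands for the directory holding the .pc file. It is a simplified sketch written for this thread, not the logic of Strawberry's pkg-config.bat, and it does not explain why the real command printed an empty `--cflags`:

```python
import re

ZLIB_PC = """\
prefix=/usr/local
exec_prefix=/usr/local
libdir=${pcfiledir}/../../lib
sharedlibdir=${pcfiledir}/../../lib
includedir=${pcfiledir}/../../include

Name: zlib
Description: zlib compression library
Version: 1.2.11

Requires:
Libs: -L${libdir} -lz
Cflags: -I${includedir}
"""


def expand_pc(text, pcfiledir):
    """Return the keyword fields of a .pc file with ${var} references expanded."""
    variables = {"pcfiledir": pcfiledir}
    fields = {}

    def expand(value):
        # Substitute each ${name}; names that were never defined become empty.
        return re.sub(r"\$\{(\w+)\}", lambda m: variables.get(m.group(1), ""), value)

    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        var = re.match(r"(\w+)=(.*)$", line)
        if var:  # variable definition: name=value
            variables[var.group(1)] = expand(var.group(2).strip())
            continue
        key, sep, value = line.partition(":")
        if sep:  # keyword field: Name: value
            fields[key.strip()] = expand(value.strip())
    return fields


if __name__ == "__main__":
    fields = expand_pc(ZLIB_PC, "C:/Strawberry/c/lib/pkgconfig")
    print(fields["Cflags"])
    print(fields["Libs"])
```

Under these rules the Cflags field would expand to an include path two levels above the pkgconfig directory, which can be compared against what the tool actually prints.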

xclaesse avatar Oct 11 '21 00:10 xclaesse

> I would suggest you not let Github Actions CI install things to your main environment; as you noted it installs quite a lot of things, and these may get in the way of your normal preferences.

The Windows image is provided by GitHub, and it includes Strawberry Perl. We have no control over that, AFAIK.

xclaesse avatar Oct 11 '21 00:10 xclaesse

It's ample explanation that I object to and disagree with (installing the perl script interpreter does NOT mean installing the perl module compilation environment), so I reiterate:

your bad design decisions.

And it is a bit difficult for me to just say "hey, don't let github actions CI install things" when it comes embedded in the VM image rather than being "installed in CI". Unless your advice is to just not use Github workflows at all?

It is even more difficult for me to say "hey, as a build system in wide use by people who also use github actions, I don't have to care that this is broken out of the box on github actions". Maybe no one should use Github Actions. Because that's sure going to be the kind of advice everyone happily accepts!

Realistically speaking, there is only one solution here and that is the one I've PRed above. Meson will learn to detect when Strawberry Perl is installed, and blacklist its broken pkg-config because it is badly behaved and we do not want it corrupting our build system.

This problem is not exclusive to our regression testing and WrapDB services. When discussing the problem on IRC and trying to figure out why the zlib system dependency was found but did not have zlib.h, TWO projects chimed in and said they had the same issue. One of them simply added a custom script which emitted a fatal warning if Strawberry Perl was detected for any reason whatsoever. The other project was manually fiddling with the entire installation so that the perl interpreter and dlls were available, for linking to, but not the broken build environment.

So yes, we will add special code to the Meson build system to detect if Strawberry perl is installed, raise a warning, and ignore your broken stuff. Thank you for your bad design decisions.
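
To give a rough idea of what such a check could amount to, here is a path-based heuristic. This is an illustrative sketch only, not Meson's actual code; the function names and the example paths are assumptions:

```python
import ntpath


def looks_like_strawberry(tool_path):
    """Heuristic: True if any directory component of tool_path is named 'Strawberry'."""
    parts = ntpath.normpath(tool_path).lower().split("\\")
    return "strawberry" in parts[:-1]


def pick_pkg_config(candidates):
    """Return the first candidate that does not look like Strawberry's, or None."""
    for path in candidates:
        if not looks_like_strawberry(path):
            return path
    return None


if __name__ == "__main__":
    found = [
        r"C:\Strawberry\perl\bin\pkg-config.bat",  # hypothetical location
        r"C:\msys64\mingw64\bin\pkg-config.exe",   # hypothetical location
    ]
    print(pick_pkg_config(found))
```

A check like this would miss installs in a renamed directory, so a real implementation would probably need something sturdier than a directory-name match.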

Because apparently alone of all programming languages, Strawberry Perl not only needs to ship its own compiler toolchain so people can use it to build more extensions... it needs to stick that unconditionally in the PATH.

This is handing people a loaded gun and tricking them into shooting themselves in the foot, and justifying it with "well anyone who installed Strawberry Perl obviously wanted to use Strawberry GCC and Strawberry Make and Strawberry pkg-config as their one and only exclusive compiler toolchain, also what is this "PATH has multiple things" stuff you're talking about". And then, sure enough, github shoots the loaded gun at the flagship CI image's preinstalled software.

eli-schwartz avatar Oct 11 '21 00:10 eli-schwartz

See for example GStreamer project: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/blob/main/meson.build#L29

xclaesse avatar Oct 11 '21 00:10 xclaesse

> And it is a bit difficult for me to just say "hey, don't let github actions CI install things" when it comes embedded in the VM image rather than being "installed in CI".

Ok, sorry, I thought you were talking about installations on your local machine.

karenetheridge avatar Oct 11 '21 00:10 karenetheridge

Removing Perl from $env:PATH is quite simple. Since you know it's installed on the GH Actions image, go ahead and remove those entries from your PATH. Perl requires those things in order to build many modules that are written in XS.

Yelling at us and being extremely rude does nothing to solve your actual problem.

If you don't want the version of pkg-config or MinGW we use in the Perl install, then remove them from your PATH. Perl puts three paths in the PATH environment. Remove those and anything else GitHub puts in the image by default that you don't need. Yelling at other people who have no say over how GitHub builds their images does you no good.
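
For a CI step that wants to do this without PowerShell, the filtering can be sketched in Python. The install root and the three entries in the demo are assumptions about a default install (the thread only says there are three); `strip_entries` is a made-up helper name:

```python
import ntpath


def strip_entries(path_value, root=r"C:\Strawberry", sep=";"):
    """Return path_value without entries located at or under root (Windows-style paths)."""
    root_norm = ntpath.normcase(ntpath.normpath(root))
    kept = []
    for entry in path_value.split(sep):
        norm = ntpath.normcase(ntpath.normpath(entry)) if entry else ""
        if norm == root_norm or norm.startswith(root_norm + "\\"):
            continue  # drop the install root and anything beneath it
        kept.append(entry)
    return sep.join(kept)


if __name__ == "__main__":
    value = (r"C:\Windows\system32;C:\Strawberry\c\bin;"
             r"C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\tools")
    print(strip_entries(value))
```

The comparison is case-insensitive and only matches whole directory components, so an unrelated directory whose name merely starts with "Strawberry" is left alone.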

genio avatar Oct 11 '21 00:10 genio

@eli-schwartz @xclaesse You're barking up the wrong tree. We have absolutely no control over what GitHub puts in their images. You should ask them to remove Strawberry from %PATH%. Even if we made Strawberry not pollute %PATH% with its mingw binaries today, it won't fix your issue because the images will be still using the old version of Strawberry.

xenu avatar Oct 11 '21 00:10 xenu