
PyOsmium script hangs indefinitely on Windows

Open gabortim opened this issue 10 months ago • 3 comments

After installing Osmium and running a script using PyOsmium on Windows, the execution never terminates. The script runs but does not exit as expected.

Steps to reproduce:

  1. Install PyOsmium in a standard library virtual environment (venv)
  2. Run the following example from the documentation:
import osmium

for obj in osmium.FileProcessor('test.osm'):
    print(obj)

Expected behavior: The script should process the input file, print the result and exit.

Actual behavior: The script prints the result and does not terminate.


OS: Windows 10 22H2 (build 19045.5371)
Python version: 3.12.8
PyOsmium version: 4.0.2
Installation method: Installed via pip in a standard venv
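
To check for the hang without watching the console, the example can be run in a child process with a timeout. This is only a sketch: the helper name `run_with_timeout`, the 30-second limit and the three result strings are my own choices, and `test.osm` is assumed to exist in the working directory.

```python
import subprocess
import sys

# The example from the documentation, run in a separate interpreter.
REPRO = """
import osmium

for obj in osmium.FileProcessor('test.osm'):
    print(obj)
"""


def run_with_timeout(script: str, timeout: float = 30.0) -> str:
    """Run `python -c script` and report 'ok', 'failed' or 'hung'."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", script],
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising the timeout.
        return "hung"
    # A non-zero exit code (e.g. osmium not installed) is not a hang.
    return "ok" if proc.returncode == 0 else "failed"


if __name__ == "__main__":
    print(run_with_timeout(REPRO))
```

A result of `hung` means the child was still running when the timeout expired, which matches the behavior described above.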

gabortim avatar Feb 01 '25 11:02 gabortim

Already discussed at length in #233.

This needs somebody with access to a Windows machine to find the root cause.

lonvia avatar Feb 06 '25 10:02 lonvia

What I see is CPU cycles burning in the kernel, in ntdll.NtWaitForAlertByThreadId+14. Probably a deadlock?

By the way, I usually develop in a virtual machine, so the OS type is not a concern.

gabortim avatar Feb 14 '25 09:02 gabortim

~@lonvia I noticed a related PR that might already address this issue. Could you confirm?~

Update: Sorry, that PR only changes a local file and isn’t related to pyosmium, so it probably doesn’t fix this issue. However, I noticed that the thread handling and file reader were modified there, and something similar might be needed here as well.

gabortim avatar Dec 19 '25 07:12 gabortim

Not quite sure why that PR references our issues, although they are probably right that the problem is somewhere with the implicitly allocated thread pool.

@gabortim Can you give the branch https://github.com/lonvia/pyosmium/tree/refs/heads/reader-with-pool a try? This is a quick hack to give every reader their own thread pool to work with.
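
As an analogy only (the actual change lives in the C++ bindings, and none of the names below come from pyosmium), the "one pool per reader" idea looks roughly like this in pure Python. `PooledReader` is a hypothetical class: it owns its worker pool and shuts it down when the reader is closed, so no worker threads are left for process exit to deal with.

```python
from concurrent.futures import ThreadPoolExecutor


class PooledReader:
    """Hypothetical reader that owns its own worker pool (illustration only)."""

    def __init__(self, chunks, workers=2):
        self._chunks = list(chunks)
        # The pool belongs to this reader, not to the module or the process.
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Join the workers while the interpreter is still fully alive.
        self._pool.shutdown(wait=True)
        return False

    def __iter__(self):
        # Stand-in for decoding: process each chunk on the reader's pool.
        yield from self._pool.map(str.upper, self._chunks)


if __name__ == "__main__":
    with PooledReader(["node", "way", "relation"]) as reader:
        for item in reader:
            print(item)
```

With a module-global pool, by contrast, the workers outlive every reader and can only be cleaned up during interpreter or process shutdown.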

lonvia avatar Dec 19 '25 20:12 lonvia

Well, I have good news, now it terminates :) Thanks for the fix!

I tested osmium-4.2.0-cp314-cp314-win_amd64.whl on Windows 11 (26200.7462) + Python 3.14.2, but please give me some more time to confirm across different projects.

gabortim avatar Dec 19 '25 21:12 gabortim

Great news, thanks for testing. Don't worry too much about complete testing yet. This is only a quick prototype, not a full implementation. Some parts of the code might still cause hangs. But at least I know where the likely root cause is.

lonvia avatar Dec 20 '25 08:12 lonvia

@gabortim would you mind also testing branch https://github.com/lonvia/pyosmium/tree/refs/heads/global-pool-for-readers ? This uses a module-global pool and would be a bit cleaner to implement.

lonvia avatar Dec 20 '25 10:12 lonvia

Unfortunately this module-global pool version does not terminate.

gabortim avatar Dec 20 '25 11:12 gabortim

Is it somehow possible to provide debug symbols (pdb) for the C++ bindings? I now see that a breakpoint(?) is being hit in .venv\Lib\site-packages\osmium\io.cp314-win_amd64.pyd, but I have no idea what generates that file or how I can inspect it further without setting up the whole build environment on Windows, which is not easy judging by the README.

0:000> !analyze -v
..............................................
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

*** WARNING: Unable to verify checksum for io.cp314-win_amd64.pyd

KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.mSec
    Value: 390

    Key  : Analysis.Elapsed.mSec
    Value: 1095

    Key  : Analysis.IO.Other.Mb
    Value: 0

    Key  : Analysis.IO.Read.Mb
    Value: 1

    Key  : Analysis.IO.Write.Mb
    Value: 0

    Key  : Analysis.Init.CPU.mSec
    Value: 171

    Key  : Analysis.Init.Elapsed.mSec
    Value: 2066

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 86

    Key  : Analysis.Version.DbgEng
    Value: 10.0.29482.1003

    Key  : Analysis.Version.Description
    Value: 10.2509.29.03 amd64fre

    Key  : Analysis.Version.Ext
    Value: 1.2509.29.3

    Key  : Failure.Bucket
    Value: BREAKPOINT_80000003_io.cp314-win_amd64.pyd!Unknown

    Key  : Failure.Exception.Code
    Value: 0x80000003

    Key  : Failure.Hash
    Value: {9fd103f8-5396-3fa0-f2b3-d10a7b143bea}

    Key  : Failure.ProblemClass.Primary
    Value: BREAKPOINT

    Key  : Faulting.IP.Type
    Value: Null

    Key  : Timeline.OS.Boot.DeltaSec
    Value: 755691

    Key  : Timeline.Process.Start.DeltaSec
    Value: 16

    Key  : WER.OS.Branch
    Value: ge_release

    Key  : WER.OS.Version
    Value: 10.0.26100.1

    Key  : WER.Process.Version
    Value: 3.14.2150.1013


FILE_IN_CAB:  python.dmp

NTGLOBALFLAG:  0

APPLICATION_VERIFIER_FLAGS:  0

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 0000000000000000
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 0

FAULTING_THREAD:  1cf4

PROCESS_NAME:  python.exe

ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.

EXCEPTION_CODE_STR:  80000003

STACK_TEXT:  
00000070`908df2c8 00007ff8`08174d4e     : 00000248`057d007c 00000248`057e2c20 00000248`057e2a78 00007ff8`08103687 : ntdll!NtWaitForAlertByThreadId+0x14
00000070`908df2d0 00007ff8`058de268     : 00000000`00000000 00007fff`d3781e38 00000070`908df390 00000070`908df480 : ntdll!RtlSleepConditionVariableSRW+0x1de
00000070`908df370 00007fff`d375d610     : 00000000`00000000 00007fff`d37a2d9f 0002af4c`f40da744 00000000`0000000a : KERNELBASE!SleepConditionVariableSRW+0x38
00000070`908df3b0 00007fff`d3745219     : 0002af4c`f40da744 000013d1`71d011e0 00000000`00989680 7fffffff`ffffffff : io_cp314_win_amd64!PyInit_io+0x20910
00000070`908df3f0 00007fff`d3762a53     : 0002af4c`f37510c4 00000070`908df480 00000000`0000000a 00000000`00000001 : io_cp314_win_amd64!PyInit_io+0x8519
00000070`908df450 00007ff8`0579bc75     : 00000248`060ad7e0 00000248`057efaa0 00000248`058a3d10 00007ff8`05764375 : io_cp314_win_amd64!PyInit_io+0x25d53
00000070`908df480 00007ff8`0579b897     : 00000070`908df500 00000070`908df558 00000070`908df840 00007ff8`058888c0 : ucrtbase!<lambda_f03950bc5685219e0bcd2087efbe011e>::operator()+0xa5
00000070`908df4d0 00007ff8`0579b84d     : 00000000`00000000 00000000`00000001 00000000`00000000 00000070`908df548 : ucrtbase!__crt_seh_guarded_call<int>::operator()<<lambda_7777bce6b2f8c936911f934f8298dc43>,<lambda_f03950bc5685219e0bcd2087efbe011e> &,<lambda_3883c3dff614d5e0c5f61bb1ac94921c> >+0x3b
00000070`908df500 00007fff`d375df31     : 00007fff`d3781528 00000000`00000002 00007ff8`00000002 00000070`908df540 : ucrtbase!execute_onexit_table+0x3d
00000070`908df540 00007fff`d375e051     : 00000000`00000001 00000248`057e1370 00000000`00000000 00007ff8`080ea6af : io_cp314_win_amd64!PyInit_io+0x21231
00000070`908df580 00007ff8`0823f86e     : 00007fff`d3700000 00000248`00000000 00000000`00000001 00000000`7ffe0385 : io_cp314_win_amd64!PyInit_io+0x21351
00000070`908df5e0 00007ff8`080ebcae     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!LdrpCallInitRoutineInternal+0x22
00000070`908df610 00007ff8`0816d4af     : 00000248`07d3a260 00007fff`d3700000 00000000`00000000 00007ff8`03fe5650 : ntdll!LdrpCallInitRoutine+0x10e
00000070`908df680 00007ff8`0816c67e     : 00000000`00000000 00000000`00000000 00000248`057df690 00000000`00000000 : ntdll!LdrShutdownProcess+0x17f
00000070`908df790 00007ff8`073c18ab     : 00000000`00000000 00000000`00000000 ffffffff`fffffffe 00000000`00000000 : ntdll!RtlExitUserProcess+0x9e
00000070`908df7c0 00007ff8`057f0093     : 00000000`00000000 00000000`00000000 00000000`00000000 00000070`908df810 : kernel32!ExitProcessImplementation+0xb
00000070`908df7f0 00007ff7`0a6c1297     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000001 : ucrtbase!common_exit+0xc7
00000070`908df850 00007ff8`073ae8d7     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : python+0x1297
00000070`908df890 00007ff8`0816c53c     : 00000000`00000000 00000000`00000000 000004f0`fffffb30 000004d0`fffffb30 : kernel32!BaseThreadInitThunk+0x17
00000070`908df8c0 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x2c


STACK_COMMAND: ~0s; .ecxr ; kb

SYMBOL_NAME:  io_cp314_win_amd64+5d610

MODULE_NAME: io_cp314_win_amd64

IMAGE_NAME:  io.cp314-win_amd64.pyd

FAILURE_BUCKET_ID:  BREAKPOINT_80000003_io.cp314-win_amd64.pyd!Unknown

OS_VERSION:  10.0.26100.1

BUILDLAB_STR:  ge_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {9fd103f8-5396-3fa0-f2b3-d10a7b143bea}

Followup:     MachineOwner
---------
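
Read from the bottom up, the stack suggests that the wait happens during process exit: `common_exit` leads to `LdrShutdownProcess`, which calls into the io module, which runs `execute_onexit_table` and ends up in `SleepConditionVariableSRW`. The frames inside the io module are unresolved (`PyInit_io+offset`), so which function is actually waiting remains unknown. A small sketch for condensing such a trace into the sequence of modules involved, using a few lines copied from STACK_TEXT above (the function name `modules_in_order` is my own):

```python
SAMPLE = r"""
00000070`908df2c8 00007ff8`08174d4e     : 00000248`057d007c 00000248`057e2c20 00000248`057e2a78 00007ff8`08103687 : ntdll!NtWaitForAlertByThreadId+0x14
00000070`908df370 00007fff`d375d610     : 00000000`00000000 00007fff`d37a2d9f 0002af4c`f40da744 00000000`0000000a : KERNELBASE!SleepConditionVariableSRW+0x38
00000070`908df3b0 00007fff`d3745219     : 0002af4c`f40da744 000013d1`71d011e0 00000000`00989680 7fffffff`ffffffff : io_cp314_win_amd64!PyInit_io+0x20910
00000070`908df500 00007fff`d375df31     : 00007fff`d3781528 00000000`00000002 00007ff8`00000002 00000070`908df540 : ucrtbase!execute_onexit_table+0x3d
00000070`908df680 00007ff8`0816c67e     : 00000000`00000000 00000000`00000000 00000248`057df690 00000000`00000000 : ntdll!LdrShutdownProcess+0x17f
00000070`908df850 00007ff8`073ae8d7     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000001 : python+0x1297
"""


def modules_in_order(stack_text: str) -> list[str]:
    """Return the modules of a WinDbg stack, innermost frame first.

    Consecutive frames from the same module are collapsed into one entry.
    """
    modules: list[str] = []
    for line in stack_text.strip().splitlines():
        # The frame description follows the last " : " separator.
        frame = line.rsplit(" : ", 1)[-1].strip()
        # Drop the "+0x..." offset, then keep the part before "!".
        location = frame.rsplit("+0x", 1)[0]
        module = location.split("!", 1)[0]
        if not modules or modules[-1] != module:
            modules.append(module)
    return modules


if __name__ == "__main__":
    print(" <- ".join(modules_in_order(SAMPLE)))
```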

gabortim avatar Dec 20 '25 16:12 gabortim

Try adding cmake.build-type = "Debug" in your pyproject.toml file here.

lonvia avatar Dec 20 '25 20:12 lonvia

Thanks, but unfortunately I had no luck even with the most basic pre-build CMake config.

I am currently using CLion with CMake 4.1.2, and libosmium fails before anything else (see osmcode/libosmium/issues/391). Some dependencies are resolved via FetchContent_Declare, while others require a manual download. So I rewrote pyosmium's CMake config, but it failed once again in libosmium's CMake (osmcode/libosmium/issues/399), this time with a different issue. After a few hours of trial and error, I gave up trying to figure out the logic behind the mix-and-match approach to dependency resolution.

Once the build system is in somewhat better shape, I'll try again.

gabortim avatar Dec 21 '25 09:12 gabortim

This should be fixed by #311.

It would be great if Windows users could confirm that with their setup. You can download ready-to-use Python wheels from the CI runs: get pyosmium-win64-dist.zip, unpack it and install the wheel fitting your python version using pip install.
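
For picking "the wheel fitting your python version" from the unpacked archive, a minimal sketch follows. It assumes the wheels follow the standard naming scheme seen earlier in this thread (`osmium-4.2.0-cp314-cp314-win_amd64.whl`); the function name `pick_wheel` and the file list in the usage example are hypothetical.

```python
import sys


def pick_wheel(filenames, version=sys.version_info):
    """Return the first wheel whose Python tag matches `version`, else None.

    Wheel names end in <python tag>-<abi tag>-<platform tag>.whl, so the
    Python tag (e.g. "cp314") is the third field from the end.
    """
    tag = f"cp{version[0]}{version[1]}"
    for name in filenames:
        if not name.endswith(".whl"):
            continue
        parts = name[: -len(".whl")].split("-")
        if len(parts) >= 5 and parts[-3] == tag:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical contents of the unpacked pyosmium-win64-dist.zip.
    available = [
        "osmium-4.2.0-cp313-cp313-win_amd64.whl",
        "osmium-4.2.0-cp314-cp314-win_amd64.whl",
    ]
    print(pick_wheel(available))
```

The selected file name can then be passed to pip install as described above.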

lonvia avatar Dec 23 '25 08:12 lonvia

It seems to be fixed; I tested with the simple script from the first comment. Thank you! I'm now checking with other projects.

gabortim avatar Dec 23 '25 09:12 gabortim

I've also tested on Windows 11 with Python 3.13, and can confirm that the wheel you linked to above has fixed the problem I was having with a simple script using a ForwardReferenceWriter hanging after completion. Thanks!

bridgecommand avatar Dec 23 '25 19:12 bridgecommand

I can also confirm that a first test on Windows 11 with Python 3.9.13 is working now. Thanks!

Ebe66 avatar Dec 24 '25 00:12 Ebe66

Alright, I shall close the issue then. Thanks for testing. A new release with this will follow early next year.

lonvia avatar Dec 24 '25 09:12 lonvia