[AIX] RuntimeError: structure size mismatch when process_iter is used

Open bartekmp opened this issue 6 years ago • 12 comments

AIX

  • { AIX 7.2 }
  • { psutil 5.6.2 }

Bug description

From time to time we encounter a weird issue when using psutil on our AIX setup. It is not yet clear how to reproduce it. Note that an AttributeError is raised in _common.py first.

Traceback (most recent call last):
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/_common.py", line 342, in wrapper
    ret = self._cache[fun]
AttributeError: _cache

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/tests/test_common/platform_utils.py", line 46, in <module>
    running_processes = [process.info for process in psutil.process_iter(attrs=["pid", "name"])]
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/__init__.py", line 1570, in process_iter
    if proc.is_running():
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/__init__.py", line 694, in is_running
    return self == Process(self.pid)
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/__init__.py", line 446, in __init__
    self._init(pid)
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/__init__.py", line 473, in _init
    self.create_time()
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/__init__.py", line 823, in create_time
    self._create_time = self._proc.create_time()
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/_psaix.py", line 329, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/_psaix.py", line 422, in create_time
    return self._proc_basic_info()[proc_info_map['create_time']]
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/_psaix.py", line 329, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/_common.py", line 345, in wrapper
    return fun(self)
  File "/home/user/tests/venv/lib/python3.6/site-packages/psutil/_psaix.py", line 378, in _proc_basic_info
    return cext.proc_basic_info(self.pid, self._procfs_path)
RuntimeError: structure size mismatch

For now I use this snippet to reproduce the problem, because I have no idea what precondition triggers it:

import psutil

for i in range(1000000000):
    try:
        # Enumerate all processes; on AIX this occasionally raises RuntimeError.
        running_processes = [process.info for process in psutil.process_iter(attrs=["pid", "name"])]
    except RuntimeError as exc:
        print(f"Error caught: {exc}")
        break

bartekmp avatar Aug 30 '19 08:08 bartekmp

I am unable to reproduce this on my setup. Can you please share a few more details?

  • Exact AIX version (including TL and SP - oslevel -s)
  • Python version (python -V)
  • Python build architecture (python -c "import platform; print(platform.architecture())")

wiggin15 avatar Sep 18 '19 08:09 wiggin15

Here it is.

AIX version:

$ oslevel -s
7200-03-02-1845

Python version:

$ python -V
Python 3.6.6

Python build:

$ python -c "import platform; print(platform.architecture())"
('64bit', 'COFF')

bartekmp avatar Sep 18 '19 09:09 bartekmp

I'm afraid I still can't reproduce this, with the same OS level and build architecture of Python. One way to start debugging this would be to edit the code, so the message will be more helpful, and then rebuild psutil... The relevant code is in _psutil_aix.c, line 83:

-        PyErr_SetString(PyExc_RuntimeError, "structure size mismatch");
+        PyErr_Format(PyExc_RuntimeError, "structure size mismatch %s %zu %zu", path, size, nbytes);

This way we can find out which file is causing trouble and how.
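
The idea behind the patch, reporting the path together with the expected and actual byte counts, can be sketched in Python. This is only an illustration, assuming the C helper reads a fixed-size structure from a procfs file and compares the number of bytes read with the structure size; `read_exact` and its arguments are hypothetical names, not part of psutil.

```python
import os


def read_exact(path, size):
    """Read exactly `size` bytes from `path`, or raise a descriptive error.

    Hypothetical helper: it mimics a fixed-size structure read and puts the
    path, the expected size and the number of bytes read into the message.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, size)
    finally:
        os.close(fd)
    if len(data) != size:
        raise RuntimeError(
            "structure size mismatch: path=%s expected=%d read=%d"
            % (path, size, len(data))
        )
    return data
```

With a message like this, a short read (fewer bytes than expected) can be told apart from other failures, and the offending file is named.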

wiggin15 avatar Oct 02 '19 17:10 wiggin15

Sure, I get it, it's very difficult to reproduce it, even in our setup. I don't know if my team will be eager to recompile psutil with your change, so I guess you can put this ticket "on hold" for now. If I collect some more interesting data, I will get back here and let you know.

bartekmp avatar Oct 03 '19 07:10 bartekmp

Any news on this issue?

pauloamed avatar Aug 25 '20 11:08 pauloamed

This issue has been automatically closed because there has been no response for more information from the original author. Please reach out if you have or find the answers requested so that this can be investigated further.

no-response[bot] avatar Jan 02 '21 20:01 no-response[bot]

@bartekmp did you manage to fix this? We also saw it happen.

albertvaka avatar Apr 22 '21 14:04 albertvaka

I am also seeing this issue on AIX

KyleTheScientist avatar Sep 08 '22 17:09 KyleTheScientist

I can reproduce the problem within a few minutes on AIX 7.3. Test script:

#!/usr/bin/python3
import psutil
while True:
  processes = list(psutil.process_iter())

When the script is executed, after a while it terminates with one of the following errors:

  • OSError: [Errno 16] Device busy: '/proc/SOMEPID/status'
  • RuntimeError: structure size mismatch
  • OSError: [Errno 22] Invalid argument: '/proc/SOMEPID/status'

  • AIX 7.3, TL3, SP0
  • Python 3.9.20, GCC 10.3.0
  • psutil version 7.0.0
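
Two of these errors name `/proc/SOMEPID/status`. If the failures turn out to be transient (an assumption; nothing in this thread establishes the cause), a possible stopgap on the caller's side is to retry the whole enumeration. A minimal sketch, where `retry_call` is a hypothetical helper and not part of psutil:

```python
import time


def retry_call(func, exceptions, attempts=3, delay=0.1):
    """Call func(); on one of `exceptions`, retry up to `attempts` times in total."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay)
```

With psutil this might be used as `retry_call(lambda: list(psutil.process_iter()), (OSError, RuntimeError))`. It only hides the symptom and does not fix the underlying problem.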

jose1711 avatar Sep 23 '25 13:09 jose1711

while :; do time ./check.py; done

The amount of time until the issue is reproduced:

  • 42 seconds
  • 43 seconds
  • 3 seconds
  • sub-second (repeatedly)
  • 22 seconds
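
The same measurement can be taken from inside Python, which also records which exception ended each run. A rough sketch with a hypothetical helper, `time_until_failure`; the callable passed in would be something like `lambda: list(psutil.process_iter())`:

```python
import time


def time_until_failure(func, exceptions, max_calls=None):
    """Call func() repeatedly until it raises one of `exceptions`.

    Returns (elapsed_seconds, number_of_calls, exception), or
    (elapsed_seconds, number_of_calls, None) if max_calls is reached first.
    """
    start = time.monotonic()
    calls = 0
    while max_calls is None or calls < max_calls:
        calls += 1
        try:
            func()
        except exceptions as exc:
            return time.monotonic() - start, calls, exc
    return time.monotonic() - start, calls, None
```

Collecting the exception type alongside the elapsed time would show whether the three errors occur with similar frequency.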

jose1711 avatar Sep 23 '25 13:09 jose1711

Unfortunately there's basically nobody who has an AIX box to test against, so it's unlikely that this 6-year-old issue will get any traction. If somebody is willing to grant me SSH access to an AIX box I could give it a try, and while I'm at it also polish the AIX code in general, which is something I always wanted to do, but then again: no AIX box. :)

giampaolo avatar Sep 23 '25 13:09 giampaolo

@giampaolo I would love to help, but sadly the infrastructure is not owned by me, and providing access to a third party would raise so many security alerts that it would amount to writing a letter of resignation. I can provide additional information, output from the code compiled with debug flags, etc., but unfortunately direct access is out of the question.

jose1711 avatar Sep 23 '25 16:09 jose1711