python-blosc2

py-cpuinfo issues on macOS

Open edbarnard opened this issue 1 year ago • 3 comments

Describe the bug

Blosc2 uses the py-cpuinfo library on module import. py-cpuinfo has known issues on macOS (https://github.com/workhorsy/py-cpuinfo/issues/216, https://github.com/workhorsy/py-cpuinfo/issues/218) that can cause blosc2 to fail or hang during import.

py-cpuinfo is unlikely to be updated (see https://github.com/workhorsy/py-cpuinfo/issues/213, last commit Nov 2022) and may cause issues for users going forward. Is py-cpuinfo a necessary dependency?

To Reproduce

(tiled-client) esbarnard@esbstudio ~ % pip install blosc2
Collecting blosc2
  Using cached blosc2-3.0.0-cp311-cp311-macosx_10_9_x86_64.whl.metadata (13 kB)
Requirement already satisfied: numpy>=1.25.0 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (2.0.2)
Requirement already satisfied: ndindex in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (1.9.2)
Requirement already satisfied: msgpack in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (1.1.0)
Requirement already satisfied: numexpr in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (2.10.2)
Requirement already satisfied: py-cpuinfo in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (9.0.0)
Requirement already satisfied: httpx in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (0.28.1)
Requirement already satisfied: anyio in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (4.7.0)
Requirement already satisfied: certifi in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (2024.8.30)
Requirement already satisfied: httpcore==1.* in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (1.0.7)
Requirement already satisfied: idna in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (3.10)
Requirement already satisfied: h11<0.15,>=0.13 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpcore==1.*->httpx->blosc2) (0.14.0)
Requirement already satisfied: sniffio>=1.1 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from anyio->httpx->blosc2) (1.3.1)
Requirement already satisfied: typing_extensions>=4.5 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from anyio->httpx->blosc2) (4.12.2)
Using cached blosc2-3.0.0-cp311-cp311-macosx_10_9_x86_64.whl (4.0 MB)
Installing collected packages: blosc2
Successfully installed blosc2-3.0.0
(tiled-client) esbarnard@esbstudio ~ % python
Python 3.11.11 (main, Dec 11 2024, 10:28:39) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import blosc2
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/blosc2/__init__.py", line 191, in <module>
    cpu_info = get_cpu_info()
               ^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/blosc2/core.py", line 1192, in get_cpu_info
    cpu_info_dict = _get_cpu_info()
                    ^^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/blosc2/core.py", line 1156, in _get_cpu_info
    cpu_info = cpuinfo.get_cpu_info()
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/cpuinfo/cpuinfo.py", line 2759, in get_cpu_info
    output = get_cpu_info_json()
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/cpuinfo/cpuinfo.py", line 2742, in get_cpu_info_json
    output = p1.communicate()[0]
             ^^^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/subprocess.py", line 1209, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/subprocess.py", line 2115, in _communicate
    ready = selector.select(timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
>>> 

Expected behavior

import blosc2 should return promptly. Instead, it hangs waiting on a subprocess call made by the cpuinfo library.
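To check whether a given machine is affected without blocking an interactive session, the call can be run in a child interpreter with a timeout. This is only a minimal sketch: the helper name `probe` and the 5-second default are assumptions for illustration, not part of blosc2 or py-cpuinfo.

```python
import subprocess
import sys


def probe(code, timeout=5.0):
    """Run `code` in a child interpreter and report 'ok', 'error', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            # Discard output instead of piping it, so a lingering grandchild
            # process cannot keep a pipe open after the child is killed.
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "error"
```

Calling `probe("import cpuinfo; cpuinfo.get_cpu_info()")` would then be expected to return `"timeout"` on a machine that shows the hang, and `"error"` if py-cpuinfo is not installed.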

Desktop (please complete the following information):

  • Blosc2 3.0.0
  • py-cpuinfo 9.0.0

tested computer/os examples:

Computer               macOS          Python                    Works?
M3 MacBook Air         macOS 14.4     Native ARM Python 3.12    WORKS
M3 MacBook Air         macOS 15.2     Native ARM Python 3.12    WORKS
M1 Ultra Mac Studio    macOS 15.1.1   Rosetta Python 3.12       FAILS
M1 iMac                macOS 14.6.1   Rosetta Python 3.12       WORKS
Intel i9 MacBook Pro   macOS 15.0     Native x86 Python 3.12    WORKS

edbarnard avatar Jan 10 '25 22:01 edbarnard

That's unfortunate. Well, python-blosc2 essentially uses cpu_info = cpuinfo.get_cpu_info() from py-cpuinfo for two things:

  • Get the number of cores in the system (cpu_info['count'])
  • Get the cache sizes (cpu_info["l1_data_cache_size"] and other levels).

We already have code in place for the cache sizes, but only for Linux and Mac. To get rid of the py-cpuinfo package, we would need to complete the cache size detection for Windows and provide the number of cores on the three major platforms. Would you like to contribute a PR?
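For illustration only, cache sizes on Linux and macOS can be read without py-cpuinfo roughly as follows. This is a sketch and not the code python-blosc2 actually ships: the function names and the returned labels are assumptions, while the sysfs directory and the `hw.*cachesize` sysctl names are the standard ones on those platforms.

```python
import re
import subprocess
from pathlib import Path

_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert strings such as '32K' or '8192' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?)B?\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def linux_cache_sizes(cache_dir="/sys/devices/system/cpu/cpu0/cache"):
    """Read cache sizes for CPU 0 from sysfs; returns {} if nothing is found."""
    sizes = {}
    for index in sorted(Path(cache_dir).glob("index*")):
        try:
            level = (index / "level").read_text().strip()
            kind = (index / "type").read_text().strip().lower()
            sizes[f"L{level} {kind}"] = parse_size((index / "size").read_text())
        except (OSError, ValueError):
            continue  # skip entries that are missing or unparseable
    return sizes


def macos_cache_sizes():
    """Query cache sizes through sysctl; keys that are absent are skipped."""
    names = {
        "L1 data": "hw.l1dcachesize",
        "L1 instruction": "hw.l1icachesize",
        "L2 unified": "hw.l2cachesize",
        "L3 unified": "hw.l3cachesize",
    }
    sizes = {}
    for label, name in names.items():
        try:
            out = subprocess.run(
                ["sysctl", "-n", name],
                capture_output=True, text=True, timeout=2, check=True,
            ).stdout
            sizes[label] = parse_size(out)
        except (OSError, subprocess.SubprocessError, ValueError):
            continue
    return sizes
```

The short timeout on the `sysctl` call is deliberate, so that a misbehaving subprocess cannot stall the import the way the py-cpuinfo call does.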

FrancescAlted avatar Jan 13 '25 06:01 FrancescAlted

I will certainly look into it! From my investigations, getting the number of cores is relatively straightforward with the standard library:

import multiprocessing
multiprocessing.cpu_count()
# or
import os
os.cpu_count()

So it looks like detecting the L1, L2, and L3 cache sizes on Windows is the main hurdle.
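One caveat with the standard-library calls is that os.cpu_count() can return None when the count cannot be determined, so a replacement probably wants a fallback. A minimal sketch, where the helper name `core_count` is hypothetical:

```python
import os


def core_count():
    """Best-effort number of logical CPUs usable by this process (at least 1)."""
    # os.sched_getaffinity only exists on some platforms (e.g. Linux); it
    # respects CPU affinity masks, which os.cpu_count() does not.
    if hasattr(os, "sched_getaffinity"):
        try:
            return max(1, len(os.sched_getaffinity(0)))
        except OSError:
            pass
    # os.cpu_count() may return None, hence the fallback to 1.
    return os.cpu_count() or 1
```

Note that honoring the affinity mask is a design choice: it may report fewer CPUs than cpu_info['count'] on a restricted process, so dropping that branch is reasonable if the total system count is what is wanted.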

edbarnard avatar Jan 13 '25 18:01 edbarnard

I don't have Windows to test, but calling GetLogicalProcessorInformation through ctypes should help get the cache sizes on Windows:

import ctypes

GetLogicalProcessorInformation = ctypes.windll.kernel32.GetLogicalProcessorInformation
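
A fuller sketch of that idea follows, untested on Windows for the same reason. The structure layouts mirror the documented SYSTEM_LOGICAL_PROCESSOR_INFORMATION and CACHE_DESCRIPTOR definitions; the Python class names, the `windows_cache_sizes` helper, and the shape of the returned dict are assumptions for illustration.

```python
import ctypes
import sys

RELATION_CACHE = 2  # RelationCache in LOGICAL_PROCESSOR_RELATIONSHIP
CACHE_TYPES = {0: "unified", 1: "instruction", 2: "data", 3: "trace"}


class CacheDescriptor(ctypes.Structure):
    _fields_ = [
        ("Level", ctypes.c_ubyte),
        ("Associativity", ctypes.c_ubyte),
        ("LineSize", ctypes.c_uint16),
        ("Size", ctypes.c_uint32),   # cache size in bytes
        ("Type", ctypes.c_int),      # PROCESSOR_CACHE_TYPE enum
    ]


class _Detail(ctypes.Union):
    _fields_ = [
        ("Flags", ctypes.c_ubyte),
        ("NodeNumber", ctypes.c_uint32),
        ("Cache", CacheDescriptor),
        ("Reserved", ctypes.c_uint64 * 2),
    ]


class ProcessorInfo(ctypes.Structure):
    _fields_ = [
        ("ProcessorMask", ctypes.c_size_t),  # ULONG_PTR
        ("Relationship", ctypes.c_int),
        ("Detail", _Detail),
    ]


def windows_cache_sizes():
    """Return {(level, type): size_in_bytes}; empty dict when not on Windows."""
    if sys.platform != "win32":
        return {}
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    query = kernel32.GetLogicalProcessorInformation
    query.argtypes = [ctypes.c_void_p, ctypes.POINTER(ctypes.c_uint32)]
    query.restype = ctypes.c_int  # BOOL
    needed = ctypes.c_uint32(0)
    # The first call is expected to fail and report the required buffer size.
    query(None, ctypes.byref(needed))
    count = needed.value // ctypes.sizeof(ProcessorInfo)
    if count == 0:
        return {}
    buffer = (ProcessorInfo * count)()
    if not query(buffer, ctypes.byref(needed)):
        raise ctypes.WinError(ctypes.get_last_error())
    sizes = {}
    for entry in buffer:
        if entry.Relationship == RELATION_CACHE:
            cache = entry.Detail.Cache
            sizes[(cache.Level, CACHE_TYPES.get(cache.Type, "unknown"))] = cache.Size
    return sizes
```

On other platforms the function simply returns an empty dict, so it can be imported unconditionally.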

DimitriPapadopoulos avatar Jul 12 '25 11:07 DimitriPapadopoulos