py-cpuinfo issues on macOS
Describe the bug
Blosc2 uses the py-cpuinfo library on module import. There are known issues with py-cpuinfo on macOS (https://github.com/workhorsy/py-cpuinfo/issues/216, https://github.com/workhorsy/py-cpuinfo/issues/218) that can cause blosc2 to fail or hang during import.
py-cpuinfo is unlikely to be updated (see https://github.com/workhorsy/py-cpuinfo/issues/213, last commit Nov 2022) and may cause issues for users going forward. Is py-cpuinfo a necessary dependency?
To Reproduce
(tiled-client) esbarnard@esbstudio ~ % pip install blosc2
Collecting blosc2
Using cached blosc2-3.0.0-cp311-cp311-macosx_10_9_x86_64.whl.metadata (13 kB)
Requirement already satisfied: numpy>=1.25.0 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (2.0.2)
Requirement already satisfied: ndindex in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (1.9.2)
Requirement already satisfied: msgpack in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (1.1.0)
Requirement already satisfied: numexpr in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (2.10.2)
Requirement already satisfied: py-cpuinfo in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (9.0.0)
Requirement already satisfied: httpx in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from blosc2) (0.28.1)
Requirement already satisfied: anyio in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (4.7.0)
Requirement already satisfied: certifi in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (2024.8.30)
Requirement already satisfied: httpcore==1.* in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (1.0.7)
Requirement already satisfied: idna in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpx->blosc2) (3.10)
Requirement already satisfied: h11<0.15,>=0.13 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from httpcore==1.*->httpx->blosc2) (0.14.0)
Requirement already satisfied: sniffio>=1.1 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from anyio->httpx->blosc2) (1.3.1)
Requirement already satisfied: typing_extensions>=4.5 in ./opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages (from anyio->httpx->blosc2) (4.12.2)
Using cached blosc2-3.0.0-cp311-cp311-macosx_10_9_x86_64.whl (4.0 MB)
Installing collected packages: blosc2
Successfully installed blosc2-3.0.0
(tiled-client) esbarnard@esbstudio ~ % python
Python 3.11.11 (main, Dec 11 2024, 10:28:39) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import blosc2
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/blosc2/__init__.py", line 191, in <module>
cpu_info = get_cpu_info()
^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/blosc2/core.py", line 1192, in get_cpu_info
cpu_info_dict = _get_cpu_info()
^^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/blosc2/core.py", line 1156, in _get_cpu_info
cpu_info = cpuinfo.get_cpu_info()
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/cpuinfo/cpuinfo.py", line 2759, in get_cpu_info
output = get_cpu_info_json()
^^^^^^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/site-packages/cpuinfo/cpuinfo.py", line 2742, in get_cpu_info_json
output = p1.communicate()[0]
^^^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/subprocess.py", line 1209, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/subprocess.py", line 2115, in _communicate
ready = selector.select(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/esbarnard/opt/anaconda3/envs/tiled-client/lib/python3.11/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
>>>
Expected behavior
import blosc2 should return, but in this case it hangs while waiting on a subprocess call inside the cpuinfo library.
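To check whether a given machine is affected without blocking an interactive session, the import can be attempted in a child interpreter with a timeout. This is a minimal sketch: the helper name `probe_import` and the 30-second default are illustrative choices, not part of blosc2 or py-cpuinfo.

```python
import subprocess
import sys


def probe_import(module, timeout=30.0):
    """Try `import <module>` in a fresh interpreter.

    Returns "ok" if the import finished, "error" if the child exited with a
    non-zero status, and "timeout" if it did not finish in time.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


if __name__ == "__main__":
    print(probe_import("blosc2"))
```

On an affected setup, such as the Rosetta configuration in the table below, this would be expected to report "timeout" rather than hang the shell.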
Desktop (please complete the following information):
- Blosc2 3.0.0
- py-cpuinfo 9.0.0
Tested computer/OS combinations:
| Computer | macOS | python | works? |
|---|---|---|---|
| M3 Macbook Air | macOS 14.4 | Native ARM Python 3.12 | WORKS |
| M3 Macbook Air | macOS 15.2 | Native ARM Python 3.12 | WORKS |
| M1 Ultra Mac Studio | macOS 15.1.1 | Rosetta Python 3.12 | FAILS |
| M1 iMac | macOS 14.6.1 | Rosetta Python 3.12 | WORKS |
| Intel i9 MacBook Pro | macOS 15.0 | Native x86 Python 3.12 | WORKS |
Additional context
That's unfortunate. python-blosc2 essentially uses the result of cpu_info = cpuinfo.get_cpu_info() from py-cpuinfo for two things:
- Get the number of cores in the system (cpu_info['count'])
- Get the cache sizes (cpu_info["l1_data_cache_size"] and other levels).
We already have code in place for the cache sizes, but only for Linux and Mac. To get rid of the py-cpuinfo package, we would need to complete the cache size guessing for Windows and provide the number of cores on the three major platforms. Would you like to contribute a PR?
I will certainly look into it! From my investigation, getting the number of cores is relatively straightforward with the standard library:
import multiprocessing
multiprocessing.cpu_count()
# or
import os
os.cpu_count()
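Two caveats with these calls: `os.cpu_count()` may return `None` when the count cannot be determined (`multiprocessing.cpu_count()` raises `NotImplementedError` in that case), and both report every logical CPU in the machine rather than the CPUs the current process is allowed to use. A small wrapper, with the hypothetical name `available_cores`, could cover both cases:

```python
import os


def available_cores():
    """Return a usable core count that is always at least 1."""
    # os.sched_getaffinity exists only on some platforms (e.g. Linux); it
    # respects CPU affinity masks set by taskset or the scheduler.
    if hasattr(os, "sched_getaffinity"):
        try:
            return max(1, len(os.sched_getaffinity(0)))
        except OSError:
            pass
    # os.cpu_count() may return None, so fall back to 1.
    return os.cpu_count() or 1
```

Whether the affinity-aware count or the total count is the right replacement for `cpu_info['count']` depends on how blosc2 uses the value, which this sketch does not settle.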
So it looks like getting the L1, L2, and L3 cache sizes on Windows is the main hurdle.
I don't have Windows to test, but calling GetLogicalProcessorInformation through ctypes should help get the cache sizes on Windows:
import ctypes
GetLogicalProcessorInformation = ctypes.windll.kernel32.GetLogicalProcessorInformation
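A fuller sketch of that approach follows. It has not been run on Windows, so treat it as a starting point rather than a verified implementation. The structure layouts mirror the `SYSTEM_LOGICAL_PROCESSOR_INFORMATION` and `CACHE_DESCRIPTOR` definitions in the Win32 documentation. The Python class names, the function name `windows_cache_sizes`, and the `l2_cache_size` / `l3_cache_size` keys are assumptions chosen for illustration.

```python
import ctypes
import sys

RELATION_CACHE = 2     # LOGICAL_PROCESSOR_RELATIONSHIP.RelationCache
CACHE_INSTRUCTION = 1  # PROCESSOR_CACHE_TYPE.CacheInstruction


class CacheDescriptor(ctypes.Structure):
    _fields_ = [
        ("Level", ctypes.c_ubyte),
        ("Associativity", ctypes.c_ubyte),
        ("LineSize", ctypes.c_ushort),
        ("Size", ctypes.c_uint32),
        ("Type", ctypes.c_int),
    ]


class _InfoUnion(ctypes.Union):
    _fields_ = [
        ("ProcessorCoreFlags", ctypes.c_ubyte),
        ("NumaNodeNumber", ctypes.c_uint32),
        ("Cache", CacheDescriptor),
        ("Reserved", ctypes.c_ulonglong * 2),
    ]


class SystemLogicalProcessorInformation(ctypes.Structure):
    _fields_ = [
        ("ProcessorMask", ctypes.c_size_t),  # ULONG_PTR
        ("Relationship", ctypes.c_int),
        ("u", _InfoUnion),
    ]


def windows_cache_sizes():
    """Return data/unified cache sizes in bytes; empty dict off Windows."""
    sizes = {}
    if sys.platform != "win32":
        return sizes
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    length = ctypes.c_uint32(0)
    # First call with a NULL buffer only reports the required byte length.
    kernel32.GetLogicalProcessorInformation(None, ctypes.byref(length))
    count = length.value // ctypes.sizeof(SystemLogicalProcessorInformation)
    if count == 0:
        return sizes
    buffer = (SystemLogicalProcessorInformation * count)()
    if not kernel32.GetLogicalProcessorInformation(buffer, ctypes.byref(length)):
        raise ctypes.WinError(ctypes.get_last_error())
    for entry in buffer:
        if entry.Relationship != RELATION_CACHE:
            continue
        cache = entry.u.Cache
        if cache.Type == CACHE_INSTRUCTION:
            continue  # keep only data and unified caches
        key = ("l1_data_cache_size" if cache.Level == 1
               else f"l{cache.Level}_cache_size")
        sizes.setdefault(key, cache.Size)
    return sizes
```

On non-Windows platforms the function returns an empty dict, so it can be imported anywhere and combined with platform-specific helpers behind a single dispatcher.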