comtypes
Delayed loading of npsupport
I have been implementing some COM servers using comtypes, which are generally working very well. The design of the system is such that individual processing operations are provided by each COM server and are launched from a client GUI. In order to keep the GUI thread responsive, the servers each have a second "worker" thread to do their processing, the main thread only handles the COM interactions.
One of the issues that I have been confronting is start-up times for the server application, as this necessarily blocks the client software. This seems to be mainly caused by import times for the various libraries required, and so I have arranged things so that most libraries are imported by the worker thread rather than the main thread - however the most significant remaining library is numpy. I use this for processing, but it is not necessary to have the comtypes numpy interop behaviour. However, because numpy is installed, comtypes insists on importing it at startup, doubling the overall time to launch for a COM server. Do we think it would be possible to modify the way that npsupport is imported, either allowing it to be turned off completely, or even better to allow it to be loaded independently, i.e. import comtypes without it, starting up a server very fast, then import comtypes.npsupport (or call a function like comtypes.enable_np) to switch over to supporting it afterwards?
@bennyrowland
I have also thought that the situation in your issue was problematic, though it never caused me a real problem because I don't use the server feature in my project.
Recently I have been working on restoring the test code (see #267, #298, and other issues/PRs). In the process, I noticed the following problems:
- numpy is referenced even though it is not a dependency. comtypes will continue to be developed as "a pure Python library (without any dependencies)" in the future as well (https://github.com/enthought/comtypes/issues/216#issuecomment-664517594 by @vasily-v-ryabov).
- The behavior of safearray and others changes depending on whether numpy exists in the environment or not.
As shown in the numpy-interop section of the documentation, the safearray_as_ndarray context manager is used to cast the return value to an ndarray. Similarly, I considered forcing the use of safearray_as_ndarray when an ndarray is passed as an argument, but this is not a good idea for backward compatibility since the behavior would differ from before.
Currently, regardless of whether an ndarray is used or not, npsupport is called first to check whether numpy exists in the environment. Therefore, I think a better approach is that npsupport should only be imported when an ndarray is actually used as an argument.
What are your thoughts?
@junkmd it is tricky to import npsupport only when an ndarray is used as an argument, because how do you test whether something is an ndarray unless you have numpy.ndarray imported already (which is exactly what npsupport already does)?
At the moment, the public functions of npsupport all contain a check for HAVE_NUMPY that can short-circuit the function if numpy is not available. HAVE_NUMPY gets set at import time if import numpy succeeds, and then, also at import time, if HAVE_NUMPY is set, some other housekeeping is done to set up the interop. My proposal is simply that npsupport should not import numpy at its own import time, but provide a function "enable_numpy_interop()" (or similar) which will:
- Try import numpy
- Set HAVE_NUMPY
- Run the other housekeeping tasks
This would make importing numpy opt-in: if you want comtypes to use numpy, then you would have to call the "enable_numpy_interop()" function. That function could also be inserted into things like the safearray_as_ndarray context manager (if "enable_numpy_interop()" checks HAVE_NUMPY at the top, then it will be a minimal burden to run it repeatedly), since such a use clearly requires numpy support to be enabled.
This is not ideal for backwards compatibility, as scripts that were previously using numpy automatically would now have to enable support explicitly. However, if we were to modify HAVE_NUMPY to instead be a function which checks for numpy in sys.modules and, if found, checks whether the setup tasks for interop have been run (running them if necessary), then we should get to a situation where, if the user has imported numpy themselves, everything will work as before - obviously they will have had to do that to create an ndarray to pass as an argument, so it should become transparent.
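The lazy-check variant could look something like this sketch; _setup_interop and have_numpy are hypothetical names for illustration:

```python
import sys

_interop_ready = False

def _setup_interop():
    # Placeholder for the one-time interop housekeeping (hypothetical).
    global _interop_ready
    _interop_ready = True

def have_numpy():
    # numpy counts as available only if the caller has already imported
    # it themselves; the interop setup runs lazily the first time numpy
    # is seen in sys.modules.
    if "numpy" not in sys.modules:
        return False
    if not _interop_ready:
        _setup_interop()
    return True
```

Because the check never imports numpy itself, a COM server that avoids numpy keeps its fast startup, while a script that has already done import numpy behaves exactly as before.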
Thoughts? It would also be nice to get some perspective from @vasily-v-ryabov or @cfarrow, so we know if this change will get accepted if I implement it...
Perhaps a correction: safearrays should not behave differently if you have numpy installed. This should only happen if you use the safearray_as_ndarray context manager, which is explicit.
I'm in favor of replacing the import-time checks in npsupport with an enable_numpy_interop function. This will make the use of numpy explicit, and the package will remain "pure python" by default.
Thanks, @cfarrow, but just to be clear, does that mean you would support a breaking change where numpy support is only included with an explicit enable_numpy_interop() call? What would then happen if an ndarray gets passed without making that call (as will surely be the case with many existing scripts)?
I think the cleanest way to handle it is to allow that code to fail with a helpful error message if enable_numpy_interop has not been called. This way, the behavior of the package will be determined solely by the code instead of the Python environment.
This will break backwards compatibility, and therefore would warrant a major version bump.
Ok, I will have a look into making this work. It might still be possible to get comtypes to accept an ndarray as an input argument and simply send it as a SAFEARRAY by converting element by element (I think this happens for unsupported dtypes anyway). This would obviously affect performance but not break scripts (I think); we could then simply emit a warning about the changed behaviour rather than raise an exception?
@bennyrowland @cfarrow
Thank you guys for replies and opinions.
I completely agree that it is better to do things explicitly.
However, backward compatibility is the only problem I have; I didn't want developers to experience the sadness of not being able to use the API as they used to.
If calling a function somewhere in the module makes things behave the same as before, that's fine.
Projects that use comtypes may need some changes to adapt, but I think the changes above will not require many man-hours to deal with.
If there is a "pure python" default that works, then a warning would be appropriate.
For the future, another option is to not assume a ndarray but instead check for a buffer. This would allow comtypes to handle a variety of array types efficiently. (I'm not quite sure how to do that, but it would be fun to figure out.)
@bennyrowland @cfarrow
I agree with the major version update.
Looking at the history, there have only been revision updates for the last 8 years!
I would be willing to help repair the tests that use numpy if this change allows us to explicitly decide whether to use numpy or not.
@vasily-v-ryabov How does this proposed change affect your release plan?
@bennyrowland Do you have any opinion on when a version with these changes should be released?
I propose to release 1.1.13, which fixes bugs in 1.1.12, before this change. After that, we can release a major update that applies this change.
The purpose is to prevent developers from having to use 1.1.12 when they need a version prior to the numpy-related major version update.
Hi all, I started working on this project this morning and began by running the tests on the latest master branch using python -m unittest discover -v -s ./comtypes/test -t comtypes\test (cribbed from appveyor.yml - is there a better command?), and there are already 4 tests that fail, all of which seem related to numpy interop:
unittest output
======================================================================
ERROR: test_UDT_ndarray (test_safearray.SafeArrayTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "H:\Python\comtypes\comtypes\safearray.py", line 60, in _midlSAFEARRAY
return POINTER(_safearray_type_cache[itemtype])
KeyError: <class 'comtypes.gen._5A3E1D1D_947A_44AC_9B03_5C37D5F5FFFC_0_1_0.MYCOLOR'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "H:\Python\comtypes\comtypes\safearray.py", line 78, in _make_safearray_type
vartype = _ctype_to_vartype[itemtype]
KeyError: <class 'comtypes.gen._5A3E1D1D_947A_44AC_9B03_5C37D5F5FFFC_0_1_0.MYCOLOR'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "H:\Python\comtypes\comtypes\test\test_safearray.py", line 427, in test_UDT_ndarray
t = _midlSAFEARRAY(MYCOLOR)
File "H:\Python\comtypes\comtypes\safearray.py", line 62, in _midlSAFEARRAY
sa_type = _make_safearray_type(itemtype)
File "H:\Python\comtypes\comtypes\safearray.py", line 88, in _make_safearray_type
extra = GetRecordInfoFromGuids(*guids)
File "H:\Python\comtypes\comtypes\typeinfo.py", line 469, in GetRecordInfoFromGuids
_oleaut32.GetRecordInfoFromGuids(byref(GUID(rGuidTypeLib)),
File "_ctypes/callproc.c", line 997, in GetResult
OSError: [WinError -2147319779] Library not registered
======================================================================
ERROR: test_VT_UNKNOWN_multi_ndarray (test_safearray.SafeArrayTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "H:\test_venv\lib\site-packages\numpy\core\_internal.py", line 652, in _dtype_from_pep3118
dtype, align = __dtype_from_pep3118(stream, is_subdtype=False)
File "H:\test_venv\lib\site-packages\numpy\core\_internal.py", line 729, in __dtype_from_pep3118
raise ValueError("Unknown PEP 3118 data type specifier %r" % stream.s)
ValueError: Unknown PEP 3118 data type specifier 'P'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "H:\Python\comtypes\comtypes\test\test_safearray.py", line 376, in test_VT_UNKNOWN_multi_ndarray
arr = get_array(sa)
File "H:\Python\comtypes\comtypes\test\test_safearray.py", line 19, in get_array
return sa[0]
File "H:\Python\comtypes\comtypes\safearray.py", line 216, in __getitem__
return self.unpack()
File "H:\Python\comtypes\comtypes\safearray.py", line 251, in unpack
return numpy.asarray(result)
ValueError: '<P' is not a valid PEP 3118 buffer format string
======================================================================
ERROR: test_VT_VARIANT_ndarray (test_safearray.SafeArrayTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "H:\Python\comtypes\comtypes\test\test_safearray.py", line 248, in test_VT_VARIANT_ndarray
sa = t.from_param(inarr)
File "H:\Python\comtypes\comtypes\safearray.py", line 208, in from_param
value = cls.create(value, extra)
File "H:\Python\comtypes\comtypes\safearray.py", line 117, in create
return cls.create_from_ndarray(value, extra)
File "H:\Python\comtypes\comtypes\safearray.py", line 160, in create_from_ndarray
value = _ndarray_to_variant_array(value)
File "H:\Python\comtypes\comtypes\safearray.py", line 387, in _ndarray_to_variant_array
varr.flat = [VARIANT(v) for v in value.flat]
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 110, in dtype_from_ctypes_type
return _from_ctypes_structure(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 65, in _from_ctypes_structure
fields.append((fname, dtype_from_ctypes_type(ftyp)))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 112, in dtype_from_ctypes_type
return _from_ctypes_union(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 90, in _from_ctypes_union
formats.append(dtype_from_ctypes_type(ftyp))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 110, in dtype_from_ctypes_type
return _from_ctypes_structure(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 65, in _from_ctypes_structure
fields.append((fname, dtype_from_ctypes_type(ftyp)))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 112, in dtype_from_ctypes_type
return _from_ctypes_union(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 90, in _from_ctypes_union
formats.append(dtype_from_ctypes_type(ftyp))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 114, in dtype_from_ctypes_type
return _from_ctypes_scalar(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 80, in _from_ctypes_scalar
return np.dtype(t._type_)
TypeError: data type 'v' not understood
======================================================================
ERROR: test_mixed (test_variant.NdArrayTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "H:\Python\comtypes\comtypes\test\test_variant.py", line 275, in test_mixed
v.value = a
File "H:\Python\comtypes\comtypes\automation.py", line 317, in _set_value
obj = _midlSAFEARRAY(VARIANT).create(value)
File "H:\Python\comtypes\comtypes\safearray.py", line 117, in create
return cls.create_from_ndarray(value, extra)
File "H:\Python\comtypes\comtypes\safearray.py", line 160, in create_from_ndarray
value = _ndarray_to_variant_array(value)
File "H:\Python\comtypes\comtypes\safearray.py", line 387, in _ndarray_to_variant_array
varr.flat = [VARIANT(v) for v in value.flat]
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 110, in dtype_from_ctypes_type
return _from_ctypes_structure(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 65, in _from_ctypes_structure
fields.append((fname, dtype_from_ctypes_type(ftyp)))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 112, in dtype_from_ctypes_type
return _from_ctypes_union(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 90, in _from_ctypes_union
formats.append(dtype_from_ctypes_type(ftyp))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 110, in dtype_from_ctypes_type
return _from_ctypes_structure(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 65, in _from_ctypes_structure
fields.append((fname, dtype_from_ctypes_type(ftyp)))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 112, in dtype_from_ctypes_type
return _from_ctypes_union(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 90, in _from_ctypes_union
formats.append(dtype_from_ctypes_type(ftyp))
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 114, in dtype_from_ctypes_type
return _from_ctypes_scalar(t)
File "H:\test_venv\lib\site-packages\numpy\core\_dtype_ctypes.py", line 80, in _from_ctypes_scalar
return np.dtype(t._type_)
TypeError: data type 'v' not understood
----------------------------------------------------------------------
Does anybody know anything about these errors? Is this an environment / numpy version issue on my end (I am running 1.23.1), or are these tests failing for others? Do these issues need fixing before making any other changes to numpy interop?
OSError: [WinError -2147319779] Library not registered
It probably requires registering the TestComServerLib class in the registry.
Yep, I figured something like that would probably be the case - I have some ideas about trying to fake that, but they will have to wait for another time. It is the other errors that I have been looking at. On closer inspection, the problems seem to relate to creating a numpy array of VARIANT dtype in _ndarray_to_variant_array() in safearray.py. The function creates a list of individual variants from the source numpy array, then assigns them to the empty VARIANT array. This prompts a check of the existing dtype of each VARIANT as they go through, but some of the possible types (including VARIANT_BOOL and c_wchar_p) do not have numpy-compatible dtypes, so numpy throws an exception when trying to work out the dtype (because this is a union dtype, it doesn't matter whether the contents of the VARIANT have a known dtype, only that some of the possible dtypes are not valid). I assume it needs to know the dtype so it can work out whether it needs to do some kind of cast to go into the target ndarray. It is not at all clear to me how to solve this problem; perhaps we need an npVARIANT class which only contains numpy-compatible dtypes? Also, presumably these tests did pass at some point - what changed that broke them, and was it in comtypes or numpy (or even ctypes)? Any thoughts welcome.
The numpy code in comtypes got a lot of attention in February. It might be worth checking if one of those changes broke the tests.
FWIW, the "<P" highlighted in test_VT_UNKNOWN_multi_ndarray should never have worked, AFAICT. So, whatever is deciding the typecode is "<P" is probably to blame for that one. I suspect the changing of the ctypeslib._typecodes dictionary order in #239 here.
Well, that was my last contribution, so it could well be at fault :-). I think that the type code "<P" is probably related to the repr being <POINTER(IUNKNOWN)>, but it could be unrelated. I will investigate further.
@cfarrow, so I have dug a little deeper into the "<P" typecode issue, and it is definitely not related to #239 - I rolled back to before that commit and still see the issue. The problem seems to be that ctypes POINTER instances appear to support the PEP 3118 buffer protocol but are actually non-compliant. For example, if we run:
import comtypes
from comtypes.typeinfo import CreateTypeLib
punk = CreateTypeLib("spam").QueryInterface(comtypes.IUnknown)
# punk is now <POINTER(IUnknown)>
mv = memoryview(punk) # this succeeds but probably shouldn't
mv.format # '<P'
import numpy as np
arr = np.asarray(punk) # this errors with the "Unknown PEP 3118 data type specifier 'P'" message
So numpy, when trying to create an array from punk, thinks it is an array of bytes that should be pulled in, and immediately asks it for the format of each data element so it can work out the dtype, then fails because "<P" is not a valid format for PEP 3118. I suspect that the correct format might need to be "O" (this is what a memoryview of an ndarray with object dtype has as its format); I guess any POINTER that is not to a known type (like c_int or c_float) should probably be "O".
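The same 'P' format code can be seen without any COM machinery, since a plain ctypes void pointer also exposes the buffer protocol; this is a small stdlib-only demonstration of what numpy is being handed:

```python
import ctypes

# A bare void pointer exposes the buffer protocol, and its PEP 3118
# format string is built from the 'P' code, the same code that shows up
# as "<P" in the traceback above when numpy rejects the COM pointer.
p = ctypes.c_void_p(0)
mv = memoryview(p)   # succeeds, just like memoryview(punk) above
print(mv.format)     # a format string containing 'P'
print(mv.itemsize)   # the size of a pointer on this platform
```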
So I guess this issue might have to be raised with ctypes?
Guys, I agree that the numpy-related changes could go into a major version update, 1.2.0. That will shift the end of Py2.7 support to 1.3.0 next year.
Ok, a little bit more information: it looks like this came in with numpy 1.15.0, which changed the way the PEP 3118 buffer protocol is used to create arrays; see https://github.com/numpy/numpy/issues/11150. I think fixing this issue is way outside of our scope: creating numpy arrays from safearrays of ints, floats, bools, etc. is all fine, and it is only creating them from c_void_p that causes the problem, which is an unlikely use case in practice. I suggest we put this test on the skip list with an explanation - anyone who needs it fixed is likely to have a better idea than I do of how to do it.
The other issue, regarding VARIANT_BOOL, seems more like a comtypes-level problem. I will open a new issue to discuss solving that.
I have implemented a simple fix for making npsupport opt-in: basically, if you don't call enable_numpy_interop(), then it is the same as if numpy were not in the environment. All tests that passed before still pass this way, and I have added one more test checking that the error message is raised if you try to do things before enabling the interop. There probably need to be more tests included, though; in particular, I have done nothing about the case of passing in ndarrays as parameters without having enabled npsupport, so that will currently fail with an unhelpful error message about failing to create the variant.
I think we can reasonably identify ndarrays by their __array_interface__ attribute, and use that to raise a ValueError("Looks like an ndarray but np interop not enabled") kind of thing, but there is also the possibility of using the buffer protocol in comtypes directly, which might make it just work without even needing numpy at all (at least for standard dtypes). Any thoughts?
I will put together a PR soon so you can comment directly on the code. One thing that I did wonder about is whether we want to look at having multiple test environments (perhaps with tox), so that we can have an env with numpy installed and one without, with tests expecting different results in the different envs. We can simulate that to a certain extent by calling or not calling enable_numpy_interop() in test setup, but there are things that could go wrong with that. I know that the test system has undergone a lot of changes recently; have any decisions about things like that been reached?
Nice diagnosis.
Regarding numpy vs buffers, I think we'd want to support buffers in the long run, but can get there through updating the numpy support.
I think we'll want multiple environments for testing eventually. If the plan is to generalize from numpy to buffers, however, it may not be worth the short-term overhead to do this solely for numpy support.
I think we will still need numpy support for the safearray_as_ndarray direction at the very least, as that has to have numpy available to pass the buffer into; at the moment, the options when receiving a SAFEARRAY are either to convert straight to a numpy ndarray or to go to a list or other element-by-element conversion to Python. I suppose we could introduce a safearray_as_buffer context manager instead, and then the user could pass the buffer to numpy (or array, or some other thing if they want); that might let us remove any dependency on numpy in comtypes. There might be a few issues with certain types like datetimes, though.
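Mechanically, a safearray_as_buffer could follow the same pattern as the existing safearray_as_ndarray: a module-level object used directly as a context manager that flips a thread-local flag. Everything here is a hypothetical sketch, not an existing comtypes API:

```python
import threading

class _SafeArrayAsBuffer:
    # Hypothetical analogue of comtypes' safearray_as_ndarray. While the
    # context is active, SAFEARRAY unpacking code would check `.active`
    # and hand back a memoryview over the locked SAFEARRAY data instead
    # of a Python list.
    def __init__(self):
        self._local = threading.local()

    @property
    def active(self):
        return getattr(self._local, "raw", False)

    def __enter__(self):
        self._local.raw = True

    def __exit__(self, exc_type, exc_value, tb):
        self._local.raw = False

safearray_as_buffer = _SafeArrayAsBuffer()
```

The thread-local flag mirrors how the existing context manager avoids leaking the ndarray mode into other threads of a COM server.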
True. If we want to test for unintended interactions, multiple test environments will be needed.
@bennyrowland @cfarrow @vasily-v-ryabov
Since #337 and #343 have been merged, I will close this issue as resolved.
If the issue remains, please re-open.