
Conversion: lib_cpp[libcpp_utf8_string] and ["s1", "s2", ...]

Open timosachsenberg opened this issue 3 years ago • 3 comments

Hi Uwe, if we wrap a function:

void addTags(libcpp_utf8_string key, libcpp_vector[libcpp_utf8_string] tags) nogil except +
the following works: p.addTags(k, [b"", b"c"])
but p.addTags(k, ["", "c"]) fails:
 File "/home/sachsenb/miniconda3/lib/python3.8/site-packages/nose/case.py", line 197, in runTest
   self.test(*self.arg)
 File "/home/sachsenb/OMS/OpenMS/openms-build/pyOpenMS/tests/unittests/test000.py", line 32, in wrapper
   f(*a, **kw)
 File "/home/sachsenb/OMS/OpenMS/openms-build/pyOpenMS/tests/unittests/test000.py", line 1556, in testCompNovoIdentification
   _testParam(p)
 File "/home/sachsenb/OMS/OpenMS/openms-build/pyOpenMS/tests/unittests/test000.py", line 32, in wrapper
   f(*a, **kw)
 File "/home/sachsenb/OMS/OpenMS/openms-build/pyOpenMS/tests/unittests/test000.py", line 1429, in _testParam
   p.addTags(k, ["", "c"])
 File "pyopenms/pyopenms_3.pyx", line 6188, in pyopenms.pyopenms_3.Param.addTags
 File "stringsource", line 48, in vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string
 File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string
TypeError: expected bytes, str found

So the call fails with an internal Cython error. Any hints on what needs to be done here? Best, the OpenMS guys
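The two "stringsource" frames in the traceback suggest that the list is converted element by element, and that each element must already be bytes. The following pure-Python sketch only emulates that behaviour for illustration; `convert_string_from_py` and `convert_vector_from_py` are hypothetical stand-ins, not Cython's or autowrap's actual code.

```python
def convert_string_from_py(obj):
    """Emulate a bytes-only std::string conversion (simplified assumption)."""
    if not isinstance(obj, bytes):
        raise TypeError(f"expected bytes, {type(obj).__name__} found")
    return obj


def convert_vector_from_py(items):
    """Emulate the vector conversion: every element goes through the string step."""
    return [convert_string_from_py(item) for item in items]


# bytes elements pass through unchanged
print(convert_vector_from_py([b"", b"c"]))

# str elements are rejected at the innermost (string) step
try:
    convert_vector_from_py(["", "c"])
except TypeError as exc:
    print(exc)  # expected bytes, str found
```

Under this reading, the key argument works because libcpp_utf8_string gets special handling, while the elements inside the vector fall through to the default bytes-only conversion.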

timosachsenberg avatar May 01 '21 07:05 timosachsenberg

What is libcpp_utf8_string? Best would be a minimal example to reproduce this...

uweschmitt avatar May 03 '21 09:05 uweschmitt

Hi Uwe, it is the one used in autowrap: https://github.com/uweschmitt/autowrap/search?q=libcpp_utf8_string The relevant tests are probably https://github.com/uweschmitt/autowrap/blob/master/tests/test_files/libcpp_utf8_string_test.hpp and around here: https://github.com/uweschmitt/autowrap/blob/master/tests/test_code_generator.py#L239

timosachsenberg avatar May 03 '21 11:05 timosachsenberg

I think the culprit is that at certain points autowrap relies on Cython's auto-conversion feature, e.g. when assigning a std::vector<libcpp_string> to a Python object. Special conversion provider "hacks" like libcpp_utf8_string do not affect this, because it is the same Cython type underneath. The types used in this auto-conversion are defined by pragmas such as # cython: c_string_type=unicode, c_string_encoding=utf8 (as an example). One is set automatically in CodeGenerator.py (btw: see my PR #103).

I think it would be nice to be able to decide from outside which types are expected to come from Python (especially since in Python 3 a plain string is unicode, not bytes). Potentially one could even make unicode the default by now.

Any thoughts on this, @uweschmitt? We basically want to avoid users having to put a b prefix (as in b'foo') on every string in Python 3. And this is only partly possible with ConversionProviders.
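To illustrate the goal, here is a minimal Python-side sketch of a wrapper that encodes str arguments to UTF-8 bytes before calling a bytes-only function. `to_bytes` and `accept_text` are hypothetical names and not part of autowrap or Cython; a real fix would more likely live in the generated conversion code.

```python
from functools import wraps


def to_bytes(value, encoding="utf-8"):
    """Recursively encode str to bytes inside lists and tuples; leave other values alone."""
    if isinstance(value, str):
        return value.encode(encoding)
    if isinstance(value, (list, tuple)):
        return [to_bytes(item, encoding) for item in value]
    return value


def accept_text(func):
    """Decorator (hypothetical): let a bytes-only function accept str as well."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        new_args = [to_bytes(arg) for arg in args]
        new_kwargs = {name: to_bytes(val) for name, val in kwargs.items()}
        return func(*new_args, **new_kwargs)
    return wrapper


# Hypothetical usage with a wrapped object p, as in the report above:
#   add_tags = accept_text(p.addTags)
#   add_tags(k, ["", "c"])   # forwarded as add_tags(k, [b"", b"c"])
```

The trade-off is that every call pays for a Python-level pass over its arguments, and the encoding is fixed on the caller side rather than chosen by the wrapper generator.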

jpfeuffer avatar Jun 03 '21 10:06 jpfeuffer