rpy2

3.5.4 breaks some pandas conversions

Open Thijss opened this issue 2 years ago • 1 comments

Describe the issue or bug

I have a pandas dataframe (loaded from a .RDS file) with a column that contains the following types:

  • bool (True/False)
  • rpy2.rinterface_lib.sexp.NALogicalType (NA)

With rpy2 3.5.3 this dataframe can be converted to an RDataFrame. With rpy2 3.5.4 I am getting an error:

.../lib/python3.8/site-packages/rpy2/robjects/pandas2ri.py:196
@py2rpy.register(pandas.core.series.Series)
	def py2rpy_pandasseries(obj):
		....
>                   raise ValueError(
                        'Series can only be of one type, or None '
                        '(and here we have %s and %s). If happening with '
                        'a pandas DataFrame the method infer_objects() '
                        'will normalize data types before conversion.' %
                        (homogeneous_type, type(x)))


E                   ValueError: Series can only be of one type, or None (and here we have <class 'bool'> and <class 'rpy2.rinterface_lib.sexp.NALogicalType'>). If happening with a pandas DataFrame the method infer_objects() will normalize data types before conversion.

and

    r_output = get_conversion().py2rpy(df_input)
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/functools.py:875: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
.../lib/python3.8/site-packages/rpy2/robjects/pandas2ri.py:64: in py2rpy_pandasdataframe
    od[name] = StrVector(values)
.../lib/python3.8/site-packages/rpy2/robjects/vectors.py:409: in __init__
    super().__init__(obj)
.../lib/python3.8/site-packages/rpy2/rinterface_lib/sexp.py:695: in __init__
    robj: Sexp = type(self).from_object(obj)
.../lib/python3.8/site-packages/rpy2/rinterface_lib/sexp.py:616: in from_object
    res = cls.from_iterable(obj)
.../lib/python3.8/site-packages/rpy2/rinterface_lib/conversion.py:45: in _
    cdata = function(*args, **kwargs)
.../lib/python3.8/site-packages/rpy2/rinterface_lib/sexp.py:553: in from_iterable
    populate_func(iterable, r_vector, set_elt, cast_value)
.../lib/python3.8/site-packages/rpy2/rinterface_lib/sexp.py:497: in _populate_r_vector
    set_elt(r_vector, i, cast_value(v))
.../lib/python3.8/site-packages/rpy2/rinterface_lib/sexp.py:709: in _as_charsxp_cdata
    return conversion._str_to_charsxp(x)
.../lib/python3.8/site-packages/rpy2/rinterface_lib/conversion.py:147: in _str_to_charsxp
    cchar = _str_to_cchar(val, encoding='utf-8')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

s = False, encoding = 'utf-8'

    def _str_to_cchar(s: str, encoding: str = 'utf-8'):
        # TODO: use isString and installTrChar
>       b = s.encode(encoding)
E       AttributeError: 'bool' object has no attribute 'encode'

I am not sure yet what exactly is triggering the problem, but it seems that the dependency change in 3.5.4 is doing more than expected.
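
To narrow down which columns trip the "Series can only be of one type" check, a small diagnostic along these lines may help. It is only a sketch and not part of rpy2: `FakeNALogical`, `element_types`, and `mixed_columns` are hypothetical names, and `FakeNALogical` is a stand-in for `rpy2.rinterface_lib.sexp.NALogicalType` so that the snippet runs without R installed.

```python
from collections import Counter


class FakeNALogical:
    """Hypothetical stand-in for rpy2.rinterface_lib.sexp.NALogicalType,
    used here so the sketch runs without R installed."""


def element_types(values):
    """Count the Python types of the elements in an iterable, ignoring None."""
    return Counter(type(v).__name__ for v in values if v is not None)


def mixed_columns(columns):
    """Return {name: type counts} for columns holding more than one type.

    `columns` is any iterable of (name, values) pairs; with pandas,
    `df.items()` yields pairs of that shape.
    """
    report = {}
    for name, values in columns:
        counts = element_types(values)
        if len(counts) > 1:
            report[name] = dict(counts)
    return report


# Plain lists stand in for the DataFrame columns described above.
columns = {
    "flag": [True, False, FakeNALogical()],
    "count": [1, 2, 3],
}
report = mixed_columns(columns.items())
print(report)
```

With the real dataframe, `mixed_columns(df.items())` should list the columns that mix `bool` with `NALogicalType`, which would be a starting point for a reproducible example.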

Thijss, Sep 13 '22 11:09

It is difficult to be certain without a reproducible example. This might be a duplicate of #916.

lgautier, Sep 13 '22 23:09

Fixed with PR #951.

lgautier, Oct 10 '22 20:10