
support pandas Series as input for lat/lon -> x/y conversion

Open jankatins opened this issue 10 years ago • 7 comments

Using a pd.Series as input for the lat/lon -> x/y conversion currently raises a RuntimeError:

x,y = map_katins(df["lon"], df["lat"])
RuntimeError                              Traceback (most recent call last)
<ipython-input-139-693c51387321> in <module>()
     15 #    map_katins.plot(x, y, "ro", markersize=10)
     16 x,y = map_katins(df["lon"].values, df["lat"].values)
---> 17 x,y = map_katins(df["lon"], df["lat"])
     18 map_katins.plot(x, y, "o", color="grey", markersize=5)
     19 

C:\portabel\miniconda\envs\katins\lib\site-packages\mpl_toolkits\basemap\__init__.pyc in __call__(self, x, y, inverse)
   1146             except TypeError:
   1147                 y = [_dg2rad*yy for yy in y]
-> 1148         xout,yout = self.projtran(x,y,inverse=inverse)
   1149         if self.celestial and inverse:
   1150             try:

C:\portabel\miniconda\envs\katins\lib\site-packages\mpl_toolkits\basemap\proj.pyc in __call__(self, *args, **kw)
    284             outxy = self._proj4(xy, inverse=inverse)
    285         else:
--> 286             outx,outy = self._proj4(x, y, inverse=inverse)
    287         if inverse:
    288             if self.projection in ['merc','mill','gall']:

C:\portabel\miniconda\envs\katins\lib\site-packages\mpl_toolkits\basemap\pyproj.pyc in __call__(self, *args, **kw)
    386             _proj.Proj._inv(self, inx, iny, radians=radians, errcheck=errcheck)
    387         else:
--> 388             _proj.Proj._fwd(self, inx, iny, radians=radians, errcheck=errcheck)
    389         # if inputs were lists, tuples or floats, convert back.
    390         outx = _convertback(xisfloat,xislist,xistuple,inx)

_proj.pyx in _proj.Proj._fwd (src/_proj.c:1571)()

RuntimeError: 

Using

x,y = map_katins(df["lon"].values, df["lat"].values)

works...

import mpl_toolkits.basemap
mpl_toolkits.basemap.__version__
'1.0.7'

jankatins avatar Nov 29 '15 16:11 jankatins

True. But if we start adding code to support Pandas data structures, where does it end? For something like basemap, I'm inclined to think that the present "array-like" support is sufficient, and we can leave it to users to supply suitable inputs.

efiring avatar Nov 30 '15 06:11 efiring

Usually you can use pandas Series as drop-in replacements for numpy arrays. Series also supports the __array__(self, dtype=None) interface and so I think it is "array-like". Which interface does basemap expect?
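As a small illustration of the "array-like" claim above (a sketch only, assuming pandas and numpy are installed; the coordinate values are made up), a Series exposes `__array__`, so `np.asarray` turns it into a plain ndarray holding the same data as `.values`:

```python
import numpy as np
import pandas as pd

# Made-up longitudes, standing in for df["lon"] from the report above.
lon = pd.Series([8.5, 9.0, 13.4])

# The Series advertises the numpy conversion hook...
has_array_hook = hasattr(lon, "__array__")

# ...so np.asarray can convert it to a plain ndarray.
as_array = np.asarray(lon)

# The result holds the same data as the .values workaround.
same_values = np.array_equal(as_array, lon.values)

print(has_array_hook, type(as_array).__name__, same_values)
```

This says nothing about which interface basemap itself checks for; it only shows what numpy makes of a Series.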

jankatins avatar Nov 30 '15 10:11 jankatins

Idea: https://github.com/matplotlib/basemap/blob/master/lib/mpl_toolkits/basemap/pyproj.py#L538

   else:
+        # something which can be converted to a numpy array
+        if hasattr(x, '__array__'):
+            return _copytobuffer(x.__array__())
        # perhaps they are regular python arrays?
        if hasattr(x, 'typecode'):
            #x.typecode
            inx = array('d',x)

jankatins avatar Nov 30 '15 10:11 jankatins

OK, this won't work: the Series takes a different route because it has a shape attribute (hasattr(x,'shape') == True), and that test assumes that every object with a shape is a numpy array. So the problem is that _copytobuffer(Series(list("abcd"))) does not return an object that supports the buffer interface, but a pd.Series again.

It seems that the code tries very hard not to import numpy, so no idea what to do here without an explicit check for numpy or a np.asarray(x).
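To make the shape-attribute pitfall concrete, here is a minimal sketch. `SeriesLike` is a hypothetical stand-in for pd.Series (it is not part of pandas or basemap); it has a `shape` and an `__array__` hook but is not an ndarray, which is the situation described above:

```python
import numpy as np


class SeriesLike:
    """Hypothetical stand-in for pd.Series: array-like, but not an ndarray."""

    def __init__(self, data):
        self._data = np.array(data, dtype=float)

    @property
    def shape(self):
        return self._data.shape

    def __array__(self, dtype=None, copy=None):
        # Conversion hook that np.asarray() looks for.
        return self._data if dtype is None else self._data.astype(dtype)


lon = SeriesLike([8.5, 9.0, 13.4])

# A check based only on `shape` would treat this object as a numpy array...
has_shape = hasattr(lon, "shape")
# ...although it is not one.
is_ndarray = isinstance(lon, np.ndarray)

# np.asarray() goes through __array__ and returns a real ndarray.
converted = np.asarray(lon, dtype=float)

print(has_shape, is_ndarray, type(converted).__name__)
```

An explicit `np.asarray(x)` removes the ambiguity, at the cost of the numpy import the existing code avoids.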

jankatins avatar Nov 30 '15 10:11 jankatins

This seems to work:

    # (array scalars don't support buffer API)
    if hasattr(x,'shape'):
+        # make sure we are a numpy array and not a pd.Series
+        x = x.__array__()
        if x.shape == ():

jankatins avatar Nov 30 '15 11:11 jankatins

Pandas objects have a values property, which gives you a numpy array...

In [1]: import pandas as pd

In [2]: s = pd.Series(data=range(10))

In [3]: s.values
Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

I do not see any problem in using s.values for basemap...

Cheers

Sasha

guziy avatar Nov 30 '15 14:11 guziy

There is a fair amount of legacy code here, back from the numeric/numarray/etc. days, which is why it is avoiding importing any of them.

I am against trying to call .values. I shouldn't have to bend over backwards to support a particular package like that. Instead, I think importing numpy and calling np.asarray() should be sufficient?
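A rough sketch of what the np.asarray() approach could look like. Both `to_float_array` and `fake_projection` are hypothetical names invented for illustration; `fake_projection` only converts degrees to radians and is not basemap's or pyproj's actual code path:

```python
import numpy as np


def to_float_array(values):
    """Coerce any array-like (list, tuple, ndarray, or an object with
    __array__, such as a pd.Series) to a float64 ndarray."""
    return np.asarray(values, dtype=np.float64)


def fake_projection(lon, lat):
    # Hypothetical stand-in for the real projection call: normalize the
    # inputs first, then do the numeric work on plain ndarrays.
    lon = to_float_array(lon)
    lat = to_float_array(lat)
    return np.deg2rad(lon), np.deg2rad(lat)


# Lists and tuples go through the same path as arrays.
x, y = fake_projection([0.0, 90.0], (0.0, 45.0))
print(x, y)
```

Normalizing at the entry point means no pandas-specific code is needed, which matches the concern about not special-casing any one package.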

WeatherGod avatar Nov 30 '15 15:11 WeatherGod