basemap icon indicating copy to clipboard operation
basemap copied to clipboard

Basemap readshapefile should read shapefile for the long/lat specified in the Basemap instance.

Open tacaswell opened this issue 9 years ago • 6 comments

@wxguy commented a day ago

I need to plot more than 100 images for a specific area which are not included in default Basemap. I have read lot of documentation regarding speeding up the matplotlib/basemap plotting examples. Most of them lead to this example https://github.com/matplotlib/basemap/blob/master/examples/hires.py . I agree that picklling does improve the code for Basemap instance. However, I also required to read the shape file once the Basemap instance is created like following:-

south = 0 north = 5 west = 70 east = 85

following is a global instance and read once only

m = Basemap(projection='merc', llcrnrlat=south, urcrnrlat=north, llcrnrlon=west, urcrnrlon=east, resolution='c')

the following line a part of plotting function and not a continuous line from above

m.readshapefile('data/gis-data/world_countries/ne_10m_admin_0_countries_lakes', 'ne_10m_admin_0_countries_lakes', linewidth=0.7)

Please see that I am reading shape file from NE 10 resolution file (~ 8 mb). When I read this shape file it takes about 10-15 sec which is too much for just map. This is a big problem as the code (m.readshapefile) reside inside the map plotting function and it has to be called to plot boundaries each time.

I have a question and a proposal (sort of) here.

Question:- How to improve the plotting time by reducing the reading time of shape file using the default m.readshapefile code. There could be some solution similar to pickle which I am not aware. Best people can share their experience here (with some example code).

Proposal (sort of):- The major problem with the reading shape file which I think is basemap reading the entire shapefile. Why should it read all the data from shapefile if you the lat/lon which are already specified in the basemap instance. Therefore, readshapefile should only read data within the lon/lat specified from the basemap instance. I hope that this way shapefile read time may be reduced greatly.

I may or may not be correct somewhere above. Forgive my ignorance. I hopr that some one may help me on reading shapefile/ plotting the map faster.

Note ::: Please don't suggest me to use lower resolution basemap/shapefile. @WeatherGod Matplotlib Developers member WeatherGod commented a day ago I think the difficulty with this request is that Basemap uses the pyshp package for handling reading of the shapefiles. The last time I checked, pyshp does not support any sort of constraint-based loading, so all of the data has to be read in anyway. Now, perhaps it might be possible to find some optimizations in the overall pipeline, I would have to do some profiling to find out where the bottlenecks are. It is indeed, too slow and painful at the moment. … @wxguy wxguy commented 20 hours ago

Thanks for the clarification.

Is it not possible to pickle the shapefile and use it again. At least this would improve the plotting speed. Also is it possible to use contour function so that shapefile is not required to be called many times? @wxguy wxguy commented 20 hours ago

It would be great and highly appreciated if this request is accepted.


Moved from https://github.com/matplotlib/matplotlib/issues/5993

tacaswell avatar Feb 13 '16 22:02 tacaswell

There are several spatial index formats for shapefiles. I find it interesting that that pyshp doesn't support any of them.

micahcochran avatar Feb 14 '16 14:02 micahcochran

I'm sorry but did we get a better way to speed up reading shapefile now? Pickle did help a lot in requiring existing basemap obj but the read_shapefile operation still have to be executed every time now..

leemengtw avatar Mar 01 '17 11:03 leemengtw

No new development work has been done for basemap for optimizations.

On Wed, Mar 1, 2017 at 6:57 AM, LeeMeng [email protected] wrote:

I'm sorry but did we get a better way to speed up reading shapefile now? Pickle did help a lot in requiring existing basemap obj but the read_shapefile operation still have to be executed every time now..

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/matplotlib/basemap/issues/263#issuecomment-283321666, or mute the thread https://github.com/notifications/unsubscribe-auth/AARy-EEXicJ0lmdvLfsqUWzqFLg7ibvIks5rhV0dgaJpZM4HZvJQ .

WeatherGod avatar Mar 01 '17 14:03 WeatherGod

Hi @abc123634:

I think I might have a solution for you, because readshapefile actually saves the info into fields of the basemap object, so you could reuse it without calling readshapefile as follows:

    #draw zones
    ax = fig.add_subplot(gs[0,0])
    basemap.drawcoastlines(ax = ax, linewidth=1.5)
    shp_info = basemap.readshapefile("data/pf_2/permafrost5_wgs84/permaice", name="zone",
            ax=ax, linewidth=3)
    print(shp_info)
    for nshape,seg in enumerate(basemap.zone):
        the_color = "green" if basemap.zone_info[nshape]["EXTENT"] == "C" else "red"
        poly = mpl.patches.Polygon(seg,facecolor=the_color, zorder = 10)
        ax.add_patch(poly)

    b1 = ax.bar([0],[0],color="green", label="Continuous")
    b2 = ax.bar([0],[0],color="r", label="Discontinuous")
    ax.legend(loc = "lower left")

Here I reuse the info to add the polygons again and color them based on a field value. You could create the polygon patches and then reuse them.

Is this smth you were looking for?

Cheers

guziy avatar Mar 01 '17 15:03 guziy

Hi @guziy

Yes this what I'm searching for! I'm just too stupid that didn't realize that the Polygon/shape information is stored after first readshapefile().

Now I can first read shapefile and picke the Basemap object, and in the following plotting, I can just

fig, ax = plt.subplots(figsize=(20, 20))
m = pickle.load(open('map.pickle','rb'))
 df_poly = pd.DataFrame({
            'shapes': [Polygon(np.array(shape), True) for shape in m.areas],
            'JCODE': [area['JCODE'] for area in m.areas_info]
        })
 pc = PatchCollection(df_poly.shapes, zorder=2)
 ax.add_collection(pc)

for quick redraw administrative boundary in Japan, thanks for the help!

leemengtw avatar Mar 02 '17 01:03 leemengtw

@leemengtaiwan Can you post the full code that achieves your goal of plotting several maps and also some example .shp files? I keep running into issues with the .shp files that I am importing and I have no way to tell if it is something with my code or something with my .shp file formats.

caseygierke avatar Jan 15 '19 18:01 caseygierke