Add python benchmarks for `from_geopandas`

Open thomcom opened this issue 2 years ago • 7 comments

This PR adds benchmarks for the `from_geopandas` method. It depends on #585 and https://github.com/rapidsai/integration/pull/505

thomcom avatar Jul 22 '22 17:07 thomcom

What is the integration repository that you refer to? I will do so when I know.

thomcom avatar Jul 26 '22 15:07 thomcom

> What is the integration repository that you refer to? I will do so when I know.

Ah sorry! I had the link in my clipboard and forgot to paste it. Here it is.

  • https://github.com/rapidsai/integration/blob/2a6502dcfca40337c941ac4a5a348632defcbf3c/conda/recipes/versions.yaml#L75-L76

ajschmidt8 avatar Jul 26 '22 16:07 ajschmidt8

done, thanks @ajschmidt8 !

thomcom avatar Jul 26 '22 16:07 thomcom

Right. I originally started out trying to support cudf's benchmarking framework, but after discussing with @vyasr it didn't seem necessary or even appropriate at this time.

  1. cuspatial is more of a ListSeries library than a DataFrame library: anything that supports DataFrames is at best redundant with cudf and at worst will diverge from it.
  2. cuspatial doesn't really support dtypes at this time. I think that our floating-point columns usually support float32 or float64 now, but otherwise all columns have a fixed type for each API. A GeoSeries can hold a single type or completely heterogeneous types. Type-specific tests will eventually apply to certain GeoSeries operations, but not yet.
  3. GeoSeries provides a fairly small API surface that is parallel to GeoPandas. Nothing else in cuspatial has a language-specific analog. For example, we don't yet need to switch easily between GeoPandas and cuspatial.

For these reasons I think we should start out with a trimmer benchmark library for cuspatial.
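
As a rough illustration of what a trimmer benchmark could look like, here is a minimal sketch in the pytest-benchmark style. It is not the code from this PR: `make_points` and `convert` are hypothetical stand-ins so the example runs without a GPU, and a real benchmark would call the cuspatial function under test instead. The `bench_` prefix is assumed to be picked up through `pytest.ini`.

```python
def make_points(n):
    """Hypothetical input data: n (x, y) tuples."""
    return [(float(i), float(i) * 2.0) for i in range(n)]


def convert(points):
    """Hypothetical stand-in for the conversion being benchmarked."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs, ys


def bench_convert_100(benchmark):
    # pytest-benchmark injects the `benchmark` fixture. Calling it runs
    # convert(points) repeatedly and records the timings.
    points = make_points(100)  # setup stays outside the timed call
    benchmark(convert, points)
```

Keeping each benchmark to one setup step and one timed call avoids needing a dtype or DataFrame parametrization layer.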

thomcom avatar Jul 27 '22 21:07 thomcom

Please retarget for 22.10.

harrism avatar Aug 02 '22 23:08 harrism

@thomcom do you mind writing up the python benchmark docs for #599 since this PR first introduced the benchmark suite?

isVoid avatar Aug 03 '22 18:08 isVoid

Still needed.

isVoid avatar Sep 06 '22 15:09 isVoid

> Some comments below. Curious, how long does it take to run the full benchmark suite?

The full set of tests takes 33 seconds on small default input data.

thomcom avatar Sep 30 '22 15:09 thomcom

(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ time pytest
================================================================================================================================= test session starts =================================================================================================================================
platform linux -- Python 3.8.13, pytest-7.1.3, pluggy-1.0.0
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-3.0.0, cases-3.6.13, benchmark-3.4.1, forked-1.4.0, xdist-2.5.0, hypothesis-6.54.6
collected 21 items                                                                                                                                                                                                                                                                    

api/bench_api.py ..................                                                                                                                                                                                                                                             [ 85%]
io/bench_geoseries.py ...                                                                                                                                                                                                                                                       [100%]


---------------------------------------------------------------------------------------------------------------- benchmark: 21 tests -----------------------------------------------------------------------------------------------------------------
Name (time in us)                                       Min                       Max                      Mean                 StdDev                    Median                     IQR            Outliers         OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_haversine_distance                           139.0160 (1.0)            318.3000 (1.0)            154.8732 (1.0)          32.1748 (1.0)            141.6490 (1.0)            4.5692 (1.0)       523;890  6,456.8943 (1.0)        4519           1
bench_lonlat_to_cartesian                          210.3299 (1.51)           478.5019 (1.50)           237.2452 (1.53)         48.6556 (1.51)           215.8792 (1.52)          13.6376 (2.98)      425;562  4,215.0479 (0.65)       3329           1
bench_points_in_spatial_window                     259.7428 (1.87)           529.4732 (1.66)           295.5240 (1.91)         59.0128 (1.83)           265.1850 (1.87)          24.5259 (5.37)      464;531  3,383.8204 (0.52)       2830           1
bench_trajectory_distances_and_speeds              362.3699 (2.61)           685.7242 (2.15)           391.5284 (2.53)         53.9242 (1.68)           368.6621 (2.60)          16.2490 (3.56)      217;343  2,554.0930 (0.40)       2046           1
bench_trajectory_bounding_boxes                    384.6721 (2.77)           732.2170 (2.30)           427.3871 (2.76)         79.3527 (2.47)           391.4400 (2.76)          25.4575 (5.57)      233;342  2,339.7992 (0.36)       2051           1
bench_polyline_bounding_boxes                      462.0778 (3.32)           858.6980 (2.70)           491.5333 (3.17)         63.6829 (1.98)           468.8120 (3.31)          13.5623 (2.97)      136;260  2,034.4501 (0.32)       1689           1
bench_polygon_bounding_boxes                       517.5611 (3.72)         1,042.5448 (3.28)           597.3495 (3.86)        127.7777 (3.97)           527.9840 (3.73)          81.9925 (17.94)     263;263  1,674.0619 (0.26)       1411           1
bench_pairwise_linestring_distance                 639.9602 (4.60)           964.9121 (3.03)           705.2090 (4.55)         68.0856 (2.12)           676.3469 (4.77)         107.4563 (23.52)       213;6  1,418.0193 (0.22)       1097           1
bench_quadtree_point_to_nearest_polyline           873.2041 (6.28)         1,515.8060 (4.76)           911.7086 (5.89)         66.9581 (2.08)           889.4689 (6.28)          29.3235 (6.42)        52;68  1,096.8416 (0.17)        688           1
bench_io_read_polygon_shapefile                  1,685.8699 (12.13)        2,333.1030 (7.33)         2,046.8916 (13.22)       299.6621 (9.31)         2,121.5100 (14.98)        560.6051 (122.69)        1;0    488.5457 (0.08)          5           1
bench_derive_trajectories                        2,237.0580 (16.09)        6,979.4860 (21.93)        2,746.7828 (17.74)       528.4972 (16.43)        2,634.2644 (18.60)        723.4351 (158.33)       36;3    364.0623 (0.06)        386           1
bench_io_geoseries_from_offsets                  6,929.0400 (49.84)        9,210.2559 (28.94)        7,498.8532 (48.42)       720.3332 (22.39)        7,215.1589 (50.94)        941.2048 (205.99)        1;0    133.3537 (0.02)         10           1
bench_quadtree_point_in_polygon                  8,008.3192 (57.61)       14,782.9410 (46.44)        9,987.2674 (64.49)     1,968.0318 (61.17)        8,548.2986 (60.35)      3,810.6833 (834.00)       32;0    100.1275 (0.02)        120           1
bench_quadtree_on_points                        12,491.7610 (89.86)       15,702.6912 (49.33)       12,866.9665 (83.08)       564.0410 (17.53)       12,640.5515 (89.24)        352.6580 (77.18)         8;8     77.7184 (0.01)         84           1
bench_from_geoseries_100                        17,417.7901 (125.29)      96,352.6999 (302.71)      20,873.9450 (134.78)   10,861.2324 (337.57)      19,178.4850 (135.39)     1,739.9152 (380.79)        1;2     47.9066 (0.01)         51           1
bench_io_from_geopandas                         21,291.4760 (153.16)      23,606.3749 (74.16)       22,073.7033 (142.53)      493.7965 (15.35)       22,011.4351 (155.39)       614.4474 (134.48)       11;1     45.3028 (0.01)         37           1
bench_io_to_geopandas                           32,513.8462 (233.89)      51,633.7841 (162.22)      35,600.8441 (229.87)    4,169.8471 (129.60)      34,209.4060 (241.51)     2,832.5964 (619.93)        4;3     28.0892 (0.00)         29           1
bench_directed_hausdorff_distance               53,695.8429 (386.26)     123,995.8352 (389.56)      59,650.9127 (385.16)   16,143.7641 (501.75)      56,051.5680 (395.71)     2,942.6329 (644.02)        1;1     16.7642 (0.00)         18           1
bench_from_geoseries_1000                      100,644.1731 (723.98)     123,424.3310 (387.76)     107,189.1096 (692.11)    8,004.7326 (248.79)     104,656.7055 (738.85)     8,843.0239 (>1000.0)       1;0      9.3293 (0.00)          8           1
bench_point_in_polygon                         203,288.3549 (>1000.0)    216,483.1341 (680.12)     206,608.6186 (>1000.0)   5,588.8981 (173.70)     203,943.8270 (>1000.0)    4,700.9572 (>1000.0)       1;1      4.8401 (0.00)          5           1
bench_from_geoseries_10000                   1,015,495.3292 (>1000.0)  1,156,930.4019 (>1000.0)  1,079,089.5166 (>1000.0)  64,659.0465 (>1000.0)  1,055,592.2610 (>1000.0)  117,687.0280 (>1000.0)       1;0      0.9267 (0.00)          5           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
================================================================================================================================= 21 passed in 28.91s =================================================================================================================================

real	0m32.592s
user	0m29.669s
sys	0m2.641s
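
For reading the table: the legend defines OPS as 1 / Mean, and the means are reported in microseconds. The short sketch below recomputes OPS and the parenthesized ratio to the fastest mean for three rows copied from the output above. The helper name `ops_per_second` is made up for illustration, and results match the table only to within the rounding of the printed means.

```python
# Mean times in microseconds, copied from the benchmark table above.
mean_us = {
    "bench_haversine_distance": 154.8732,
    "bench_lonlat_to_cartesian": 237.2452,
    "bench_from_geoseries_10000": 1_079_089.5166,
}


def ops_per_second(mean_microseconds):
    """OPS = 1 / mean, with the mean converted from microseconds to seconds."""
    return 1.0 / (mean_microseconds * 1e-6)


fastest = min(mean_us.values())
for name, mean in mean_us.items():
    # The second figure is the ratio of this mean to the fastest mean listed here.
    print(f"{name}: {ops_per_second(mean):,.4f} ops/s, {mean / fastest:.2f}x the fastest mean")
```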

thomcom avatar Sep 30 '22 15:09 thomcom

@gpucibot merge

thomcom avatar Sep 30 '22 18:09 thomcom