cuspatial
Add python benchmarks for `from_geopandas`
This PR adds benchmarks for the `from_geopandas` method. It depends on #585 and https://github.com/rapidsai/integration/pull/505
What is the integration repository that you refer to? I will do so when I know.
Ah sorry! I had the link in my clipboard and forgot to paste it. Here it is.
- https://github.com/rapidsai/integration/blob/2a6502dcfca40337c941ac4a5a348632defcbf3c/conda/recipes/versions.yaml#L75-L76
done, thanks @ajschmidt8 !
Right. I originally started out trying to support cudf's benchmarking framework, but after discussing with @vyasr it didn't seem necessary or even appropriate at this time.
- cuspatial is more of a ListSeries library than a DataFrame library. Anything in it that supports DataFrames is at best redundant with cudf and at worst will diverge from it.
- cuspatial doesn't really support dtypes at this time. I think our floating-point columns now usually support float32 or float64, but otherwise every column has a fixed type for each API. A GeoSeries can hold a single geometry type or completely heterogeneous types. Type-specific tests will eventually apply to certain GeoSeries operations, but not yet.
- GeoSeries provides a fairly small API surface that parallels GeoPandas. Nothing else in cuspatial has a language-specific analog. For example, we don't yet need to switch easily between GeoPandas and cuspatial.
For these reasons I think we should start out with a trimmer benchmark library for cuspatial.
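To make the "trimmer" approach concrete, here is a minimal sketch of what a benchmark in that style might look like. It assumes the pytest-benchmark `benchmark` fixture and a `pytest.ini` that collects `bench_*` functions. `haversine_km` and `bench_haversine_km` are hypothetical pure-Python stand-ins so the snippet runs without a GPU; they are not cuspatial's API or the benchmarks added in this PR.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bench_haversine_km(benchmark):
    # The pytest-benchmark fixture calls the function repeatedly, records
    # timings, and returns the function's result.
    distance = benchmark(haversine_km, 0.0, 0.0, 90.0, 0.0)
    # Sanity check: a quarter of the equator is R * pi / 2.
    assert math.isclose(distance, EARTH_RADIUS_KM * math.pi / 2)
```

Outside pytest, the same function can be smoke-tested by passing a trivial stand-in for the fixture, e.g. `bench_haversine_km(lambda fn, *args: fn(*args))`.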
Please retarget for 22.10.
@thomcom do you mind writing up the python benchmark docs for #599 since this PR first introduced the benchmark suite?
Still needed.
Some comments below. Curious, how long does it take to run the full benchmark suite?
The full benchmark suite takes about 33 seconds of wall-clock time on the small default input data:
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ time pytest
================================================================================================================================= test session starts =================================================================================================================================
platform linux -- Python 3.8.13, pytest-7.1.3, pluggy-1.0.0
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-3.0.0, cases-3.6.13, benchmark-3.4.1, forked-1.4.0, xdist-2.5.0, hypothesis-6.54.6
collected 21 items
api/bench_api.py .................. [ 85%]
io/bench_geoseries.py ... [100%]
---------------------------------------------------------------------------------------------------------------- benchmark: 21 tests -----------------------------------------------------------------------------------------------------------------
Name (time in us) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_haversine_distance 139.0160 (1.0) 318.3000 (1.0) 154.8732 (1.0) 32.1748 (1.0) 141.6490 (1.0) 4.5692 (1.0) 523;890 6,456.8943 (1.0) 4519 1
bench_lonlat_to_cartesian 210.3299 (1.51) 478.5019 (1.50) 237.2452 (1.53) 48.6556 (1.51) 215.8792 (1.52) 13.6376 (2.98) 425;562 4,215.0479 (0.65) 3329 1
bench_points_in_spatial_window 259.7428 (1.87) 529.4732 (1.66) 295.5240 (1.91) 59.0128 (1.83) 265.1850 (1.87) 24.5259 (5.37) 464;531 3,383.8204 (0.52) 2830 1
bench_trajectory_distances_and_speeds 362.3699 (2.61) 685.7242 (2.15) 391.5284 (2.53) 53.9242 (1.68) 368.6621 (2.60) 16.2490 (3.56) 217;343 2,554.0930 (0.40) 2046 1
bench_trajectory_bounding_boxes 384.6721 (2.77) 732.2170 (2.30) 427.3871 (2.76) 79.3527 (2.47) 391.4400 (2.76) 25.4575 (5.57) 233;342 2,339.7992 (0.36) 2051 1
bench_polyline_bounding_boxes 462.0778 (3.32) 858.6980 (2.70) 491.5333 (3.17) 63.6829 (1.98) 468.8120 (3.31) 13.5623 (2.97) 136;260 2,034.4501 (0.32) 1689 1
bench_polygon_bounding_boxes 517.5611 (3.72) 1,042.5448 (3.28) 597.3495 (3.86) 127.7777 (3.97) 527.9840 (3.73) 81.9925 (17.94) 263;263 1,674.0619 (0.26) 1411 1
bench_pairwise_linestring_distance 639.9602 (4.60) 964.9121 (3.03) 705.2090 (4.55) 68.0856 (2.12) 676.3469 (4.77) 107.4563 (23.52) 213;6 1,418.0193 (0.22) 1097 1
bench_quadtree_point_to_nearest_polyline 873.2041 (6.28) 1,515.8060 (4.76) 911.7086 (5.89) 66.9581 (2.08) 889.4689 (6.28) 29.3235 (6.42) 52;68 1,096.8416 (0.17) 688 1
bench_io_read_polygon_shapefile 1,685.8699 (12.13) 2,333.1030 (7.33) 2,046.8916 (13.22) 299.6621 (9.31) 2,121.5100 (14.98) 560.6051 (122.69) 1;0 488.5457 (0.08) 5 1
bench_derive_trajectories 2,237.0580 (16.09) 6,979.4860 (21.93) 2,746.7828 (17.74) 528.4972 (16.43) 2,634.2644 (18.60) 723.4351 (158.33) 36;3 364.0623 (0.06) 386 1
bench_io_geoseries_from_offsets 6,929.0400 (49.84) 9,210.2559 (28.94) 7,498.8532 (48.42) 720.3332 (22.39) 7,215.1589 (50.94) 941.2048 (205.99) 1;0 133.3537 (0.02) 10 1
bench_quadtree_point_in_polygon 8,008.3192 (57.61) 14,782.9410 (46.44) 9,987.2674 (64.49) 1,968.0318 (61.17) 8,548.2986 (60.35) 3,810.6833 (834.00) 32;0 100.1275 (0.02) 120 1
bench_quadtree_on_points 12,491.7610 (89.86) 15,702.6912 (49.33) 12,866.9665 (83.08) 564.0410 (17.53) 12,640.5515 (89.24) 352.6580 (77.18) 8;8 77.7184 (0.01) 84 1
bench_from_geoseries_100 17,417.7901 (125.29) 96,352.6999 (302.71) 20,873.9450 (134.78) 10,861.2324 (337.57) 19,178.4850 (135.39) 1,739.9152 (380.79) 1;2 47.9066 (0.01) 51 1
bench_io_from_geopandas 21,291.4760 (153.16) 23,606.3749 (74.16) 22,073.7033 (142.53) 493.7965 (15.35) 22,011.4351 (155.39) 614.4474 (134.48) 11;1 45.3028 (0.01) 37 1
bench_io_to_geopandas 32,513.8462 (233.89) 51,633.7841 (162.22) 35,600.8441 (229.87) 4,169.8471 (129.60) 34,209.4060 (241.51) 2,832.5964 (619.93) 4;3 28.0892 (0.00) 29 1
bench_directed_hausdorff_distance 53,695.8429 (386.26) 123,995.8352 (389.56) 59,650.9127 (385.16) 16,143.7641 (501.75) 56,051.5680 (395.71) 2,942.6329 (644.02) 1;1 16.7642 (0.00) 18 1
bench_from_geoseries_1000 100,644.1731 (723.98) 123,424.3310 (387.76) 107,189.1096 (692.11) 8,004.7326 (248.79) 104,656.7055 (738.85) 8,843.0239 (>1000.0) 1;0 9.3293 (0.00) 8 1
bench_point_in_polygon 203,288.3549 (>1000.0) 216,483.1341 (680.12) 206,608.6186 (>1000.0) 5,588.8981 (173.70) 203,943.8270 (>1000.0) 4,700.9572 (>1000.0) 1;1 4.8401 (0.00) 5 1
bench_from_geoseries_10000 1,015,495.3292 (>1000.0) 1,156,930.4019 (>1000.0) 1,079,089.5166 (>1000.0) 64,659.0465 (>1000.0) 1,055,592.2610 (>1000.0) 117,687.0280 (>1000.0) 1;0 0.9267 (0.00) 5 1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Legend:
Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
OPS: Operations Per Second, computed as 1 / Mean
================================================================================================================================= 21 passed in 28.91s =================================================================================================================================
real 0m32.592s
user 0m29.669s
sys 0m2.641s
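For anyone reading the table, the OPS column and the parenthesized ratios can be reproduced from the Mean column. The snippet below is a small sketch of that arithmetic using three rows copied from the output above. The helper names `ops_per_second` and `relative_to_fastest` are made up for illustration and are not part of pytest-benchmark.

```python
# Mean times in microseconds, copied from the benchmark table above.
mean_us = {
    "bench_haversine_distance": 154.8732,
    "bench_lonlat_to_cartesian": 237.2452,
    "bench_from_geoseries_10000": 1_079_089.5166,
}


def ops_per_second(mean_microseconds):
    # OPS is 1 / Mean, with the mean converted from microseconds to seconds.
    return 1.0 / (mean_microseconds * 1e-6)


def relative_to_fastest(name):
    # The value in parentheses is the ratio to the smallest mean in the table.
    fastest = min(mean_us.values())
    return mean_us[name] / fastest


for name, mean in mean_us.items():
    print(f"{name}: {ops_per_second(mean):,.4f} OPS, "
          f"{relative_to_fastest(name):.2f}x the fastest mean")
```

Small differences in the last digits are expected, since the table prints the mean rounded to four decimal places.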
@gpucibot merge