vector-benchmark
[WIP] Add benchmarks for GeometryOps.jl (a Julia package)
https://github.com/asinghvi17/GeometryOps.jl is a Julia package for (mostly) vector geometry operations. It's still pretty early stage, but I realized it could do half the operations in this benchmark, so wanted to get a foundation of code going.
This PR adds Julia capability to the run_benchmarks.sh, and a folder geometryops which contains:
- Benchmark files for GeometryOps.jl.
- A Julia Project.toml which defines a list of packages which must be installed, that Julia can be pointed to.
- (Optional) A Julia plotting file in geometryops/plots.jl.
Here are the comparisons with GeometryOps,
(edited from the original)
This PR is still WIP, but is in a runnable state now.
Thanks, great idea! I will be happy to add this as several people have asked to include Julia in benchmarks. I don't know Julia personally, but I think I will test it during the holidays.
@evetion, what do you think?
This stuff is great, finally having generic native Julia code for things we would otherwise use GDAL/GEOS for (as do most libraries). In that sense, would be good to benchmark that as well (LibGEOS.jl).
I think we discussed benchmarking at least some file loading in Julia last summer, would be good to eventually include that here as well, but that's not the point of GeometryOps.jl.
PS. Where has GEOS gone in the graph?
Good point @evetion - I think I hadn't installed the R GEOS package on my machine then, so it didn't run. Posting updated benchmarks here (plus GeometryOps calling out to GEOS's buffer):
BTW: Have you seen this year's edition of Spatial Data Science across Languages organized by Martin in Prague? Maybe you will be interested as Julia programmers.
I've shared the invitation with @evetion I believe (and Martijn and Fabian) but happy to extend it! My knowledge of Julia-land is limited, so feel free to throw names at me.
I'll most likely be there; I'm also an author of GeometryOps.jl and based in the EU.
We have just released a new version of GeometryOps with support for buffer - this PR should be ready to merge after that!
I've just run and updated the PR with the latest changes to GeoDataFrames.jl, which uses GDAL's chunked writes to get some more speedup.
@kadyb this should now run with no additional setup, so the PR is good to merge from my end!
Thank you very much! I haven't had time to sit down with it yet, but I will look into it during the holidays. (There is one problem: I no longer have access to the machine on which I tested this, but I will ask someone to help.) The second issue is that we also need to update geopandas, because it now uses a new, faster engine (pyogrio) to load and save data.
So overall, based on the new results, Julia outperforms the R and Python packages and the GEOS binding. It would also be interesting to see what the performance of a binding to georust looks like (rsgeo).
And one more question that I am curious about. Will the geometryops binding from R/Python be the fastest of all the packages tested? If so, maybe Julia will eventually replace Rust and C++ in the future?
Thank you very much! I haven't had time to sit down with it yet, but I will look into it during the holidays. (There is one problem: I no longer have access to the machine on which I tested this, but I will ask someone to help.) The second issue is that we also need to update geopandas, because it now uses a new, faster engine (pyogrio) to load and save data.
I expect that pyogrio will get the read/write times to at least the same level as Julia. In the end, they all should be pretty similar (and limited by IO).
So overall, based on the new results, Julia outperforms the R and Python packages and the GEOS binding. It would also be interesting to see what the performance of a binding to georust looks like (rsgeo).
Yeah, we should test it. Like pyogrio, I expect georust to be on par with Julia.
And one more question that I am curious about. Will the geometryops binding from R/Python be the fastest of all the packages tested? If so, maybe Julia will eventually replace Rust and C++ in the future?
Not sure what you mean with the sentence. Julia is not a generic replacement for Rust and C++ (Rust might be for C++ though), but it certainly is easy to implement new algorithms in, probably for a wider audience than if you did it in Rust or C++ (none of the linked authors are proficient in those languages).
Not sure what you mean with the sentence.
I saw some benchmarks and Julia demonstrated the same speed as low-level languages. If Julia has easier syntax and a lower entry barrier, then I think it could be a very good choice for writing geoprocessing algorithms compared to C++ or Rust. Moreover, we can see that geometryops is faster than the R binding to GEOS (probably the same is true for pygeos). Hence, I am also curious what the overhead of calling Julia from R looks like.
Julia is not a generic replacement for Rust and C++
What are the limitations? Or why would Rust / C++ be better?
I saw some benchmarks and Julia demonstrated the same speed as low-level languages. If Julia has easier syntax and a lower entry barrier, then I think it could be a very good choice for writing geoprocessing algorithms compared to C++ or Rust. Moreover, we can see that geometryops is faster than the R binding to GEOS (probably the same is true for pygeos). Hence, I am also curious what the overhead of calling Julia from R looks like.
Agreed! Calling other languages will always cause overhead, and I'm not sure what that will be from R/Python to Julia. Much also has to do with the geometry types used. Seems like a good experiment for SDSL.
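For what it's worth, one rough way to put a number on that overhead is to time the same operation called directly and called through a wrapper, then compare the per-call times. The sketch below is only an illustration in plain Python: `best_time`, `shoelace_area` and `wrapped_area` are hypothetical names, and the "wrapper" merely copies the coordinates to stand in for whatever conversion a real R/Python-to-Julia bridge would do. It does not call Julia or GeometryOps.jl at all.

```python
import time


def best_time(fn, *args, repeats=5, calls=1000):
    """Return the best observed per-call wall time (seconds) for fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(calls):
            fn(*args)
        elapsed = (time.perf_counter() - start) / calls
        best = min(best, elapsed)
    return best


def shoelace_area(ring):
    """Area of a closed ring of (x, y) points via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def wrapped_area(ring):
    """Hypothetical stand-in for a cross-language wrapper.

    It copies every coordinate before delegating, mimicking the
    geometry conversion a real language bridge might have to do.
    """
    copied = [list(pt) for pt in ring]
    return shoelace_area(copied)


# A closed 4 x 3 rectangle (first point repeated at the end).
ring = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

direct = best_time(shoelace_area, ring)
wrapped = best_time(wrapped_area, ring)

print(f"direct call:  {direct * 1e6:.2f} us")
print(f"wrapped call: {wrapped * 1e6:.2f} us")
print(f"estimated overhead per call: {(wrapped - direct) * 1e6:.2f} us")
```

With a real bridge, the conversion step would be replaced by the actual call into Julia, and it would probably be worth timing both one call per geometry and one call for a whole collection, since the fixed cost per call matters much more in the first case.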
What are the limitations? Or why would Rust / C++ be better?
Julia is dynamically typed (like Python/R), whereas Rust/C++ are statically typed. Julia has a garbage collector (like Python/R), whereas the other languages do not. So that makes Julia very easy and similar to Python and R, but we can't (yet) make small executables/libraries, or guarantee that the memory footprint is known beforehand and small enough for embedded systems.
Julia should have small compiled binaries soonish (currently they're too big).
We will experiment with calling GeometryOps.jl from R and Python. If the R/Python packages are wrapping GEOS, we may be able to just rewrap the same C objects as Julia LibGEOS.jl objects, as GeometryOps.jl already accepts them without conversion (as a short-term experiment with minimal changes).
GeometryOps.jl mostly isn't actually dynamically typed; its algorithms are statically known (hence this performance). So in theory we will be able to compile good static binaries. But practically not yet.
FWIW: https://github.com/r-spatial/sf/issues/2472 ;)
BTW: Shouldn't GeometryOps.jl be listed on https://juliageo.org/?