prisma-engines
prisma-engines copied to clipboard
feat: geospatial types support
Fixes https://github.com/prisma/prisma/issues/1798 Fixes https://github.com/prisma/prisma/issues/2789
This PR implements support for database geometry types. I'll start this message by saying that, although I'd be very proud of making such a contribution to Prisma, I understand that the likeliness of such a massive PR to be accepted is pretty low due to the time I'd take to fully review. How did I end up there ? In the end of last year, I tried to get in touch with the team as I had some experience in the field, and also had the time to work on the feature - but by the time we managed to exchange emails I no longer had enough time to invest in working on this. Last month, I had time and remembered about this project, so I started browing the prisma repo - and realised the ORM core was not actually Typescript, but full Rust, which I had no previous experience with. I questioned my ability to be up to the task, but though: why not have a go at it ? Not having sufficient rust experience, I decided against getting back in touch with the Prisma team and just leverage this project as a learning experience. A month later, although it is very far from being production ready, I believe this branch might deserve some attention which is why I decide to release it. Again, apologies about the size. I worked on this as a learning experience and ended up getting quite ahead of myself.. heh.
Now for the meat of the PR, here's what it includes:
In the psl
/ query-engine
:
- Two new
ScalarType
s:Geometry
andGeoJSON
. Both types refer to the same underlying database types, they only differ in the geometry serialization format the user wants to use:GeoJSON
forGeoJSON
,EWKT
forGeometry
. Why do we need both ?GeoJSON
is the format most people wants to use, as they most likely use TypeScript andGeoJSON
is the de-facto standard for manipulating geometries on the web. ButEwkt
allows for richer information about the raw geometry: it can be used to serialize geometry in any CRS (when GeoJSON is limited by specification to WGS84), and can also represent more geometry types than geojson (CIRCULARSTRING
,CURVEPOLYGON
etc). As such, onlyEWKT
can be used to safely & exhaustively read database geometries. - A new filtering condition:
Geometry(Not)Within
mapping tost_within
- A new filtering condition
Geometry(Not)Intersects
mapping tost_intersects
In quaint
:
- Two new enum
Value
s:Geometry
andGeography
, which both holds the same type (GeometryValue
, a struct with a WKT string and an SRIDi32
). We might have needed only one, but MSSQL required making the two explicitly different. - A new function
geom_from_text
, that maps to the SQL/MMst_geomfromtext
. - A new function
geom_as_text
, that maps tost_astext.
- A new compare function
geometry_is_valid
that maps tost_isvalid
- A new compare function
geometry_is_empty
that maps tost_isempty
- A new compare function
geometry(_not)_within
that maps tost_within
- A new compare function
geometry(_not)_intersects
that maps tost_intersects
- A new compare function
geometry_type(_not)_equals
that maps tost_geometrytype(x) = ...
All in all, what works:
- GeoJSON & EWKT I/O for CockroachDB, MySQL, MSSQL
- GeoJSON & EWKT I/O for PostgreSQL (as long as PostGIS is installed and the public schema is exposed to Prisma)
- GeoJSON & EWKT I/O for SQLite (as long as Spatialite is available and exposed with
SPATIALITE_PATH
) - GeoJSON I/O for MongoDB
-
Within
andIntersects
operators for all vendors except MongoDB (Mongo has the$geoWithin
and$geoIntersects
query operators, however they cannot be used with the current connector implementation, since those are Query operators wherehas Prisma uses Aggregation operators. Unfortunately, they're not compatible, and no geospatial filtering operators are available for aggregate queries in MongoDB. - Introspection / Schema description for all vendors (both directions)
What hasn't been fully implemented / properly tested:
- Migrations (particularily casts)
- Default value handling
- Index type & operators handling (postgres only ?)
There is still a lot of work to do here, but I was quite happy to get the core features (introspection, IO and filtering) going and though it was worth sharing.
My intent with this PR is to show a possible implementation, start a discussion with the team and community about its viability, potential drawbacks and help making these features reach Prisma in a production ready state. Questions about the implementation details are welcome and encouraged. Please get in touch at aurele.nitoref at icloud if you'd like to discuss this more thoroughly!
Absolutely boss 🚀
CodSpeed Performance Report
Merging #4208 will not alter performance
Comparing Oreilles:feat/geometry
(e0f834d) with main
(5c4707e)
Summary
✅ 11
untouched benchmarks
Amazing.
So that the tests are automatically run going forward:
Can you quickly do a PR to main
where you change line 5 to [](https://github.com/prisma/prisma-engines/actions/workflows/on-push-to-main.yml)
? I can then merge that, and the repository will remember that we trust you and run the tests without me clicking a button. (And that button needs to be fixed anyway 😆)
So there was quite a lot of things I had overlooked when testing on my machine..! I squashed all the fixes and we now have a mostly green test suite 👍
🚀
Two test failures left for now:
- Something with 1 PostgreSQL metrics test, 7 v.s 9 metrics recorded or something. Could your PR cause this? (We recently change something in metrics, so you might want to rebase (
Update branch
button in GH UI) to get the most recent version of that code). - A lot of unhappiness in the Vitess 8 tests with it not being able to introspect the tables because of permission problems. Strange. That looks a bit ... weird, even flaky maybe? I retriggered a test run there. Maybe it just goes green.
PS: No need from our side to combine your PRs and force push, we only merge via "Squash and merge" which combines all the commits, and with the individual commits of the fixes it might be easier to understand what test failures where fixed where - or introduced.
So the two additional queries breaking the tests are due to:
CREATE EXTENSION IF NOT EXISTS postgis
https://github.com/Oreilles/prisma-engines/blob/a588e1296288b79df0204397a441f2b3d891b9d9/query-engine/connector-test-kit-rs/qe-setup/src/postgres.rs#L42
and
SELECT PostGIS_version()
https://github.com/Oreilles/prisma-engines/blob/a588e1296288b79df0204397a441f2b3d891b9d9/schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs#L628-L630
Both are needed to run the tests, I updated the counter to 9 to make the test succeed. ( Also rebased on main just in case... )
No clue about the Vitess issue though...
~~EDIT: I tried running the vitess tests on prisma:main
on my machine and the result is the same, which makes me assume that this is not due to a code change in this PR...~~
EDIT: it's now all fixed 🙌
How can I help to merge it?
Test it, figure out how it would fit into Prisma, let the author and us know what else is needed to make this a complete feature.
@janpio @Oreilles thanks for all you do guys. I am at the point in my project to start introducing PostGIS and trying to figure out how to test this PR. I cloned @Oreilles feature branch and compiled debug targets of prisma-engines
how do i tie it into existing prisma
to run test migration/introspection ? thank you
@okonon Have a look at the documentation: https://www.prisma.io/docs/concepts/components/prisma-engines#using-custom-engine-libraries-or-binaries
@RafaelKr thanks a lot for that sorry i missed that section.
I was able to test introspection from existing database (prisma db pull
):
Looks like it is introspecting correctly.
Adding a comment to voice support for this to be added to Prisma! Looking forward to seeing this merged; keep up the good work.
Definitely would love that to be merged as well :+1:
Sorry, but this will definitely not be merged as it is only part of the full feature that needs to be implemented.
But we are interested in bringing it there, so let's see what we can do here.
For now I created a branch based on the commit this was forked from, so that there are no conflicts and @Oreilles can keep committing if they want.
I also created a local branch of the code that runs all our CI (Buildkite is not run on external contributions), with the goal of publishing the engines to our infrastructure, so we can build a Prisma CLI and Client with them for further testing.
But unfortunately some tests are failing (results only internally visible): https://github.com/prisma/prisma-engines/pull/4427
Turns out GitHub Actions do not run for these databases, and seems the current code has some problems - and that is blocking my plan to release these engine files :/
Random example of a SQLite failure, our of many ([2023-11-08T21:25:35Z] test result: FAILED. 622 passed; 970 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.73s
):
[2023-11-08T21:25:35Z] thread 'writes::top_level_mutations::delete_many_relations::delete_many_rels::pm_c1_no_children' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: QueryError(SqlInputError { error: Error { code: Unknown, extended_code: 1 }, msg: "no such function: InitSpatialMetaData", sql: "SELECT InitSpatialMetaData()", offset: 7 }), original_code: Some("1"), original_message: Some("no such function: InitSpatialMetaData") }', /root/build/query-engine/connector-test-kit-rs/qe-setup/src/sqlite.rs:9:65
MySQL 5.6 only fails 2:
[2023-11-08T21:35:48Z] failures:
[2023-11-08T21:35:48Z] queries::filters::geometry_filter::geometry_filter_spec::geometric_comparison_filters
[2023-11-08T21:35:48Z] writes::data_types::native_types::mysql::mysql::native_geojson_srid_geometry_types
5.7 only 1 even:
[2023-11-08T21:39:12Z] failures:
[2023-11-08T21:39:12Z] writes::data_types::native_types::mysql::mysql::native_geojson_srid_geometry_types
MariaDB the same as 5.6:
[2023-11-08T21:31:19Z] failures:
[2023-11-08T21:31:19Z] queries::filters::geometry_filter::geometry_filter_spec::geometric_comparison_filters
[2023-11-08T21:31:19Z] writes::data_types::native_types::mysql::mysql::native_geojson_srid_geometry_types
Trying to get these tests to run on GitHub Actions now. But maybe you have speculative fixes @Oreilles?
Interesting, as far as I can remember (almost) all tests were succeeding the last time we discussed the PR, and we did have tests for all MySQL versions... though I might remember wrong. Did you resolve merge conflicts when you merged main
, which could explain tests failing afterwards ?
Anyway, I'll have a look as soon as possible. ~~@janpio if that's something you can do, please send me the test cli commands to reproduce the failing tests.~~
Edit: nevermind, I had not seen that they were already mentioned in your message!
Yes, I realize that the main
merge was not optimal to fully reproduce what was going on here - sorry. I thought about reproducing the state before I did that and make that run in Buildkite, but I couldn't figure out how really.
Should be good now:
-
SQLite tests were failing due to an unsuccesfull initialization of Spatialite, which I had unwrapped - it's now oked instead. Tests involving Spatialite will be filtered out later if it is not available.
-
The write test failing with MySQL was the one involving geometry columns with srid constraints, which are only available starting with MySQL 8, hence the failing schema initialization.
-
The last one is a "fun" one: in MySQL 5.6 and MariaDB, the result of this expression is false:
elect st_intersects( st_geomfromtext('POINT(0 0)'), st_geomfromtext('POINT(0 0)') ;
I replaced the intersection test with one working with all vendors.
Merged main
again and had to solve quite a lot of conflicts: there apparently was some pretty big change in the code base recently (I'd be interested in knowing what were the motivation behind the quaint
's new ValueType
).
Unfortunately it seems like some of the recent changes broke the current implementation, as the query engine tests are now failing for mssql and postgres, which will likely take a bit longer to figure out
Thanks for even tackling the merge conflicts, that was not expected.
I suggest you add this to your query-engine.yml so that it runs the same tests as Buildkite will: https://github.com/prisma/prisma-engines/pull/4428/files You can see that MySQL (5.7 and 8) and MariaDB fail on GH Actions with 1 test - but we should be able to ignore that here and Buildkite would then still pass (I opened an issue to clean this up at https://github.com/prisma/prisma/issues/21896, but no one had time to look at it yet).
Looking at your last commit's results, you can definitely ignore:
- https://github.com/prisma/prisma-engines/actions/runs/6850990006?pr=4208 (we are currently cleaning those up and will have them green soon I hope)
- https://github.com/prisma/prisma-engines/actions/runs/6850990009/job/18626346956?pr=4208 and https://github.com/prisma/prisma-engines/actions/runs/6850990009/job/18626347232?pr=4208 - those are flaky, so I could rerun them a few times to make them pass, but that is ultimately irrelevant
All others are the same singular test that fails: writes::top_level_mutations::create::geometry_create::create_geometry
- not too bad. I was afraid there was more after your message above.
Turns out it was a much smaller issue that I thought! I squashed the main
merge to avoid polluting this PR so for reference, this line was the culprit (the null geometry case was seemingly handled elsewhere previously)
https://github.com/Oreilles/prisma-engines/blob/59410f64952de4ed512f8983a4e242b5f327ef5c/quaint/src/visitor.rs#L220
I'd be interested in knowing what were the motivation behind the
quaint
's newValueType
This is what my colleague shared:
Previously
quaint::Value
was currentquaint::ValueType
it was a enum, which means that if we needed to enrich the representation of a value with metadata for all variants, we needed to modify the source code wherever any variant was referenced. To solve this, and add metadata about a the native column type for postgres (needed to encode information for driver adapters) we promotedquaint::Value
from an enum to a struct, and use the old enum as a property of said struct, but renaming the type toquiant::ValueType
Some more information: https://github.com/prisma/prisma-engines/pull/4311
thanks for moving this forward.
Another look at the failing tests:
- Driver Adapter, LibSQL and SQLite are probably because missing extension? (The LibSQL one we could ignore for now)
- All others are the same test
writes::top_level_mutations::create::geometry_create::create_geometry
caused byInconsistent column data: Conversion failed: Value of type Column { catalog: ConstBytes(PhantomData<mysql_common::packets::Catalog>)
- Vitess ones are flaky and can be ignored
(Let me know if you prefer to fix the resolve conflicts, or if I should do a branch from yesterday's state so that your PR here does not have conflicts and can keep building and running in CI)
I rebased the branch on main so there shouldn't be any more conflicts for now. I also managed to add the Spatialite tests to the CI 👍
There's a failing test I can't wrap my head around: https://github.com/prisma/prisma-engines/actions/runs/7017021601/job/19089461750?pr=4208
{"errors":[{"error":"Error occurred during query execution:\nConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(Server(MysqlError { code: 4079, message: \"Illegal parameter data type varchar for operation 'st_within'\", state: \"HY000\" })), transient: false })","user_facing_error":{"is_panic":false,"message":"Error occurred during query execution:\nConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(Server(MysqlError { code: 4079, message: \"Illegal parameter data type varchar for operation 'st_within'\", state: \"HY000\" })), transient: false })","backtrace":null}}]}
The strange part is that this test is succeeding for all MySQL versions except MariaDB. To give a bit of context, I decided to handle geometries differently for the MySQL connector to avoid a classic footgun: MySQL 8 broke backwards compatibility by swapping coordinates in WKT representation for WGS84 geometries (https://dba.stackexchange.com/questions/242001/mysql-8-st-geomfromtext-giving-error-latitude-out-of-range-in-function-st-geomfr). There is a way to prevent this behavior by using st_geomfromtext
and st_astext
with a third argument 'axis-order=long-lat'
- but unfortunately, I didn't find a way in Quaint to output custom SQL depending on the connected database version. So instead of using those function, I'm reading and writing the internal binary blob representation directly. This seems to work fine for everything except for this specific test with MariaDB, where calling ST_Within(geom, <geometry binary blob>)
fails with the aforementioned error: Illegal parameter data type varchar for operation 'st_within'
. For some reason, the binary blob seems to be interpreted as varchar
... or is there something else I missed ?
Seems like you are getting somewhere.
For the Spatialite problem:
No space left on device : '/home/ubuntu/actions-runner/_diag/blocks/acc9258d-1293-4903-99d1-cd922e79f8df_4482dee1-f750-5a60-3af2-b3e57514afa7.2'
There are actions that makes more space on the runners: https://github.com/marketplace/actions/maximize-build-disk-space and a few more. But is it expected to take that much space actually?
Only a small number of failing tests on MySQL(like) databases either.
(You can continue ignoring the flakyness in the driver adapter test)
The failing test on MySQL 5.7 and 8 isn't caused by this PR, as you can see it's also failing on your PR adding those versions to the query engine test CI: https://github.com/prisma/prisma-engines/actions/runs/6870390796/job/18685172620?pr=4428
Regarding the "No space left" error, the problem is that an empty Sqlite database with Spatialite loaded (with SELECT InitSpatialMetadata();
) is 7Mo, and since the database file are not cleaned up after each test... well after 330 tests that adds up to a lot 😄. We can circumvent this by initializing Spatialite with SELECT InitSpatialMetadata('WGS84_ONLY')
as that avoids fully loading the spatial_ref_sys
table taking all the space. However, this will also affect usage outside the test suite, since all SRIDs are currently allowed for the Sqlite connector.
I'll add this option for now so that the tests can run while we decide on the best way to handle this.
Regarding the MariaDB failing st_within
test I'd be glad if anyone has any insight on what could cause the bug (Why would a binary blob be sent as varchar, and only with MariaDB ..?)
Thanks for the reminder that our MySQL tests are failing by default, you are of course correct :D