postgrest
postgrest copied to clipboard
Provide a way to customize the URL structure
Problem
We recommend customizing the URL structure through Nginx, however doing so is complicated and requires additional tools. This can be seen on ppKrauss/PostgREST-writeAPI, which uses mustache templating plus a nodejs script to generate the Nginx rewrite config.
Additionally, changing the URL in the proxy will not be reflected in our OpenAPI spec - which would need rewriting on the proxy as well, which is also more work.
Relying solely on the proxy for this hurts our extensibility, ideally PostgREST should be able to adapt to other standards, like more classical REST with nested routes or maybe more complex ones like Odata; and do so outside of PostgREST core.
Note that we do have a way forward to customize the response body(see this discussion regarding JSON-LD), but not the URL structure.
Proposal
Provide a way to customize the URL through the config file or through postgrest-contrib. It could be like so:
select pgrst.url_rewrites(
-- more classic REST
-- /persons/1
'persons/${value}', 'people?id=eq.${value}'
-- /persons/1/companies
, 'persons/${value}/companies', '/people?select=*,companies(*)&id=eq.${value}'
-- bring an RPC to root endpoint
, 'articles/${year}', 'rpc/get_articles?year=${year}'
-- special route
, 'christmas_2021', 'rpc/get_articles?year=2021&month=december'
);
Related
- https://github.com/PostgREST/postgrest/issues/1086
@wolfgangwalther I think you had an idea about a "routing refactor", this may be related.
@wolfgangwalther I think you had an idea about a "routing refactor", this may be related.
Absolutely, I had customizing of the URLs for endpoints in mind with that, too.
@steve-chavez so this would be a separate module that somehow plugs in before the userApiRequest
function in ApiRequest.hs
file depending on a configuration? Is there an example of such a module that I can look at to get an idea of how to write such a thing?
Is there an example of such a module that I can look at to get an idea of how to write such a thing?
Hm no, there isn't. This is mostly a high-level idea at this stage, extensibility is pretty new.
so this would be a separate module that somehow plugs in before the userApiRequest function in ApiRequest.hs file
Sounds about right, though TBH I haven't looked yet at the implementation details.
Sounds about right, though TBH I haven't looked yet at the implementation details.
At a higher level, my idea was roughly along the following lines:
- Currently, the request is parsed in
ApiRequest
by pulling together lots of data from various sources (request itself, config, schema cache) - all this happens for each request, where the implicit routing table is basically built dynamically per request. - I would like to change that to create a
Router
type that holds all the information for available routes, that can be constructed from the config and schema cache - independent of the actual request. - When the request comes in, the request can then be matched against the
Router
to find the route it should be handled from. - This allows to customize the
Router
easily. After creation of the base router from config and schema cache, additional options can be parsed to customize all kinds of things, e.g. the endpoints, accepted media-types, etc... - the suggestedpgrst.url_rewrites
would hook in right there.
Does that make sense?
Hmm, let me try to echo back what you are saying. There would be a new data type called Router
. A router and a request can be combined using applyRouting
function to reform the paths in the incoming Request
to create a new Request
with a modified path and/or query. Maybe something like the following:
data Router = Router {
-- some kind of representation of appconfig and dbstructure and routing rules
}
makeRouter :: AppConfig -> DbStructure -> Router
applyRouting :: Request -> Router -> Request
userApiRequest :: Router -> Request -> RequestBody -> Either ApiRequestError ApiRequest
The applyRouting
function has to be able to "run" some rules that will be specified in some configuration file, like the example in the issue description which looks like taking values from the input request and templating them out into the output string.
Hm. Using the names you suggested, I think userApiRequest
would be replaced by makeRouter
and applyRouting
. Currently userApiRequest
does both. So something like this maybe:
data Router = Router {
-- some kind of representation of appconfig and dbstructure and routing rules
}
makeRouter :: AppConfig -> DbStructure -> Router
applyRouting :: Router -> Request -> RequestBody -> Either ApiRequestError ApiRequest
@wolfgangwalther yes this makes sense. However, if I try to write a function in haskell whose job is to run
some rules provided in the configuration file as mentioned in the description of this issue, then I think there are two ways it can be bad:
- The users specify some rule that is not satisfiable by the function and they do not know that they have made that mistake until runtime. Because there is no
compiler
checking this dsl we are inventing inside the configuration file to specify routing. - We end up writing a very extensive module to
check
that the router configuration is sound in every way before we start postgrest (which feels like a big project to me, depending on how complex we make this dsl). And then there could be feature requests on this module making it even more complex and hard to maintain.
So going with postgrest's motto of doing one thing well, I suggest we somehow provide this configuration
of the router as a program.
So, something like, if you provide an executable in the configuration file, postgrest will start that executable on a separate process. And we provide a default executable (router) with postgrest which may implement what we have discussed above. For example:
# config file
router = "default" # it can be a path to some executable
router-configs = "./routerconfigs"
# routerconfigs
port = 8080
host = localhost
target-port = 3000 # postgrest is running here
target-host = localhost
Now when you start postgrest, and it finds a router
configuration it will start that program on a separate process.
This way, we can still allow people to configure routes on postgrest out of the box. And we give them a path to either go with some kind of cloud native architecture like AWS Apigateway or their own kubernetes setup with nginx gradually. We also give them the ability to configure much more than just the routes. Because this program can just rewrite the entire request, swapping out the request method, body and headers in arbitrary ways that the users can imagine. (This is possible inside the postgrest function we discussed above as well, however, the configuration dsl will become evil at that point).
Because there is no compiler checking this dsl we are inventing inside the configuration file to specify routing. We end up writing a very extensive module to check that the router configuration is sound in every way before we start postgrest (which feels like a big project to me, depending on how complex we make this dsl).
True, it's looking complex. And the DSL I put above only considers the URL part, ideally we'd also be able to extend PostgREST to be compatible with IDL-based standards such as graphql — for this we'd need to parse the request body + map it to PostgREST query params.
That is one of the reasons I mentioned the embedded scripting language idea in https://github.com/PostgREST/postgrest/issues/1698#issuecomment-907922983.
There might be another option. I've just read about List-based parser combinators, which is a way to do parser combinators dinamically. With this we could(in theory) allow to modify PostgREST parser combinators through the configuration file(or with a nicer interface through postgrest-contrib).
We'd need a library for the List-based parser combinators, because there's only an example for Haskell. Then we could use that lib for parsing the body + the URL part(maybe).
- The users specify some rule that is not satisfiable by the function and they do not know that they have made that mistake until runtime. Because there is no
compiler
checking this dsl we are inventing inside the configuration file to specify routing.
Proper strong haskell typing should avoid that.
- We end up writing a very extensive module to
check
that the router configuration is sound in every way before we start postgrest (which feels like a big project to me, depending on how complex we make this dsl). And then there could be feature requests on this module making it even more complex and hard to maintain.
I don't think this is going to be a very complex thing. I'm pretty sure it should be straight-forward to add a parser that reads in a JSON config for the router, maps that to a haskell data type and throws decent parsing errors.
So going with postgrest's motto of doing one thing well, I suggest we somehow provide this
configuration
of the router as a program.
I think this is just going to complicate things a lot more. For everything that goes beyond the simple set of features that we want to support directly at the PostgREST level, nginx as a reverse-proxy is still the best solution. We don't need to reinvent the wheel here.
True, it's looking complex. And the DSL I put above only considers the URL part, ideally we'd also be able to extend PostgREST to be compatible with IDL-based standards such as graphql — for this we'd need to parse the request body + map it to PostgREST query params.
That is one of the reasons I mentioned the embedded scripting language idea in #1698 (comment).
The whole scripting language idea sounds promising - as a second step after the routing refactor. In that second step, we would basically rewrite the request parser in that scripting language, to make it possible to swap it out by a replacement. Does that make sense?
Hmm yes. The embedded scripting language part gets me more excited than this refactor 😛. And the idea of extending postgrest to be compatible with graphql is very exciting. I would love to see some specs about how to do these things. But for a starter I can refactor the router out as @wolfgangwalther suggested above.
Can @steve-chavez or @wolfgangwalther throw some more light on how it can be evolved to support graphql or start another issue/rfc for that?
throw some more light on how it can be evolved to support graphql or start another issue/rfc for that?
@gautam1168 I don't have anything specific but really that part should be left for another repo. My main motivation here is to provide a generic interface for extensibility.
Also, seems that dhall might have issues with graphql because it doesn't have recursiveness? https://discourse.dhall-lang.org/t/dhall-as-an-alternative-to-openapi-graphql/344/4. Hm, that might be a blocker for us to adopt Dhall.
hmm ok. Let me open a PR for this refactor first.
Regarding embedded scripting languages, there's also Embedded SQL - PostgreSQL has ECPG for example.
For Haskell, I found hesql(does a subset of Postgres' SQL), it doesn't seem maintained but the idea of embedded SQL in PostgREST seems really cool :open_mouth: :exploding_head: :boom:
Also, seems that dhall might have issues with graphql because it doesn't have recursiveness? https://discourse.dhall-lang.org/t/dhall-as-an-alternative-to-openapi-graphql/344/4. Hm, that might be a blocker for us to adopt Dhall.
I've seen that Dhall supports toml, that doesn't seem that different from supporting graphql.
So maybe not all things can be done in pure dhall, but dhall can be extended with Haskell. So we'd need to upgrade our Dhall dependency in some cases, but that's not too bad, we'd still support extensibility off the core repo.
For Haskell, I found hesql(does a subset of Postgres' SQL), it doesn't seem maintained but the idea of embedded SQL in PostgREST seems really cool
That might be shooting for the stars for now, Dhall is looking like the most feasible alternative.
One advantage of Lua as embedded scripting language is that many backend services already support it: Redis, HAProxy, Vector and of course Nginx(OpenResty) and Pandoc. I'm now thinking Lua is the right choice here.
Also, hslua seems well supported.
There was a request in gitter about making PostgREST read-only. Of course a read replica or having a read-only role would work but maybe we can allow to restrict our http methods so only GETs are possible and POSTs or other methods don't even touch the db.
Or for a more "db as source of truth" approach, we could reject http methods without hitting the db based on the role's privileges. This requires more smartness from the schema cache though.
Edit: we shouldn't do this because those are different things: "not allowed to" vs "not capable to" .
This discussion derailed a bit of its original purpose. There's a more plausible alternative now on https://github.com/PostgREST/postgrest/issues/1086#issuecomment-1513910590
Basically by defining the custom URL as a view or function name:
CREATE VIEW "./projects/:id" AS
SELECT *
FROM projects
WHERE id = current_setting('request.path.id`, false);
Let's follow up there.