postgrest Provide a way to customize the URL structure

Problem

We recommend customizing the URL structure through Nginx, however doing so is complicated and requires additional tools. This can be seen on ppKrauss/PostgREST-writeAPI, which uses mustache templating plus a nodejs script to generate the Nginx rewrite config.

Additionally, changing the URL in the proxy will not be reflected in our OpenAPI spec - which would need rewriting on the proxy as well, which is also more work.

Relying solely on the proxy for this hurts our extensibility, ideally PostgREST should be able to adapt to other standards, like more classical REST with nested routes or maybe more complex ones like Odata; and do so outside of PostgREST core.

Note that we do have a way forward to customize the response body(see this discussion regarding JSON-LD), but not the URL structure.

Proposal

Provide a way to customize the URL through the config file or through postgrest-contrib. It could be like so:

select pgrst.url_rewrites(
  -- more classic REST
  -- /persons/1
  'persons/${value}', 'people?id=eq.${value}'
  -- /persons/1/companies
, 'persons/${value}/companies', '/people?select=*,companies(*)&id=eq.${value}'
  -- bring an RPC to root endpoint
, 'articles/${year}', 'rpc/get_articles?year=${year}'
  -- special route
, 'christmas_2021', 'rpc/get_articles?year=2021&month=december'
);

Currently, the request is parsed in ApiRequest by pulling together lots of data from various sources (request itself, config, schema cache) - all this happens for each request, where the implicit routing table is basically built dynamically per request.
I would like to change that to create a Router type that holds all the information for available routes, that can be constructed from the config and schema cache - independent of the actual request.
When the request comes in, the request can then be matched against the Router to find the route it should be handled from.
This allows to customize the Router easily. After creation of the base router from config and schema cache, additional options can be parsed to customize all kinds of things, e.g. the endpoints, accepted media-types, etc... - the suggested pgrst.url_rewrites would hook in right there.

Does that make sense?

Aug 22 '21 08:08 wolfgangwalther

Hmm, let me try to echo back what you are saying. There would be a new data type called Router. A router and a request can be combined using applyRouting function to reform the paths in the incoming Request to create a new Request with a modified path and/or query. Maybe something like the following:

data Router = Router {
-- some kind of representation of appconfig and dbstructure  and routing rules
}

makeRouter :: AppConfig -> DbStructure -> Router
applyRouting :: Request -> Router -> Request
userApiRequest :: Router -> Request -> RequestBody -> Either ApiRequestError ApiRequest

The applyRouting function has to be able to "run" some rules that will be specified in some configuration file, like the example in the issue description which looks like taking values from the input request and templating them out into the output string.

Aug 22 '21 14:08 gautam1168

Hm. Using the names you suggested, I think userApiRequest would be replaced by makeRouter and applyRouting. Currently userApiRequest does both. So something like this maybe:

data Router = Router {
-- some kind of representation of appconfig and dbstructure  and routing rules
}

makeRouter :: AppConfig -> DbStructure -> Router
applyRouting :: Router -> Request -> RequestBody -> Either ApiRequestError ApiRequest

Sep 02 '21 07:09 wolfgangwalther

@wolfgangwalther yes this makes sense. However, if I try to write a function in haskell whose job is to run some rules provided in the configuration file as mentioned in the description of this issue, then I think there are two ways it can be bad:

The users specify some rule that is not satisfiable by the function and they do not know that they have made that mistake until runtime. Because there is no compiler checking this dsl we are inventing inside the configuration file to specify routing.
We end up writing a very extensive module to check that the router configuration is sound in every way before we start postgrest (which feels like a big project to me, depending on how complex we make this dsl). And then there could be feature requests on this module making it even more complex and hard to maintain.

So going with postgrest's motto of doing one thing well, I suggest we somehow provide this configuration of the router as a program.

So, something like, if you provide an executable in the configuration file, postgrest will start that executable on a separate process. And we provide a default executable (router) with postgrest which may implement what we have discussed above. For example:

# config file
router = "default" # it can be a path to some executable
router-configs = "./routerconfigs"

# routerconfigs
port = 8080
host = localhost
target-port = 3000 # postgrest is running here
target-host = localhost

Now when you start postgrest, and it finds a router configuration it will start that program on a separate process.

This way, we can still allow people to configure routes on postgrest out of the box. And we give them a path to either go with some kind of cloud native architecture like AWS Apigateway or their own kubernetes setup with nginx gradually. We also give them the ability to configure much more than just the routes. Because this program can just rewrite the entire request, swapping out the request method, body and headers in arbitrary ways that the users can imagine. (This is possible inside the postgrest function we discussed above as well, however, the configuration dsl will become evil at that point).

Sep 04 '21 11:09 gautam1168

Because there is no compiler checking this dsl we are inventing inside the configuration file to specify routing. We end up writing a very extensive module to check that the router configuration is sound in every way before we start postgrest (which feels like a big project to me, depending on how complex we make this dsl).

True, it's looking complex. And the DSL I put above only considers the URL part, ideally we'd also be able to extend PostgREST to be compatible with IDL-based standards such as graphql — for this we'd need to parse the request body + map it to PostgREST query params.

That is one of the reasons I mentioned the embedded scripting language idea in https://github.com/PostgREST/postgrest/issues/1698#issuecomment-907922983.

There might be another option. I've just read about List-based parser combinators, which is a way to do parser combinators dinamically. With this we could(in theory) allow to modify PostgREST parser combinators through the configuration file(or with a nicer interface through postgrest-contrib).

We'd need a library for the List-based parser combinators, because there's only an example for Haskell. Then we could use that lib for parsing the body + the URL part(maybe).

Sep 04 '21 22:09 steve-chavez

The users specify some rule that is not satisfiable by the function and they do not know that they have made that mistake until runtime. Because there is no compiler checking this dsl we are inventing inside the configuration file to specify routing.

Proper strong haskell typing should avoid that.

We end up writing a very extensive module to check that the router configuration is sound in every way before we start postgrest (which feels like a big project to me, depending on how complex we make this dsl). And then there could be feature requests on this module making it even more complex and hard to maintain.

I don't think this is going to be a very complex thing. I'm pretty sure it should be straight-forward to add a parser that reads in a JSON config for the router, maps that to a haskell data type and throws decent parsing errors.

So going with postgrest's motto of doing one thing well, I suggest we somehow provide this configuration of the router as a program.

I think this is just going to complicate things a lot more. For everything that goes beyond the simple set of features that we want to support directly at the PostgREST level, nginx as a reverse-proxy is still the best solution. We don't need to reinvent the wheel here.

True, it's looking complex. And the DSL I put above only considers the URL part, ideally we'd also be able to extend PostgREST to be compatible with IDL-based standards such as graphql — for this we'd need to parse the request body + map it to PostgREST query params.

That is one of the reasons I mentioned the embedded scripting language idea in #1698 (comment).

The whole scripting language idea sounds promising - as a second step after the routing refactor. In that second step, we would basically rewrite the request parser in that scripting language, to make it possible to swap it out by a replacement. Does that make sense?

Sep 06 '21 12:09 wolfgangwalther

Hmm yes. The embedded scripting language part gets me more excited than this refactor 😛. And the idea of extending postgrest to be compatible with graphql is very exciting. I would love to see some specs about how to do these things. But for a starter I can refactor the router out as @wolfgangwalther suggested above.

Can @steve-chavez or @wolfgangwalther throw some more light on how it can be evolved to support graphql or start another issue/rfc for that?

Sep 06 '21 15:09 gautam1168

throw some more light on how it can be evolved to support graphql or start another issue/rfc for that?

@gautam1168 I don't have anything specific but really that part should be left for another repo. My main motivation here is to provide a generic interface for extensibility.

Also, seems that dhall might have issues with graphql because it doesn't have recursiveness? https://discourse.dhall-lang.org/t/dhall-as-an-alternative-to-openapi-graphql/344/4. Hm, that might be a blocker for us to adopt Dhall.

Sep 06 '21 16:09 steve-chavez

hmm ok. Let me open a PR for this refactor first.

Sep 07 '21 06:09 gautam1168

Regarding embedded scripting languages, there's also Embedded SQL - PostgreSQL has ECPG for example.

For Haskell, I found hesql(does a subset of Postgres' SQL), it doesn't seem maintained but the idea of embedded SQL in PostgREST seems really cool :open_mouth: :exploding_head: :boom:

Sep 07 '21 16:09 steve-chavez

Also, seems that dhall might have issues with graphql because it doesn't have recursiveness? https://discourse.dhall-lang.org/t/dhall-as-an-alternative-to-openapi-graphql/344/4. Hm, that might be a blocker for us to adopt Dhall.

I've seen that Dhall supports toml, that doesn't seem that different from supporting graphql.

So maybe not all things can be done in pure dhall, but dhall can be extended with Haskell. So we'd need to upgrade our Dhall dependency in some cases, but that's not too bad, we'd still support extensibility off the core repo.

For Haskell, I found hesql(does a subset of Postgres' SQL), it doesn't seem maintained but the idea of embedded SQL in PostgREST seems really cool

That might be shooting for the stars for now, Dhall is looking like the most feasible alternative.

Sep 20 '21 23:09 steve-chavez

One advantage of Lua as embedded scripting language is that many backend services already support it: Redis, HAProxy, Vector and of course Nginx(OpenResty) and Pandoc. I'm now thinking Lua is the right choice here.

Also, hslua seems well supported.

Oct 26 '21 22:10 steve-chavez

There was a request in gitter about making PostgREST read-only. Of course a read replica or having a read-only role would work but maybe we can allow to restrict our http methods so only GETs are possible and POSTs or other methods don't even touch the db.

Or for a more "db as source of truth" approach, we could reject http methods without hitting the db based on the role's privileges. This requires more smartness from the schema cache though.

Edit: we shouldn't do this because those are different things: "not allowed to" vs "not capable to" .

Mar 02 '22 16:03 steve-chavez

This discussion derailed a bit of its original purpose. There's a more plausible alternative now on https://github.com/PostgREST/postgrest/issues/1086#issuecomment-1513910590

Basically by defining the custom URL as a view or function name:

CREATE VIEW "./projects/:id" AS
SELECT * 
FROM projects
WHERE id = current_setting('request.path.id`, false);

Let's follow up there.

Oct 03 '23 18:10 steve-chavez

postgrest
postgrest copied to clipboard

Provide a way to customize the URL structure

Problem

Proposal

Related

postgrest postgrest copied to clipboard

Provide a way to customize the URL structure

Problem

Proposal

Related

postgrest
postgrest copied to clipboard