
Reusable named routes

Open francislavoie opened this issue 3 years ago • 6 comments

Sometimes, there seems to be a need for defining a route once (a list of handlers) and invoking it on multiple different conditions (matchers).

My idea to solve this would be to implement two new handlers:

  • First one, let's call it define for now, defines a route with a name (key). This should be placed as early as possible in the request handling pipeline so that it's always "defined" when it's needed. This would register itself in a map in the request's context for things later to reference it.
  • Second one, let's call it invoke, takes a name, does a lookup in the request context to find the defined route by name, then executes the routes as if they were defined in the current location. This invoke could be paired with a matcher, of course.

The big benefit here is config reuse. So if you have multiple sites which have the exact same pattern, you could define a route once and then execute it from multiple different places, potentially with different variables defined ahead of time. Also, these routes would only be provisioned once, which is an advantage over the Caddyfile snippets feature which is just fancy copy-paste with some string replacement. This could mean better performance and shared state across sites (e.g. reverse_proxy health state, etc).

This could be supported in the Caddyfile as well. The syntax for this is up in the air. You could consider this a "reference" in the programmatic sense, so my immediate thinking is to involve & in the syntax. So we could make this kinda an "extension" of Caddyfile snippets, and have the syntax like &(named-route) { ... }, and later called with invoke [<matcher>] <named-route>.
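To make the proposal concrete, here is a rough sketch of how the `&(name) { ... }` definition and `invoke` call described above might read in a Caddyfile. The route name `app-proxy`, the hostnames, the upstream address, and the health check path are all hypothetical, and the syntax itself is only the one floated in this comment, not a settled design.

```caddyfile
# Hypothetical named route, defined (and provisioned) once.
&(app-proxy) {
	reverse_proxy localhost:9000 {
		# Active health check state would be shared by every invoker.
		health_uri /healthz
		health_interval 10s
	}
}

a.example.com {
	# Run the named route for every request to this site.
	invoke app-proxy
}

b.example.com {
	# Pair the invocation with a matcher, as suggested above.
	@api path /api/*
	invoke @api app-proxy
}
```

Unlike a snippet, both sites would share a single `reverse_proxy` instance here, which is where the memory and health check savings are expected to come from.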

francislavoie avatar Aug 30 '22 23:08 francislavoie

What are some example use cases that need this functionality?

mholt avatar Sep 15 '22 20:09 mholt

This is all based on the assumption that I'm understanding this all correctly, but:

I use Caddy for my entire domain, and have a number of services, each being served on their own subdomain. My Caddyfile looks a little something like this:

www.mydomain.net {
    header / {
        -Server
        X-Powered-By "Lots of loud music"
    }
    reverse_proxy * localhost:7260
}

id.mydomain.net {
    header / {
        Strict-Transport-Security "max-age=2592000;" # 30 days
        X-Frame-Options "DENY"
        -Server
    }
    reverse_proxy localhost:7261
}

rss.mydomain.net {
    header -Server
    reverse_proxy localhost:7264
}

# contents trimmed

In each and every single one of these, I'm removing the Server header, and usually applying some other headers, like a comedy X-Powered-By or a Strict-Transport-Security. Having reusable, named, routes would allow these all to be configured in one place and allow me to stop repeating myself throughout the entire configuration for something that is and will always be the same across every route. I actually stumbled across this issue because I was Googling for a way to make a reusable "function" within my config.

codemicro avatar Sep 15 '22 20:09 codemicro

@codemicro Thanks for the feedback. Why doesn't import work for you for that use case? What would this feature do for you that snippets do not?
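For readers who have not used snippets, a minimal sketch of how `import` could cover this use case, reusing the hostnames and ports from the config above. The snippet name `common_headers` is hypothetical.

```caddyfile
# Snippet: defined once at the top level of the Caddyfile.
(common_headers) {
	header -Server
}

www.mydomain.net {
	# Expands to the snippet body when the Caddyfile is parsed.
	import common_headers
	header X-Powered-By "Lots of loud music"
	reverse_proxy localhost:7260
}

rss.mydomain.net {
	import common_headers
	reverse_proxy localhost:7264
}
```

Each `import` pastes a copy of the snippet into the site block, so this removes the repetition from the file but not from the resulting config.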

mholt avatar Sep 15 '22 21:09 mholt

Because I had no idea it existed. Pardon me for clogging up the issue, and thanks for the heads-up!

codemicro avatar Sep 15 '22 21:09 codemicro

Ah good :) Thanks for participating!

mholt avatar Sep 15 '22 21:09 mholt

> What are some example use cases that need this functionality?

We've heard of users dynamically updating their config via the API, adding new sites for their customers on demand. Having pre-defined routes would make this simpler to update, especially when each domain needs slightly different config (e.g. need to add a customer-specific request header to the routes before proxying, so it can't just be added to an existing host matcher).

It would also reduce memory usage significantly in those cases with a lot of routes, because only a single instance of the route would be provisioned, and only a single copy of state would be in memory. It would also mean health check state would be shared, so if you have the same upstreams for all the sites, you would avoid having active health checks for each instance (which is awful for performance if you have lots of individual proxy instances, a goroutine per proxy handler firing off health checks etc).

francislavoie avatar Sep 20 '22 03:09 francislavoie

A use-case I'm having right now is that I have to block certain requests based on HTTP headers (like "referer" or "from"), and I have like 24 config files for Caddy ( + the main Caddyfile), one of them is a "redirections" file, doing 56 redirections based on hostname, and I'd like to have all of them use the exact same blocking system.

I could use iptables or varnish or other stuff to do that, but having it all plug-and-play in Caddy is probably the best IMO.

Another issue that @francislavoie points out, and that's really important, is related to the workaround I found for my own problem: I created a snippet that sets up the same request matcher and responder, something like this:

# Caddyfile

# ...
(mysnippet) {
  @mymatcher {
    header_regexp SomeHeader "..."
  }
  respond @mymatcher "Nope nope" 444 {
    close
  }
}

Then I have to import mysnippet in all my config files.

By doing so, it "copy pastes" the snippet into every config, so you can imagine that for more than 80 sites, it becomes rather costly in terms of memory... So imagine a shared hosting provider that can have thousands of websites...

If the snippet could be changed to a single route (like my matcher + responder tuple), calling invoke ... instead of import ... all the time would just call the existing route in memory instead of copy/pasting it, saving memory size.
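A sketch of what that could look like, assuming the `&(name)` and `invoke` syntax floated earlier in this thread. The route name `blockspam` and the host `vhost1.example.com` are hypothetical, and the header pattern is left elided exactly as in the snippet above.

```caddyfile
# Defined once in the main Caddyfile; only one instance would live in memory.
&(blockspam) {
	@mymatcher {
		header_regexp SomeHeader "..."
	}
	respond @mymatcher "Nope nope" 444 {
		close
	}
}

# In each vhost file, reference the route instead of importing a copy.
vhost1.example.com {
	invoke blockspam
	root * /var/www/html
	file_server
}
```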

Pierstoval avatar Sep 26 '22 13:09 Pierstoval

@Pierstoval To clarify, it sounds like a single handler at the very top of your server routes in the JSON config suffices. That's way better than needing named routes.

At least, if we treat "named routes" as a native JSON config concept.

Perhaps there is some precedent for being able to define HTTP handlers in the Caddyfile that are placed at the top of their servers, like maybe a global sub-option of servers, that run before any host matching, or any of the Caddyfile routes are evaluated.

mholt avatar Sep 26 '22 15:09 mholt

> Perhaps there is some precedent for being able to define HTTP handlers in the Caddyfile that are placed at the top of their servers

But it's not necessarily always at the top of the servers that users might want reusable routes. I gave an example above with reverse_proxy and shared health checks. That's not possible to do if you "just put it at the top".

francislavoie avatar Sep 26 '22 18:09 francislavoie

My use-case is to prepend some action to all site handlers, but I may at some point need to do some other action for specific sites. In that case, the JSON approach will probably not be enough, because I will have to copy/paste everything in all handlers, or use the same workaround everywhere (a.k.a. a snippet in the root Caddyfile, and import mysnippet in all the other site config files). It's kinda bad memory-wise.

Another really good point about named routes is DX: having to switch my 20+ files to JSON is bad for users who don't use JSON for their config, and I would probably have to constantly go back and forth with the documentation to be sure I'm not writing the wrong JSON path or forgetting the colons, the brackets, etc. The Caddyfile syntax is way simpler to use and has a smoother learning curve than the whole JSON (which I thought was mostly dedicated to consuming the admin API). (Also, I'm managing my server in CLI only, and I use VIM, and writing JSON in VIM is true torture compared to writing a Caddyfile 😁)

I'm sorry if I'm bringing a strange use-case on the table, maybe I'm not using Caddy the right way, but I just have a config that looks like this on my server:

/etc/caddy/
├── Caddyfile
└── vhosts/
    ├── vhost1
    ├── vhost2
    ├── vhost3
    ├── vhost4
    ├── ...
    └── vhostN

And the contents of my main Caddyfile is really light:

$ cat Caddyfile
{
        #debug
        email [email protected]
}

main-server-domain.example.com {
        root * /var/www/html
        file_server
}

(myrule) {
	# my general rule that I'd like any vhost to be able to import
}

import vhosts/*

Currently, all my vhosts contain import myrule in their definition, especially the 50+ redirection rules that I configured, which is really bothersome to implement. I would be okay with this method if it wasn't necessary for all sites, and in such case, I understand that my snippet would be kinda fine. But as said above, there are other concerns, because here, if I want to use this "spam blocker", it means that I have to import this heavy file.

I just configured this "spam blocker", enabled the admin endpoint, and curl-ed the /config/ endpoint to download the full config. It contains the copy/pasted snippet 77 times; the output config file is 6.1MB and Caddy weighs 446.1MB in memory. EDIT: I just disabled my global rule, and the JSON file is now 33KB (meaning my block rule makes the JSON config 183 times heavier!), with Caddy at 401.3MB in memory. (The previous setup is only 1.11 times heavier in memory, but that's still 45MB.)

I'm pretty certain that having one single "global handler", consisting of a "define" route plus an "invoke" statement in all my vhosts, would be a very good start to improve memory usage 😉

Pierstoval avatar Sep 26 '22 19:09 Pierstoval

Interesting discussion so far.

@Pierstoval I imagine your files in the vhosts folder are all pretty similar?

I am having a hard time justifying the complexity of self-referencing JSON while I can see simpler, more elegant ways to write these configs with hand-crafted JSON that simply isn't repetitive.

But I hear you, you want to use the Caddyfile.

mholt avatar Sep 29 '22 20:09 mholt

> @Pierstoval I imagine your files in the vhosts folder are all pretty similar?

Nope 😁 Some are, like the PHP/Symfony apps all contain the same customized stuff.

Example with two of my Symfony apps:

/etc/caddy/vhosts$ cat corahnrin.pierstoval.com esterenmaps.pierstoval.com

corahnrin.pierstoval.com {
        root * /var/www/corahnrin.pierstoval.com/www/public/
        encode gzip
        file_server
        import blockspam
        php_fastcgi unix//run/php/php8.0-fpm.sock
        tls ...
        log {
                output file /var/www/corahnrin.pierstoval.com/access.log {
                        roll_size 512mb
                        roll_keep_for 720h
                }
        }
}


esterenmaps.pierstoval.com {
        root * /var/www/esterenmaps.pierstoval.com/www/public/
        encode gzip
        import blockspam
        file_server
        php_fastcgi unix//run/php/php8.0-fpm.sock
        tls ...
        log {
                output file /var/www/esterenmaps.pierstoval.com/access.log {
                        roll_size 512mb
                        roll_keep_for 720h
                }
        }
}

I could use a snippet and use the first arg as domain name, that's how I organize my apps, and it might maybe simplify stuff, but sometimes I have to customize the PHP version (some projects use 7.4 and some others use 8.1 for example), so the more specific use-cases I have, the more I need abstraction, the more complex it becomes in the end 😅 There's a balance for all of that and it's not easy to find it.
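A rough sketch of that snippet-with-arguments idea, taking the domain as the first argument and the PHP version as the second. The snippet name `symfony` is hypothetical, the `tls` line is left out because its arguments are elided above, and the `{args[0]}` placeholder form is an assumption about the Caddy version in use (older releases spelled it `{args.0}`).

```caddyfile
# Hypothetical snippet; blockspam must be defined before this point.
(symfony) {
	root * /var/www/{args[0]}/www/public/
	encode gzip
	import blockspam
	file_server
	php_fastcgi unix//run/php/php{args[1]}-fpm.sock
	log {
		output file /var/www/{args[0]}/access.log {
			roll_size 512mb
			roll_keep_for 720h
		}
	}
}

corahnrin.pierstoval.com {
	import symfony corahnrin.pierstoval.com 8.0
}

esterenmaps.pierstoval.com {
	import symfony esterenmaps.pierstoval.com 8.0
}
```

Every extra per-site variation becomes another argument, which is the abstraction cost described above.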

Pierstoval avatar Sep 30 '22 12:09 Pierstoval

FYI @Pierstoval I have an initial implementation in #5107, check it out!

francislavoie avatar Oct 02 '22 19:10 francislavoie

Does this in any way compare (or relate) to https://github.com/abiosoft/caddy-named-routes?

abiosoft avatar Oct 03 '22 10:10 abiosoft

Yeah @abiosoft it's quite similar.

Main difference I see is that my implementation only provisions the named routes once, whereas yours provisions them each time they're used in a config.

A goal here is to reduce memory usage by only having one instance, and having the benefit of shared state when that's potentially useful.

Also I implemented Caddyfile support in mine which is pretty important for most users.

I notice you have cycle checking in yours which I didn't do... Maybe I should do that 🤔

francislavoie avatar Oct 03 '22 13:10 francislavoie

It's awesome that it was merged!

I can't wait for next release to try this out 👌

Pierstoval avatar May 16 '23 18:05 Pierstoval