caddy icon indicating copy to clipboard operation
caddy copied to clipboard

Support "precompressed" files without "base" files

Open TobiX opened this issue 3 years ago • 21 comments
trafficstars

As discussed in https://caddy.community/t/precompressed-files-without-base-files/17330, I was thinking about an extension to the precompressed feature: It would be nice to only have compressed files on disk and not have to keep the uncompressed files around...

Pro: Most clients support compression nowadays, so nothing changes for most clients Contra: If some client not supporting compression comes around, Caddy need to decompress the file before delivering it to the client

@francislavoie suggested in the community thread that supporting this feature might impact performance due to the additional stat calls, so we probably need to keep that aspect in mind, too...

PS: While thinking about it a bit more, I could probably implement it as a custom file system which "synthesizes" the uncompressed files :laughing:

TobiX avatar Oct 03 '22 22:10 TobiX

I'm thinking it could be a new boolean option called precompressed_allow_no_base for a lack of a better name.

francislavoie avatar Oct 04 '22 00:10 francislavoie

How about prefer_precompressed?

mholt avatar Oct 04 '22 17:10 mholt

I'm not really sure what all is needed for this and might need help. But I wouldn't mind taking a stab at this.

shade34321 avatar Oct 10 '22 12:10 shade34321

@shade34321 take a look at the fileserver code and at the docs. Look for precompressed in the codebase for the key places to look at.

francislavoie avatar Oct 10 '22 14:10 francislavoie

I've taken a look and I think this is going to take me a bit longer to understand before making a fix. So if someone else wants to work on it they can but I'll keep chugging along until I see a fix. Just wanted to update everybody.

shade34321 avatar Oct 15 '22 12:10 shade34321

Would like to take this up !

titanventura avatar Nov 29 '22 04:11 titanventura

Sounds good! Let us know if you have any questions.

mholt avatar Nov 29 '22 16:11 mholt

I have a question. What approach does caddy currently take, for clients that do not support compression ?

titanventura avatar Dec 26 '22 07:12 titanventura

would be better if i can connect on call with a person who can explain about this. I'm not sure if this is common in open-source communities. But i feel it is essential.

titanventura avatar Dec 26 '22 07:12 titanventura

We follow the Accept-Encoding header. If a client doesn't declare it can handle compressed content, then Caddy will not compress it.

For file_server's precompressed files, that's why it's useful to have an uncompressed copy on file (most of the time).

This feature request would require decompressing the content before sending it if the client doesn't support compression, if the "no base" option is used.

francislavoie avatar Dec 26 '22 14:12 francislavoie

I have a question. What approach does caddy currently take, for clients that do not support compression ?

I suggest you just read the code and maybe run through it with a debugger, starting here: https://github.com/caddyserver/caddy/blob/b166b9008344786e7cfe5715b7e8a4cd3ec1bd76/modules/caddyhttp/fileserver/staticfiles.go#L233

More specifically look at this part: https://github.com/caddyserver/caddy/blob/b166b9008344786e7cfe5715b7e8a4cd3ec1bd76/modules/caddyhttp/fileserver/staticfiles.go#L357-L393

The call to encode.AcceptedEncodings normally returns an ordered list of accepted encodings in the request (Accept-Encoding) and then checks if they are present in the file system. Since you ask for the case of no compression, this loop will be skipped entirely. That means the variable file will still be nil, which makes you enter this case and open the actual file to serve: https://github.com/caddyserver/caddy/blob/b166b9008344786e7cfe5715b7e8a4cd3ec1bd76/modules/caddyhttp/fileserver/staticfiles.go#L395-L411

ueffel avatar Dec 26 '22 15:12 ueffel

Sure. Thank you all for adding more information ! Will look at it.

titanventura avatar Dec 27 '22 07:12 titanventura

But i don't understand why we need a new boolean option called prefer_precompressed for this. If caddy finds an sidecar/precompressed file that does not have it's corresponding uncompressed file, we can decompress it on the fly (for clients that do not support compression) without the need for a boolean option right ?

titanventura avatar Dec 27 '22 07:12 titanventura

@titanventura did you read the discussion in the community forum thread? The reasoning is covered there. And reading the code, it should be clear why we do it that way currently.

francislavoie avatar Dec 27 '22 11:12 francislavoie

I feel this feature can be implemented without prefer_precompressed option. Because, we can implement it in a way so that, if the uncompressed file is NOT present and the client does NOT accept the encoding, we can decide with decompressing the file on the fly since we have no other option to serve the client right ? Please correct me if i'm wrong, im a newbie to the open source realm. I would love to get on call to discuss things in more detail (if that works with you).

titanventura avatar Dec 27 '22 11:12 titanventura

As I wrote before, currently the code is set up such that if the original file does not exist (checked first), a 404 is immediately returned and no additional system calls are made. This is optimal to "fail fast" and assumes that no compressed version of the file will exist.

If we change the code (without a new option) then additional system calls would be made on all requests to missing files which can be seen as a performance regression, especially if someone is making tons of requests to brute force trying to find some files (but admittedly in that case you should probably have some rate limiting in place to stop that from happening).

My argument is that most users of precompressed will probably have both versions on disk, as is commonly done when using JS build tooling to precompress assets. So I think that performance regression to allow for not having the uncompressed copy should be opt-in.

francislavoie avatar Dec 27 '22 17:12 francislavoie

Ahh I get it now. This has made things much more clear. Thank you so much @francislavoie . Will try and work on it now that i've understood the issue !

titanventura avatar Dec 27 '22 17:12 titanventura

Hi @francislavoie @mholt I would love to spend some time solving this. Sharing my understanding below of the above discussion before I start to work on it,

Issue: If there's only a compressed copy of file on server and and the client doesn't accept the compressed file, it returns 404.

Solution:

  1. We can allow decompressing of the compressed files on the fly if the precompressed file doesn't exist and the client doesn't accept the compressed files.

    Issue: It can lead to performance regression issues.

  2. We can make it as an opt-in feature so that the server administrator can specify that he wants to decompress and serve files maybe for specific routes (if the precompressed file doesn't exist) and keep returning 404 if not opted for assuming that the prescompressed file must exist.

Please correct me if I have some flaws in understanding.

singhalkarun avatar Aug 14 '23 07:08 singhalkarun

Sounds correct @singhalkarun.

francislavoie avatar Aug 14 '23 08:08 francislavoie

Sounds correct @singhalkarun.

Thanks for confirming @francislavoie . I am starting up on this. I will keep you posted.

singhalkarun avatar Aug 14 '23 11:08 singhalkarun

I'd love to get this one done. I already started working on it.

3v0k4 avatar Apr 24 '24 21:04 3v0k4