
Strange behaviour with large uploads behind nginx.

Open Anagastes opened this issue 2 months ago • 7 comments

Hi,

I recently noticed some strange behaviour with rustypaste when dealing with large amounts of data.

The failure happens right at the end of the upload, when the progress should be reaching 100%.

Kooha/Kooha-2025-10-21-20-23-36.mkv => Upload error: `<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html> (status code: 502)`

Here's what's in the error log.

2025/10/23 15:13:16 [error] 422289#422289: *350865 upstream prematurely closed connection while reading response header from upstream, client: XXXXX, server: YYYYYY, request: "POST / HTTP/1.1", upstream: "http://127.0.0.1:8000/", host: "YYYYYY"

I'm currently not using a CDN, so it's a direct connection.

Here's the nginx config.


upstream rpaste {
    server 127.0.0.1:8000;
    keepalive 250;
}

server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name YYYYYY;
}


location / {
    proxy_pass                         http://rpaste/;
    proxy_set_header Host              $host;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    proxy_headers_hash_bucket_size 128;

    proxy_http_version 1.1;
    proxy_max_temp_file_size 4096M;

    client_body_temp_path /tmp;
    client_max_body_size 3500M;
    client_body_buffer_size 20M;
    client_body_in_file_only clean;

    keepalive_timeout 165;

    proxy_connect_timeout   5500s;
    proxy_send_timeout      5500s;
    proxy_read_timeout      5500s;
    send_timeout            5500s;

    proxy_redirect off;

    proxy_buffer_size          128k;
    proxy_buffers              4 256k;
    proxy_busy_buffers_size    256k;
}

I've tried several times now and always get the same error.

I have already configured nginx specifically so that it stores uploads in tmp files rather than in RAM. Insufficient RAM was the initial error. However, this only helped with smaller files.

It looks as if rustypaste is somehow timing out.

Here is my config.toml

[config]
refresh_rate = "1s"

[server]
address = "127.0.0.1:8000"
workers=6
max_content_length = "3500MB"
upload_path = "./upload"
timeout = "5500s"
expose_version = false
expose_list = true
auth_tokens = [
  "super_secret_token",
]
delete_tokens = [
  "super_secret_token"
]
handle_spaces = "replace" # or "encode"
max_upload_dir_size = "3500M"

[landing_page]
text = """
┬─┐┬ ┬┌─┐┌┬┐┬ ┬┌─┐┌─┐┌─┐┌┬┐┌─┐
├┬┘│ │└─┐ │ └┬┘├─┘├─┤└─┐ │ ├┤
┴└─└─┘└─┘ ┴  ┴ ┴  ┴ ┴└─┘ ┴ └─┘

Private instance only (auth):

The server administrator might remove any pastes that they do not personally
want to host.
"""
#file = "index.txt"
content_type = "text/plain; charset=utf-8"

[paste]
random_url = { type = "petname", words = 2, separator = "-" }
#random_url = { type = "alphanumeric", length = 8 }
#random_url = { type = "alphanumeric", length = 6, suffix_mode = true }
default_extension = "txt"
mime_override = [
  { mime = "image/jpeg", regex = "^.*\\.jpg$" },
  { mime = "image/png", regex = "^.*\\.png$" },
  { mime = "image/svg+xml", regex = "^.*\\.svg$" },
  { mime = "video/webm", regex = "^.*\\.webm$" },
  { mime = "video/x-matroska", regex = "^.*\\.mkv$" },
  { mime = "application/octet-stream", regex = "^.*\\.bin$" },
  { mime = "text/plain", regex = "^.*\\.(log|txt|diff|sh|rs|toml)$" },
]
mime_blacklist = [
  "application/x-dosexec",
  "application/java-archive",
  "application/java-vm",
  "application/vnd.apple.installer+xml",
  "application/x-msdownload",
]
duplicate_files = true
default_expiry = "1h"
delete_expired_files = { enabled = true, interval = "1h" }

Any ideas?

Anagastes • Oct 23 '25 13:10

rustypaste uses chunked uploads so it's rather unlikely that there is a timeout. Is there an error message in rustypaste's log file?

I've never used rustypaste for uploading huge files and my testing options are limited since my upstream is very slow. I am also not an nginx expert as I have been using Apache httpd myself for the last 30 years.

tessus • Oct 26 '25 17:10

Okay... I'm actually fascinated...

I've now activated systemd debugging specifically for this... and had it write a log.

In short, it's not rustypaste's fault...

The long version: rustypaste is being killed by the kernel's OOM killer... o.o

Okt 28 03:18:17 SERVER kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/rustypaste.service,task=rustypaste,pid=424615,uid=33
Okt 28 03:18:17 SERVER kernel: Out of memory: Killed process 424615 (rustypaste) total-vm:6197660kB, anon-rss:1323436kB, file-rss:0kB, shmem-rss:0kB, UID:33 pgtables:5844kB oom_score_adj:0
Okt 28 03:18:17 SERVER systemd[1]: rustypaste.service: Main process exited, code=killed, status=9/KILL

Sorry for not looking closely.

That's basically why I switched to tmp files with nginx: not enough RAM. But that only moved the problem.

nginx writes a tmp file (saving RAM), but as soon as this file is complete, it is passed on to rustypaste in its entirety, and THEN the RAM runs out again...

And now the big question: can rustypaste process the file partially? :D

Because then it's not a bug, just simply too little RAM for the large file... :S

Anagastes • Oct 28 '25 02:10
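
To make the question above concrete, "processing the file partially" generally means copying the request body to disk through one small, reused buffer instead of collecting it first. The following is a minimal std-only sketch of that technique. It is not taken from rustypaste's source; `copy_in_chunks` and `CHUNK_SIZE` are hypothetical names, and a real actix handler would read from an async stream rather than a blocking `Read`.

```rust
use std::io::{self, Read, Write};

/// Hypothetical buffer size; peak memory for the copy is bounded by this value.
const CHUNK_SIZE: usize = 64 * 1024;

/// Copies everything from `reader` to `writer` through one reused buffer
/// and returns the number of bytes copied.
fn copy_in_chunks<R: Read, W: Write>(mut reader: R, mut writer: W) -> io::Result<u64> {
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // A read interrupted by a signal can simply be retried.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Only the bytes actually read in this round are written out.
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }
    writer.flush()?;
    Ok(total)
}
```

Called as `copy_in_chunks(source, std::fs::File::create(path)?)`, the function needs roughly `CHUNK_SIZE` bytes of buffer no matter how large the source is. A handler that instead appends every chunk to a growing `Vec<u8>` before writing would need about as much RAM as the file is large, which would match the OOM pattern described here.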

This is quite interesting. I am wondering where and how rustypaste would use all that memory.

The middleware processes files as streams (chunked upload), so chunks are written to disk. There is no putting chunks together in memory, unless I misunderstood how streaming works in actix. I don't have the entire code in my head, but the only part that could use a lot of memory is when a hash is generated to check whether it is a duplicate file. But you are using duplicate_files = true, so this operation does not happen.

tessus • Oct 28 '25 21:10
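
On the hashing point above: a duplicate check does not in itself require the whole file in memory, because a hash can be updated chunk by chunk. The sketch below illustrates this with FNV-1a, used purely as a dependency-free stand-in for whatever hash rustypaste actually computes. `fnv1a_update` and `hash_stream` are hypothetical names, and FNV-1a is not a cryptographic hash.

```rust
use std::io::{self, Read};

/// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Folds more bytes into a running FNV-1a state.
fn fnv1a_update(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, &b| (h ^ u64::from(b)).wrapping_mul(FNV_PRIME))
}

/// Hashes everything `reader` yields while holding at most `chunk_size`
/// bytes of the input in memory at any time.
fn hash_stream<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<u64> {
    let mut buf = vec![0u8; chunk_size.max(1)];
    let mut state = FNV_OFFSET;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(state), // end of input
            Ok(n) => state = fnv1a_update(state, &buf[..n]),
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

The result does not depend on how the input is split into chunks, so hashing a file through a small buffer gives the same value as hashing it in one piece. The same incremental pattern applies to cryptographic hashes that expose an update-style interface.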

I've now combed through the nginx documentation a bit.

I can bypass nginx's ‘buffering’ and pass the upload directly to rustypaste.

BUT, then I have the problem again that everything ends up directly in RAM. Not from nginx during upload, but from rustypaste. And then the error is identical. OOM Kill.

So yes, rustypaste puts everything in RAM. Regardless of whether via method A (via tmp file) or method B (bypass). 👀

Anagastes • Oct 30 '25 20:10

@orhun why would rustypaste use as much RAM as the size of the uploaded file? isn't file streaming (chunked uploads) a way around this? or is there a bug in actix?

tessus • Oct 30 '25 21:10

No idea, I need to investigate it.

We should definitely support chunked uploads, if that's not what we're doing already.

orhun • Oct 31 '25 04:10

btw, I opened a rustypaste v2 discussion as you suggested a while back.

tessus • Oct 31 '25 04:10