Reduce memory usage of rate limiting
RateLimitStorage grows whenever a new IP address makes a request, so it's extremely important to make it use as little memory as possible.
The biggest problem is that old buckets are not deleted, so the memory consumption just keeps on growing. This could cause the server to run out of memory. To fix this, I added a weekly task that causes an IP address's buckets to be deleted after 1 to 2 weeks of inactivity (or more if any rate limit interval is longer than 1 week).
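A minimal sketch of what such a cleanup pass might look like, assuming a simplified bucket that only records when it was last used. The names `Bucket` and `remove_stale` and the plain seconds-based timestamps are illustrative assumptions, not the code in this pull request.

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Hypothetical, simplified bucket: only the time of the last request, in seconds.
#[derive(Debug, Clone, Copy)]
pub struct Bucket {
    pub last_checked_secs: u64,
}

/// Removes every address whose bucket has been idle for longer than
/// `max_idle_secs`, then releases the spare capacity of the map.
pub fn remove_stale(
    buckets: &mut HashMap<Ipv6Addr, Bucket>,
    now_secs: u64,
    max_idle_secs: u64,
) {
    buckets.retain(|_, b| now_secs.saturating_sub(b.last_checked_secs) <= max_idle_secs);
    buckets.shrink_to_fit();
}
```

If a pass like this runs once a week with a one-week threshold, an idle address is removed somewhere between 1 and 2 weeks after its last request, depending on where in the cycle that request landed.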
And here's each step in my process of optimizing the buckets field of RateLimitStorage:
HashMap<RateLimitType, HashMap<IpAddr, RateLimitBucket>> (original)
IpAddr is a wrapper for String. RateLimitBucket values for all RateLimitTypes are initialized when an IP address is added, so there's always one IpAddr stored for each RateLimitType. On a 64-bit machine, they use a total of at least 186 bytes per IP address: each of the six Strings stores 24 bytes inline and has at least as many characters as "0.0.0.0", so 6 × (24 + 7) = 186.
HashMap<IpAddr, HashMap<RateLimitType, RateLimitBucket>>
- Now, for each IP address, only one IpAddr is stored instead of six.
HashMap<IpAddr, EnumMap<RateLimitType, RateLimitBucket>>
EnumMap only stores an inline fixed-size array: [RateLimitBucket; 6]. It doesn't store things like length, pointers, or even keys. This is the most compact it can be.
HashMap<Ipv6Addr, EnumMap<RateLimitType, RateLimitBucket>>
Ipv6Addr is only 16 bytes. This uses less memory than an empty String.
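The final layout can be approximated with the standard library alone. This is a rough sketch, not the actual types: a plain array indexed by the enum discriminant stands in for EnumMap, and the variant names, the two-field `Bucket`, and `bucket_mut` are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Stand-in for RateLimitType; the six variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LimitKind {
    Message,
    Register,
    Post,
    Image,
    Comment,
    Search,
}

pub const KIND_COUNT: usize = 6;

/// Lower bound on key storage in the original layout on a 64-bit target:
/// six Strings, each 24 bytes inline plus at least "0.0.0.0" on the heap.
pub const OLD_MIN_KEY_BYTES: usize = KIND_COUNT * (24 + "0.0.0.0".len());

/// Hypothetical 8-byte bucket.
#[derive(Debug, Clone, Copy, Default)]
pub struct Bucket {
    pub last_checked: u32,
    pub tokens: f32,
}

/// One 16-byte key per address, with all six buckets stored inline in the value.
pub type Buckets = HashMap<Ipv6Addr, [Bucket; KIND_COUNT]>;

/// Returns the bucket for `ip` and `kind`, creating all six buckets on first use.
pub fn bucket_mut(map: &mut Buckets, ip: Ipv6Addr, kind: LimitKind) -> &mut Bucket {
    &mut map.entry(ip).or_insert([Bucket::default(); KIND_COUNT])[kind as usize]
}
```

With these assumed types, `std::mem::size_of::<Ipv6Addr>()` is 16 and the six inline buckets take 48 bytes, with no per-bucket key, length, or pointer.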
I also made RateLimitBucket 3 times smaller, which saves 96 bytes per IP address.
I also made get_ip work correctly with IPv6 addresses. The previous implementation would only return the first segment of an IPv6 address (before the first colon).
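To illustrate the kind of bug described here, the sketch below contrasts cutting the string at the first colon with parsing through the standard library. Both function names are hypothetical and the real get_ip may be structured differently.

```rust
use std::net::{IpAddr, SocketAddr};

/// Naive approach: strip the port by cutting at the first colon.
/// Fine for "1.2.3.4:80", but an IPv6 address is reduced to its first segment.
pub fn naive_ip(raw: &str) -> &str {
    raw.split(':').next().unwrap_or(raw)
}

/// Parses either "ip:port" / "[ipv6]:port" or a bare IP address.
pub fn parse_peer_ip(raw: &str) -> Option<IpAddr> {
    let raw = raw.trim();
    if let Ok(sock) = raw.parse::<SocketAddr>() {
        return Some(sock.ip());
    }
    raw.parse::<IpAddr>().ok()
}
```

For example, `naive_ip("2001:db8::1")` yields `"2001"`, while `parse_peer_ip` returns the full address for both `"2001:db8::1"` and `"[2001:db8::1]:443"`.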
Now ready to merge
This is only kinda related but I wanted to mention that rate-limiting by full ipv6 addresses is pretty useless (except for the most honest of users) because you get assigned a full /64 (2**64 addresses) or even a full /56 (2**(128-56) addresses) and can switch between them without any effort (with privacy extensions it even switches automatically every hour or so).
So for ipv6 rate limiting to be mostly equivalent to the ipv4-rate limiting you need to store the /64 subnet and not the full ip, and for it to be really effective you need to use a cascading rate limit. e.g. allowing each /64 1x some limit, each /56 5x some limit, and each /48 10x some limit.
Oh noooo. I should probably try to prevent that IPv6 rate limiting problem in this pull request. If it gets merged without a fix, then it could be disastrous because it removes the accidental /16 subnet rate limiting that was caused by the bug in get_ip.
It now limits /64 with 1x capacity, /56 with 4x capacity, and /48 with 16x capacity.
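A rough sketch of the cascading idea with the 1x / 4x / 16x multipliers mentioned above. `truncate`, `LEVELS`, and `Cascade` are hypothetical names, and the limiter is a plain counter with no refill over time, so this only shows how the three prefix levels interact rather than how the pull request implements them.

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Zeroes every bit after the first `prefix_len` bits of the address.
pub fn truncate(addr: Ipv6Addr, prefix_len: u32) -> Ipv6Addr {
    let mask = if prefix_len == 0 {
        0
    } else {
        u128::MAX << (128 - prefix_len.min(128))
    };
    Ipv6Addr::from(u128::from(addr) & mask)
}

/// (prefix length, capacity multiplier) for each level of the cascade.
pub const LEVELS: [(u8, u32); 3] = [(64, 1), (56, 4), (48, 16)];

/// Counter-based limiter without refill, for illustration only.
pub struct Cascade {
    base: u32,
    used: HashMap<(u8, Ipv6Addr), u32>,
}

impl Cascade {
    pub fn new(base: u32) -> Self {
        Cascade { base, used: HashMap::new() }
    }

    /// Allows the request only if the /64, /56, and /48 all have room left.
    pub fn check(&mut self, addr: Ipv6Addr) -> bool {
        let base = self.base;
        let keys = LEVELS.map(|(len, mult)| ((len, truncate(addr, len as u32)), base * mult));
        if keys
            .iter()
            .any(|(key, cap)| self.used.get(key).copied().unwrap_or(0) >= *cap)
        {
            return false;
        }
        for (key, _) in keys {
            *self.used.entry(key).or_insert(0) += 1;
        }
        true
    }
}
```

With a base capacity of 1, a second address in the same /64 is rejected, while addresses from other /64s inside the same /56 are accepted until the /56 has used its four slots.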
Needs a cargo +nightly fmt
I fixed the formatting and the CI still shows the same formatting error
Good job, thank you!