edx-platform
edx-platform copied to clipboard
fix: replaced md4 with blake2b
Description
This PR replaces md4 with blake2b in memcache. At MIT, we recently encountered an issue after upgrading our base image from a debian-buster image to a debian-bookworm image. This uncovered an incompatibility between the newer version of openssl and one of the hash digests that edx-platform is using (1.1.1d-0+deb10u7). Basically, openssl + hashlib are saying that md4 is supported and available but if you actually try to invoke openssl md4 you’ll get an error saying it is not available. This PR resolves this issue by updating the hash function to something a bit newer and more widely supported.
Useful information to include:
- Which edX user roles will this change impact? Learner, Course Author
Supporting information
https://discuss.openedx.org/t/openssl-upgrade-md4-hashing/12654
Testing instructions
- In the lms container run
python manage.py lms shell
- In the python shell run:
>>> import hashlib
>>> blake2b = hashlib.new("blake2b", digest_size=16)
>>> string = "abc"
>>> blake2b.update(string.encode('utf-8'))
>>> blake2b.hexdigest()
- You should get the hash without any errors in the above statements.
Deadline
"None"
Other information
Include anything else that will help reviewers and consumers understand the change.
- Relevant openssl issue: https://github.com/openssl/openssl/issues/21247
Thanks for the pull request, @Anas12091101! Please note that it may take us up to several weeks or months to complete a review and merge your PR.
Feel free to add as much of the following information to the ticket as you can:
- supporting documentation
- Open edX discussion forum threads
- timeline information ("this must be merged by XX date", and why that is)
- partner information ("this is a course on edx.org")
- any other information that can help Product understand the context for the PR
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.
Please let us know once your PR is ready for our review and all tests are green.
@Anas12091101 I know this caused problems with launching the gradebook MFE. Do you know what else in edx-platform might use this caching scheme?
My only concern is that this will have the effective of clearing the entire cache, potentially causing a disruption. (On edx.org we only clear memcache as a last resort, as it logs out all sessions and causes a performance hit.)
So, a few thoughts:
- Is there some option for a rotation mechanism? Fetching two keys (one blake2b, one md4) would be a much higher traffic burden, but maybe there's something clever we can do.
- Would it be possible to just use a different implementation of MD4? MD4 is a weak hash function for cryptographic purposes, but should be fine for sharding.
@Anas12091101 I know this caused problems with launching the gradebook MFE. Do you know what else in edx-platform might use this caching scheme?
@pdpinch I think by default in LMS and CMS memcache is being used as the caching scheme in edx-platform. So anywhere in edx-platform where we are using cache.get, we are calling the safe_key
fn in the memcache.py
One option would be to put this in a try/except
block catching the ValueError
that returns when md4
is not a supported algorithm and falling back to Blake2B. That doesn't solve for the case when deploying a new set of instances with the newer OpenSSL that no longer supports that method.
On the note of using a different implementation of md4
, that will give a stable target (for now) but introduces the potential for stagnation, as that algorithm is (rightfully) not supported any more. That leaves us in the situation of either accepting a one-time hit of dumping cached items or developing a migration path.
All of that being said, the hashed key is only used if the combined values of key, key_prefix, and version exceeds 250 characters. Given that the cache is designed to be a performance enhancement, not a functional requirement and that only a subset of cached values will be impacted I think that this should be a safe change to merge without developing a complex stack of migration logic.
Good to know this only affects larger keys. I might put in a quick PR to add telemetry for checking what key sizes we're seeing on edX.
I think it might make sense to put this behind a feature switch and make a DEPR out of removing the md4 option entirely. Switching to blake2b as a default seems like a very reasonable thing to do, but having a well-defined timeframe would be helpful here.
From personal experience, while caches are nominally a performance enhancement, a full cache can very quickly become a critical requirement for system stability. I'd like to understand which situation we're in here before saying that I'm comfortable with a hard cutover.
@timmc-edx what do you think?
Looks good! The feature flag would allow us to schedule a time to do a cutover in a controlled way (and would give us time to research which keys would be affected).
@Anas12091101 would you mind squashing your commits?
@timmc-edx we took the route of adding a feature flag to ease the changeover. Is this OK to merge from your perspective?
After this merges, we'll open a DEPR for removing the md4 code and the feature flag, but it wouldn't be removed until after the Redwood or Sumac releases, depending on timing.
Yes, I like this approach, and have no reservations about it merging.
@Anas12091101 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.
2U Release Notice: This PR has been deployed to the edX production environment.