Django with django-seo-js returning "invalid code lengths set" error to browser and curl
Hey,
All of a sudden, Google, Facebook, and other crawlers started reporting my website as unavailable. We're using django-seo-js==0.2.4 and the paid version of Prerender.io. Testing with _escaped_fragment_ in the browser and with cURL showed that the responses are coming back invalid for some reason:

curl: (61) Error while processing content unencoding: invalid code lengths set
I made no changes to anything affecting Prerender.io or to the configuration of this library; this just started happening out of the blue.
Stepping through the code more closely, Prerender.io appears to cache the content correctly, and calling self.backend.get_response_for_url(url) also returns a response with the correctly rendered HTML content. That covers both fetching the response from Prerender and transforming the requests response into a Django HttpResponse object.
When that gets returned, though, for some reason both the browser and curl think it's invalid.
I've done plenty of debugging but I'm a bit at a loss here; all I can think of is that base.py:56 is too naive with r['content-length'] = len(response.content), or it's some type of gzip issue, where somehow headers or encodings are getting passed on that shouldn't be.
Ultimately, though, my site is currently not crawlable, and that's obviously a major issue for us.
Some more research suggests this might be because django-seo-js depends on requests 2.2.1, an older version of requests that may be incompatible with the current requests 2.9.1.
It turned out that the real issue was django_seo_js passing on the Content-Encoding header from Prerender.io. By that point requests has already decompressed response.content, so the client is told the body is compressed when it is actually plain HTML, and decoding fails.
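To illustrate the mismatch, here's a minimal stdlib-only sketch. It is not django-seo-js code: `client_decode` is a hypothetical helper made up for the demo, and it only assumes the upstream response was gzip-compressed. The exact error text differs from curl's, but the failure is the same kind: the client trusts the header and tries to decompress bytes that are already plain.

```python
import gzip
import zlib

html = b"<html><body>rendered page</body></html>"

# What the upstream service sends over the wire: a gzip-compressed body.
wire_body = gzip.compress(html)

# What requests exposes as response.content: the already-decompressed bytes.
content = gzip.decompress(wire_body)


def client_decode(body, content_encoding):
    """Hypothetical stand-in for a client that trusts Content-Encoding."""
    if content_encoding == "gzip":
        # 16 + MAX_WBITS tells zlib to expect a gzip wrapper.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS)
    return body


try:
    # Plain body forwarded together with the stale "gzip" header.
    client_decode(content, "gzip")
    mismatch_detected = False
except zlib.error:
    mismatch_detected = True

print("decode failed:", mismatch_detected)
```

With the header dropped (`client_decode(content, None)`), the same bytes come back untouched, which is what the fix relies on.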
Subclassing with the below code fixed it:
from django.http import HttpResponse

from django_seo_js.backends import PrerenderIO
from django_seo_js.backends.base import RequestsBasedBackend, IGNORED_HEADERS


class FixedRequestsBasedBackend(RequestsBasedBackend):
    def build_django_response_from_requests_response(self, response):
        # requests has already decompressed response.content at this point.
        r = HttpResponse(response.content)
        for k, v in response.headers.items():
            # Key difference -- we're excluding "content-encoding" from the response
            if k.lower() not in IGNORED_HEADERS and k.lower() != "content-encoding":
                r[k] = v
        r['content-length'] = len(response.content)
        r.status_code = response.status_code
        return r


class FixedPrerenderIO(FixedRequestsBasedBackend, PrerenderIO):
    pass
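
For anyone wiring this in: as far as I know, django-seo-js picks its backend from the SEO_JS_BACKEND setting, so pointing it at the subclass should be enough. The module path below is an assumption (a hypothetical myproject/seo_backends.py holding the classes above); adjust it to wherever you put them.

```python
# settings.py
# Assumption: the fixed classes live in a hypothetical module
# myproject/seo_backends.py; change the dotted path to match your project.
SEO_JS_BACKEND = "myproject.seo_backends.FixedPrerenderIO"
```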