Auto-inflate raises Zlib::BufError on URLs that can otherwise be decoded
Example request:
require 'http'
url = 'https://m.huffingtonpost.es/entry/el-supremo-anula-la-sentencia-contra-otegi-y-los-demas-acusados-en-el-caso-bateragune_es_5f24056ac5b6a34284b99a0a?25f'
HTTP.use(:auto_inflate)
    .follow
    .headers('Accept-Encoding' => 'gzip')
    .get(url)
This immediately raises:
Traceback (most recent call last):
...
7: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/chainable.rb:20:in `get'
6: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/client.rb:34:in `request'
5: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/redirector.rb:59:in `perform'
4: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response.rb:94:in `flush'
3: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:51:in `to_s'
2: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/inflater.rb:19:in `readpartial'
1: from vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/inflater.rb:19:in `finish'
Zlib::BufError (buffer error)
If you fetch the URL without use(:auto_inflate), read the body into a string, and feed it to Zlib in full, it decodes without errors:
require 'zlib'
res = HTTP.follow.headers('Accept-Encoding' => 'gzip').get(url)
# Adding 32 to the window bits lets zlib auto-detect a gzip or zlib header.
zlib = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
body = zlib.inflate(res.to_s)
zlib.finish
zlib.close
It must be related to how the chunks are read, but I don't know enough about Zlib to understand why.
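One way to test the chunking hypothesis in isolation is to feed a gzip payload to Zlib in small pieces and compare the result with the original. The sketch below uses only Ruby's standard library; inflate_in_chunks and its chunk_size parameter are hypothetical names for illustration, not part of http.rb, and the payload is generated locally rather than fetched from the URL above.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of http.rb): inflate a gzip payload by
# feeding it to Zlib in fixed-size chunks, as a streaming reader would.
def inflate_in_chunks(compressed, chunk_size = 16)
  zlib = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  io = StringIO.new(compressed)
  output = String.new # binary-encoded, mutable buffer

  # StringIO#read(n) returns nil at EOF, which ends the loop.
  while (chunk = io.read(chunk_size))
    output << zlib.inflate(chunk)
  end

  # finish raises Zlib::BufError if the stream never reached its end marker.
  output << zlib.finish
  output
ensure
  zlib&.close
end

payload = 'hello world ' * 100
compressed = Zlib.gzip(payload)
puts inflate_in_chunks(compressed) == payload
```

If a complete gzip stream round-trips like this regardless of chunk size, chunk boundaries alone are unlikely to be the cause, and the content of the body being inflated is the next thing to check.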
The error is raised in the Redirector (lib/http/redirector.rb:59) when it tries to flush the body of the first response. Requesting the URL without .follow shows that response directly:
res = HTTP.headers('Accept-Encoding' => 'gzip').get(url)
zlib = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
zlib.inflate(res.to_s)
zlib.finish # => Zlib::BufError: buffer error
res.to_s # => " "
res.headers['Content-Encoding'] # => "gzip"
IMO the returned response is faulty. It shouldn't contain a Content-Encoding header if the body is not compressed.
On the other hand, it should be easy to "fix" this on the http-rb side by skipping decompression when the body is flushed.
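A caller can also guard against this kind of response on their own side. The following is a minimal sketch, not the http-rb fix itself: it assumes the two gzip magic bytes are a good enough signal that a body is really compressed, and inflate_if_gzipped is a hypothetical helper name.

```ruby
require 'zlib'

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = "\x1f\x8b".b.freeze

# Hypothetical helper, not part of http.rb: inflate the body only when it
# actually looks like gzip, whatever the Content-Encoding header claims.
def inflate_if_gzipped(body)
  raw = body.to_s.b
  # Bodies that do not start with the gzip magic are returned untouched.
  return raw unless raw.start_with?(GZIP_MAGIC)

  zlib = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  begin
    # A truncated or corrupt gzip stream still raises a Zlib error here.
    zlib.inflate(raw) + zlib.finish
  ensure
    zlib.close
  end
end
```

With a response fetched without use(:auto_inflate), this would be called as inflate_if_gzipped(res.to_s). The check is deliberately narrow: it skips bodies that are not gzip at all, and it does not hide errors from streams that start as gzip and then break.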