
Move to using the ExBrotli NIF for Brotli decompression

Open s3cur3 opened this issue 1 year ago • 2 comments

Hi Wojtek!

This is mostly for discussion—the PR is not really ready to merge as-is for a number of reasons, but it's at least something concrete to look at.

The Erlang implementation of Brotli decompression errors on many real-world web pages I've tried to scrape with it, crashing the Req process. The backtrace looks like this:

** (MatchError) no match of right hand side value: {:error, :stream_failed}
    (req 0.4.5) lib/req/steps.ex:1360: Req.Steps.decompress_body/3
    (req 0.4.5) lib/req/steps.ex:1325: Req.Steps.decompress_body/1
    (req 0.4.5) lib/req/request.ex:1009: anonymous fn/2 in Req.Request.run_response/2
    (elixir 1.15.7) lib/enum.ex:4830: Enumerable.List.reduce/3
    (elixir 1.15.7) lib/enum.ex:2564: Enum.reduce_while/3
    (req 0.4.5) lib/req/request.ex:937: Req.Request.run/1
    (req 0.4.5) lib/req/steps.ex:1658: Req.Steps.redirect/1
    iex:7: (file)
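
For readers unfamiliar with this failure mode, here is a minimal sketch of how a strict `{:ok, _} = ...` match turns a decoder's error tuple into a crash, next to a tolerant variant that hands the error back to the caller. The module `DecompressSketch`, its functions, and the stub decoder are all hypothetical names made up for illustration. This is not Req's actual source, only the general pattern the `MatchError` above points to.

```elixir
defmodule DecompressSketch do
  # Hypothetical stand-in for a Brotli decoder. It always fails with the
  # same tuple the backtrace reports; it is not a real library call.
  def failing_decoder(_compressed), do: {:error, :stream_failed}

  # Strict match: any result other than {:ok, _} raises MatchError.
  def strict(body, decoder) do
    {:ok, decompressed} = decoder.(body)
    decompressed
  end

  # Tolerant variant: the error is returned to the caller instead of raising.
  def tolerant(body, decoder) do
    case decoder.(body) do
      {:ok, decompressed} -> {:ok, decompressed}
      {:error, reason} -> {:error, {:decompress_failed, reason}}
    end
  end
end

decoder = &DecompressSketch.failing_decoder/1

# Capture the MatchError so the script keeps running and we can inspect it.
strict_result =
  try do
    DecompressSketch.strict("not really brotli", decoder)
  rescue
    e in MatchError -> {:raised, e.term}
  end

tolerant_result = DecompressSketch.tolerant("not really brotli", decoder)

IO.inspect(strict_result, label: "strict")
IO.inspect(tolerant_result, label: "tolerant")
```

With the strict match, one undecodable page takes down the calling process. The tolerant variant leaves the decision (retry, skip, fall back to another decoder) to the caller.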

I was able to solve my problem by using this library (ExBrotli), which is just a NIF wrapper for Google's C implementation. On the sample corpus in the newly added decompression test, every page previously crashed, but they all work with the NIF. (Well... I should say with the right headers they can be made to work.)

Is this something you'd be interested in taking?

s3cur3, Dec 22 '23 12:12