
".tgz" files sometimes decompressed in error.

Open jerrm opened this issue 4 years ago • 1 comments

Requests.php will mistakenly decompress a .tgz file (or other gzipped files) downloaded from sites that pass a "content-encoding: none" header.

In the case of .tgz files, you end up with a misnamed, uncompressed tar file. That is often manageable, but not when the download has to be confirmed against an md5/sha1/etc checksum, because the decompressed bytes no longer match the published hash.
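A quick way to tell whether a download was silently decompressed is to look at the first two bytes (gzip data always starts with 0x1f 0x8b) and to compare a hash of the bytes against the published one. This is only a minimal sketch: looks_gzipped() and matches_checksum() are hypothetical helper names, not part of Requests.

```php
<?php
// Hypothetical helpers for illustration; neither is part of the Requests library.

// Returns true when $data starts with the gzip magic bytes 0x1f 0x8b.
function looks_gzipped(string $data): bool
{
    return strncmp($data, "\x1f\x8b", 2) === 0;
}

// Compares the hash of $data against an expected hex digest (sha1 by default).
function matches_checksum(string $data, string $expectedHex, string $algo = 'sha1'): bool
{
    return hash_equals(strtolower($expectedHex), hash($algo, $data));
}
```

If looks_gzipped() returns false for a file that is supposed to be a .tgz, the body was most likely decompressed somewhere along the way.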

The decompress code is wrapped by if (isset($return->headers['content-encoding'])), which only tests that the header is present, so it fires even though the encoding is "none". Should there be a check for a proper encoding type, or at minimum, a check for "none" before decompressing?
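To illustrate the kind of check being asked for, here is a rough sketch. It is not the library's actual code: is_compressed_encoding() is a hypothetical helper, and the list of accepted values is an assumption about which encodings should trigger decompression.

```php
<?php
// Hypothetical helper, not part of Requests. It takes the raw
// Content-Encoding header value (or null when the header is absent).
function is_compressed_encoding(?string $encoding): bool
{
    if ($encoding === null) {
        return false;
    }
    // Assumption: only these values should trigger decompression.
    $compressed = array('gzip', 'x-gzip', 'deflate');
    return in_array(strtolower(trim($encoding)), $compressed, true);
}
```

With a guard like this, "none" (or any other unrecognised value) would leave the body untouched instead of being passed to the decompressor.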

Ran into this with bitbucket. A sample file to demonstrate the issue: https://bitbucket.org/jerrm-bb/testdownload/downloads/outcnam-test.tgz

jerrm, Nov 19 '20 00:11

To clarify further, and to provide workarounds for anyone who lands here from Google:

The problem occurs when accessing $response->body.

The code below will erroneously decompress the .tgz file:

$response = Requests::get('https://bitbucket.org/jerrm-bb/testdownload/downloads/outcnam-test.tgz', array(), array());
file_put_contents('response-body-test', $response->body);

Easiest work-around I've found is to save the file using $options and avoid accessing $response->body:

Requests::get('https://bitbucket.org/jerrm-bb/testdownload/downloads/outcnam-test.tgz', array(), array('filename' => 'options-save-test'));

If the intact body is needed for further processing, extract it from $response->raw:

$response = Requests::get('https://bitbucket.org/jerrm-bb/testdownload/downloads/outcnam-test.tgz', array(), array());
$pos = strpos($response->raw, "\r\n\r\n");
$filedata = substr($response->raw, $pos + strlen("\r\n\r\n"));
file_put_contents('extracted-filedata-test', $filedata);
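The $response->raw extraction above does not check whether strpos() found the separator. A slightly more defensive version might look like the sketch below. extract_raw_body() is a hypothetical name, and it assumes $raw holds a single header block followed by the body.

```php
<?php
// Hypothetical helper, not part of Requests. Assumes $raw contains exactly
// one header block, then a blank line, then the untouched body.
function extract_raw_body(string $raw): ?string
{
    $separator = "\r\n\r\n";
    $pos = strpos($raw, $separator);
    if ($pos === false) {
        // No header/body separator found; let the caller decide what to do.
        return null;
    }
    // Everything after the first blank line is the body.
    return substr($raw, $pos + strlen($separator));
}
```

Usage would be along the lines of $filedata = extract_raw_body($response->raw); followed by a null check before writing the file.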

jerrm, Nov 22 '20 17:11