`Decompression error: ShortRead` when fetching

Open SukkaW opened this issue 2 years ago • 13 comments

What version of Bun is running?

1.0.21+ecdde8867

What platform is your computer?

Darwin 23.2.0 arm64 arm

What steps can reproduce the bug?

git clone https://github.com/sukka-reproductions/bun-fetch-gzip-shortread
cd bun-fetch-gzip-shortread
bun run index.ts

// index.ts
(async () => {
  console.log(Bun.version, Bun.revision);

  const res = await fetch('https://phishing.army/download/phishing_army_blocklist.txt', { verbose: true });
  console.log(res);

  if (!res.body) {
    throw new TypeError('No body!');
  }
  for await (const chunk of res.body) {
    console.log(chunk.length);
  }
})();

What is the expected behavior?

The reproduction above should not produce any error.

What do you see instead?

$ bun run index.ts
1.0.21 ecdde886707b20ad1db67f5b53cb163d2a6c66d2

Request: GET /download/phishing_army_blocklist.txt
	Connection: keep-alive
	User-Agent: Bun/1.0.21
	Accept: */*
	Host: phishing.army
	Accept-Encoding: gzip, deflate, br

Response: < 200 Connection established

Response: < 200 OK
< 	Date: Sat, 06 Jan 2024 15:32:26 GMT
< 	Content-Type: text/plain
< 	Transfer-Encoding: chunked
< 	Connection: keep-alive
< 	last-modified: Sat, 06 Jan 2024 11:46:21 GMT
< 	vary: Accept-Encoding,User-Agent
< 	x-turbo-charged-by: LiteSpeed
< 	Cache-Control: max-age=18000
< 	CF-Cache-Status: HIT
< 	Age: 1929
< 	Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=32DZPmzjFR6fjhd%2BIVISiwPY0GP46KcaltPSKSHNFJzAEpeqRmNCwRxNfNGWv6NnFgcgDtOXEOhVSkHrrFYbUk%2BgDWQPaVCgu8FfkN%2B6VnPXF4k3iHWhECUsOZ1mCxro"}],"group":"cf-nel","max_age":604800}
< 	NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< 	Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< 	X-Content-Type-Options: nosniff
< 	Server: cloudflare
< 	CF-RAY: 84150380d88980c0-NRT
< 	Content-Encoding: br
< 	alt-svc: h3=":443"; ma=86400

Response (6.0 KB) {
  ok: true,
  url: "https://phishing.army/download/phishing_army_blocklist.txt",
  status: 200,
  statusText: "OK",
  headers: Headers {
    "date": "Sat, 06 Jan 2024 15:32:26 GMT",
    "content-type": "text/plain",
    "transfer-encoding": "chunked",
    "connection": "keep-alive",
    "last-modified": "Sat, 06 Jan 2024 11:46:21 GMT",
    "vary": "Accept-Encoding,User-Agent",
    "cache-control": "max-age=18000",
    "age": "1929",
    "report-to": "{\"endpoints\":[{\"url\":\"https:\\/\\/a.nel.cloudflare.com\\/report\\/v3?s=32DZPmzjFR6fjhd%2BIVISiwPY0GP46KcaltPSKSHNFJzAEpeqRmNCwRxNfNGWv6NnFgcgDtOXEOhVSkHrrFYbUk%2BgDWQPaVCgu8FfkN%2B6VnPXF4k3iHWhECUsOZ1mCxro\"}],\"group\":\"cf-nel\",\"max_age\":604800}",
    "strict-transport-security": "max-age=31536000; includeSubDomains; preload",
    "x-content-type-options": "nosniff",
    "content-encoding": "br",
    "x-turbo-charged-by": "LiteSpeed",
    "cf-cache-status": "HIT",
    "nel": "{\"success_fraction\":0,\"report_to\":\"cf-nel\",\"max_age\":604800}",
    "server": "cloudflare",
    "cf-ray": "84150380d88980c0-NRT",
    "alt-svc": "h3=\":443\"; ma=86400",
  },
  redirected: false,
  bodyUsed: false
}
4602
4602
4602
4602
4602
4602
4602
2999
Decompression error: ShortRead
ShortRead: fetch() failed. For more information, pass `verbose: true` in the second argument to fetch()
 path: "https://phishing.army/download/phishing_army_blocklist.txt"
ShortRead: fetch() failed. For more information, pass `verbose: true` in the second argument to fetch()
 path: "https://phishing.army/download/phishing_army_blocklist.txt"

Additional information

No response

SukkaW, Jan 06 '24 15:01

I am getting the same error

knajjars, Jan 11 '24 08:01

Could you provide the URL and the code that reproduces the error? The more information you provide, the easier it will be for the bun team to identify the root cause of the issue.

SukkaW, Jan 11 '24 09:01

Got the same when updating to bun 1.0.22

What version of Bun is running? 1.0.22+b400b36ca

What platform is your computer? Darwin 23.1.0 arm64 arm (MacBook Pro M2)

I get this error

Decompression error: ShortRead
error: script "dev" was terminated by signal SIGSEGV (Address boundary error)

Is there a command I could run to get more information about the error?

LeTamanoir, Jan 15 '24 14:01

I am getting the exact same error. Did anyone find a workaround for this?

BaptistG, Jan 18 '24 10:01

Just encountered this today. The following code reproduces it.

export function assert(condition: boolean, message: string): asserts condition {
  if (!condition) {
    throw new Error(message);
  }
}

class TableHandler {
  data = [];

  element(element) {
    // Process each row in the table
    element.querySelectorAll("tr").forEach((row, index) => {
      // Skip the header row
      if (index === 0) return;

      const cells = row.querySelectorAll("td");
      if (cells.length >= 2) {
        const name = cells[0].textContent.trim();
        const address = cells[1].textContent.trim();
        this.data.push({ name, address });
      }
    });
  }
}

export async function fetchMultisigsFromGitbook() {
  const response = await fetch("https://info.send.it/finance/multisigs");
  const tableHandler = new TableHandler();

  const rewriter = new HTMLRewriter().on("table", tableHandler);
  await rewriter.transform(response).text();

  assert(tableHandler.data.length > 0, "No multisigs found");
  return tableHandler.data;
}

await fetchMultisigsFromGitbook();

0xBigBoss, Jan 20 '24 18:01

For us, it's the rewriter.transform(res) call that triggers the decompression error.

bjon, Jan 23 '24 22:01

Same error here, bun version 1.0.25 :confused:

Edit: downgrading to version 1.0.20 (1.0.20+09d51486e) fixed it

Feridinha, Jan 24 '24 06:01

The only way I could fix it:

await fetch(url, { headers: { 'Accept-Encoding': 'identity' } })

I wonder if this is related to Bun's Brotli compression issue? Seems unlikely, as this issue still occurred when setting Accept-Encoding to anything but identity.

Bun v1.0.23
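
For anyone who wants to apply that workaround without overwriting other request options, a small helper along these lines might do. This is only a sketch: `withIdentityEncoding`, `FetchOptions`, and `HeaderMap` are made-up names, not part of Bun, and it assumes headers are passed as a plain object.

```typescript
// Hypothetical helper, not a Bun API: forces "Accept-Encoding: identity",
// which asks the server for an uncompressed body so no decompression runs.
type HeaderMap = Record<string, string>;

interface FetchOptions {
  headers?: HeaderMap;
  [key: string]: unknown;
}

function withIdentityEncoding(options: FetchOptions = {}): FetchOptions {
  const source: HeaderMap = options.headers || {};
  const headers: HeaderMap = {};
  // Header names are case-insensitive, so drop any existing variant first.
  for (const name of Object.keys(source)) {
    if (name.toLowerCase() !== "accept-encoding") {
      headers[name] = source[name];
    }
  }
  headers["Accept-Encoding"] = "identity";
  // All other options (verbose, method, ...) are passed through unchanged.
  return { ...options, headers };
}

// Usage (assumes `url` is defined elsewhere):
// const res = await fetch(url, withIdentityEncoding({ verbose: true }));
```

The trade-off is that the response is no longer compressed, so more bytes go over the wire for large text files.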

tgrushka, Jan 25 '24 23:01

I suspect this issue might be related to Bun's transform stream implementation under the hood. I've encountered a variety of issues manipulating streams with bun besides this specific fetch issue.

SukkaW, Jan 26 '24 05:01

It is becoming more and more ridiculous:

bun 1.1.0+4f98336f8 now consistently crashes with Decompression error: ShortRead when fetching https://geofeed.ipxo.com/geofeed.txt.

SukkaW, Feb 01 '24 08:02

Seeing this as well when using react-pdf to generate a PDF file.

Rolling back to 1.0.20 fixes it for me.

EvHaus, Feb 02 '24 07:02

It looks like it works if I roll back to 1.0.18.

Leechael, Feb 03 '24 11:02

I've had this happen to me but in different circumstances. It's happening intermittently. My script just reads and writes some JSON files asynchronously. Unrelated to web-api or any network calls for that matter.

Decompression error: ShortRead
error: script "process" was terminated by signal SIGSEGV (Address boundary error)

niieani, Feb 10 '24 22:02

This continues to be an issue after https://github.com/oven-sh/bun/pull/8874.

This appears to be an issue in our brotli usage.

// index.ts
(async () => {
  console.log(Bun.version, Bun.revision);

-  const res = await fetch('https://phishing.army/download/phishing_army_blocklist.txt', { verbose: true });
+  const res = await fetch('https://phishing.army/download/phishing_army_blocklist.txt', { verbose: true, keepalive: false, headers: { "Accept-Encoding": "gzip, deflate, identity"} });

  console.log(res);

  if (!res.body) {
    throw new TypeError('No body!');
  }
  for await (const chunk of res.body) {
    console.log(chunk.length);
  }
})();

If Brotli ("br") is not included in the Accept-Encoding header, this no longer reproduces. My guess is that we are not correctly handling the case where Brotli returns ShortRead while there is still data remaining to be read in the stream (or potentially an interaction with chunked encoding).
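
To illustrate the suspected failure mode, here is a toy sketch. It is not Bun's code and not Brotli: it assumes a made-up format in which each frame is one length byte followed by that many payload bytes, and `decodeFrame` and `StreamingDecoder` are hypothetical names. The idea it shows is that a short read in the middle of a stream should mean "keep the leftover bytes and wait for the next chunk", and should only turn into an error if the stream ends while input is still pending.

```typescript
// Toy format (assumption for illustration): [length byte][payload bytes]...
type DecodeResult =
  | { kind: "frame"; payload: Uint8Array; consumed: number }
  | { kind: "short-read" };

// Tries to decode exactly one frame from the start of `input`.
function decodeFrame(input: Uint8Array): DecodeResult {
  if (input.length === 0) return { kind: "short-read" };
  const length = input[0];
  if (input.length < 1 + length) return { kind: "short-read" };
  return { kind: "frame", payload: input.subarray(1, 1 + length), consumed: 1 + length };
}

class StreamingDecoder {
  // Bytes received so far that do not yet form a complete frame.
  private pending: Uint8Array = new Uint8Array(0);

  // Feed one network chunk; returns every frame completed by it.
  push(chunk: Uint8Array): Uint8Array[] {
    const merged = new Uint8Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);

    const frames: Uint8Array[] = [];
    let offset = 0;
    while (true) {
      const result = decodeFrame(merged.subarray(offset));
      // A short read here is not an error: more chunks may still arrive.
      if (result.kind === "short-read") break;
      frames.push(result.payload.slice());
      offset += result.consumed;
    }
    // Keep the unconsumed tail for the next push() call.
    this.pending = merged.slice(offset);
    return frames;
  }

  // Call once the stream has ended; only now is a short read fatal.
  finish(): void {
    if (this.pending.length > 0) {
      throw new Error("ShortRead: stream ended in the middle of a frame");
    }
  }
}
```

With this shape, a frame split across two chunks decodes normally, while a body that is truly truncated still surfaces an error from `finish()`.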

Jarred-Sumner, Feb 13 '24 06:02

Your guess is correct. This might fix it:

https://github.com/argosphil/bun/commit/98491f1854a58df64b02fbb9126b21b9895de211

Then again, it might not. (Either way, it should only be applied after renaming total_in to something more sensible, given how it is used.) This experiment splits the input data into single-byte chunks to test the code:

https://github.com/oven-sh/bun/compare/main...argosphil:bun:issue-8017-experiment
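
The idea behind that kind of experiment can be sketched outside of Bun's codebase as follows. This is an illustration only, not the code in the branch above: `splitIntoSingleBytes`, `concat`, and `survivesSingleByteInput` are hypothetical names, and `decode` stands in for whatever streaming decoder is under test. The check is that a decoder should produce the same output whether it receives the original chunks or the same bytes one at a time.

```typescript
// Re-chunk the input so that every chunk holds exactly one byte.
function splitIntoSingleBytes(chunks: Uint8Array[]): Uint8Array[] {
  const out: Uint8Array[] = [];
  for (const chunk of chunks) {
    for (let i = 0; i < chunk.length; i++) {
      out.push(chunk.slice(i, i + 1));
    }
  }
  return out;
}

// Joins chunks back into one buffer; also usable as a trivial "decoder".
function concat(chunks: Uint8Array[]): Uint8Array {
  let total = 0;
  for (const chunk of chunks) total += chunk.length;
  const joined = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    joined.set(chunk, offset);
    offset += chunk.length;
  }
  return joined;
}

function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// True if `decode` gives identical output for the original chunking and
// for the worst-case chunking of one byte per chunk.
function survivesSingleByteInput(
  decode: (chunks: Uint8Array[]) => Uint8Array,
  chunks: Uint8Array[],
): boolean {
  return sameBytes(decode(chunks), decode(splitIntoSingleBytes(chunks)));
}
```

A decoder that mishandles input boundaries would be expected to fail this check even when it works on the chunk sizes a real server happens to send.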

argosphil, Feb 13 '24 13:02