hono icon indicating copy to clipboard operation
hono copied to clipboard

fix(websocket): prevent sending entire buffer when streaming Uint8Array chunks

Open Ariel-Moroshko opened this issue 1 year ago • 5 comments

Problem

When sending a Uint8Array through the WebSocket's send() method, the current implementation converts it to its underlying ArrayBuffer using source.buffer. This becomes problematic when the Uint8Array is a view over a larger buffer, which is common in streaming scenarios.

For example, when streaming data with Bun, it internally uses a buffer pool with 16KB buffers for efficiency. So if you have a Uint8Array chunk of 800 bytes, it's actually a view over a 16KB buffer. The current implementation sends the entire 16KB buffer instead of just the 800 bytes of actual data:

// Current implementation
send(source: string | ArrayBuffer | Uint8Array) {
    this.#init.send(
      source instanceof Uint8Array 
        ? source.buffer  // Sends entire 16KB buffer! 
        : source
    )
}

This causes:

  1. Unnecessary bandwidth usage - sending 16KB when only 800 bytes are needed
  2. Potential data corruption on the client side - the unused portion of the buffer might contain old data
  3. Performance issues when streaming large amounts of data

Solution

We should send the Uint8Array directly without converting it to its buffer. This preserves the view information and sends only the actual data:

send(source: string | ArrayBuffer | Uint8Array, options?: SendOptions): void {
    this.#init.send(source, options ?? {})
}

Example

// Streaming scenario with Bun
const reader = response.body?.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  
  console.log(value.byteLength);      // e.g., 800 bytes
  console.log(value.buffer.byteLength); // 16384 bytes (Bun's pooled buffer)
  
  // Before: sends all 16384 bytes
  // After: sends only the actual 800 bytes
  ws.send(value);
}

The WebSocket API can handle Uint8Array directly, so there's no need for the conversion to ArrayBuffer. This change ensures we only send the actual data bytes while maintaining all existing functionality.

Ariel-Moroshko avatar Oct 27 '24 13:10 Ariel-Moroshko

Hi @Ariel-Moroshko Thank you for the PR!

@nakasyou Can you review this?

yusukebe avatar Oct 28 '24 00:10 yusukebe

@Ariel-Moroshko

To fix the CI error, please format the code with the command bun run format:fix && bun run lint:fix.

yusukebe avatar Oct 28 '24 00:10 yusukebe

Hi @yusukebe, I've made the requested changes to fix the CI formatting issue.

Ariel-Moroshko avatar Oct 28 '24 06:10 Ariel-Moroshko

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 94.71%. Comparing base (3a59d86) to head (b4ba8c5). Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3570      +/-   ##
==========================================
- Coverage   94.71%   94.71%   -0.01%     
==========================================
  Files         157      157              
  Lines        9539     9536       -3     
  Branches     2774     2810      +36     
==========================================
- Hits         9035     9032       -3     
  Misses        504      504              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Oct 28 '24 06:10 codecov[bot]

I understood. TypedArray does not necessarily return an ArrayBuffer with exactly the same meaning. it may be part of. That is a bug.

nakasyou avatar Oct 28 '24 12:10 nakasyou

@Ariel-Moroshko @nakasyou

Thanks! Merge it now.

yusukebe avatar Oct 29 '24 04:10 yusukebe