hono icon indicating copy to clipboard operation
hono copied to clipboard

[PoC] feat: introduce Partial Content Middleware

Open yusukebe opened this issue 1 year ago • 13 comments

This PR will introduce a new middleware, "Partial Content Middleware".

It's based on #3461 by @exoego . That PR means the Serve Static Middleware supports 206 Partial Content. But this PR can support all contents not only by Serve Static Middleware:

import { partialContent } from 'hono/partial-content'

app.use(partialContent())

app.get('/hello.jpg', (c) => {
  return c.body(body, {
    headers: {
      'Content-Length': body.length.toString(),
      'Content-Type': 'image/jpeg'
    },
  })
})

Then, the /hello.jpg can support 206 Partial Content. It will be enabled when the Response has a Content-Legnth header. The partial contents will be returned based on those values. So, if you want to use it with Serve Satatic, it should return the correct content length with Content-Length --- but the current Serve Static Middleware does not emit content length. So, to do it, we have to modify it to get the file size using a method such as Bun.file() and set it as content length.

This PR will be co-authored by @exoego

We have to discuss this feature.

yusukebe avatar Oct 14 '24 10:10 yusukebe

Codecov Report

Attention: Patch coverage is 97.02970% with 3 lines in your changes missing coverage. Please review.

Project coverage is 94.32%. Comparing base (9986b47) to head (79ebfef). Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
src/middleware/partial-content/index.ts 97.02% 3 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3516      +/-   ##
==========================================
- Coverage   95.58%   94.32%   -1.26%     
==========================================
  Files         155      158       +3     
  Lines        9357     9589     +232     
  Branches     2749     2786      +37     
==========================================
+ Hits         8944     9045     +101     
- Misses        413      544     +131     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Oct 14 '24 10:10 codecov[bot]

Hey @exoego @usualoma @nakasyou and others

What do you think about this?

yusukebe avatar Oct 15 '24 10:10 yusukebe

I like this design since it is much simpler and supports various adapters and middlewares.

One small concern is performance. This design assumes an underlying stream supports range access via Stream.getReader/subarray. But if the underlying Stream reads bytes from the beginning eagerly, range access is not efficient. So it is better to document such note for users.

On the other hand, #3461 requires middleware to provide range access.

exoego avatar Oct 15 '24 11:10 exoego

Most of JS runtimes's file system API provides seek API. e.g.:

using file = await Deno.open('./image.jpg', { read: true })

await file.seek(6) // Move the file pointer 6 bytes forward
// ...

If the middleware or serveStatic can access those APIs, performance problem might be resolved. In web standards, there is FileSystemWritableFileStream in File System Access API. I think there is also a solution to use this.

nakasyou avatar Oct 15 '24 12:10 nakasyou

Hi @yusukebe!

What are the real-world problems we need to solve?

If the case of returning “206 Partial Content” is almost always serve-static, and if it is assumed to be used in a production environment, it seems better to be implemented in the serve-static feature rather than being made independent as middleware. There are no users who want to use serve-static but don't want to support Range headers, so I think it would be more convenient if Range headers were supported automatically when using serve-static. And as mentioned in the review comments of the other two people, I think it would be better to use a lower-level API to improve performance.

Is there a use case other than serve-static? If there are use cases, I think it would be best to make them into separate middleware like this pull request and then optimize serve-static by adding a separate abstraction layer.

usualoma avatar Oct 16 '24 06:10 usualoma

For Hono users on AWS or GCP (not Cloudflare), serving content from object storage (like AWS S3 or GCP Cloud Storage) will be an use-case of Partial Content middleware, without Server Static middleware.

exoego avatar Oct 16 '24 06:10 exoego

Hi @exoego, Thanks for the reply.

For Hono users on AWS or GCP (not Cloudflare), serving content from object storage (like AWS S3 or GCP Cloud Storage) will be an use-case of Partial Content middleware, without Server Static middleware.

I simply don't know much about this, and I'd like to ask for some guidance, but if the app is acting as a proxy for AWS S3 or GCP Cloud Storage, I think it would be better to pass the Range header straight through to the storage and return the response as is, but is that not possible?

If you're caching the data in your app, that's one thing, but if you're not caching it, I think it would be a bit of a pain to “fetch all the data from AWS S3 or GCP Cloud, put it in memory, and then return a portion of it by specifying the Renge header”.

usualoma avatar Oct 16 '24 06:10 usualoma

That's an interesting topic. Quick response!

Also, if you use Cloudflare R2, which is object storage, you have to create Workers with Hono to distribute assets (though there is another way to distribute it without Workers):

import { Hono } from 'hono';

type Env = {
  Bindings: {
    BUCKET: R2Bucket
  }
}

const app = new Hono<Env>()

app.get('/', async (c) => {
  const file = await c.env.BUCKET.get('file-name')
  return new Response(file?.body)
})

yusukebe avatar Oct 16 '24 06:10 yusukebe

It looks like AWS S3 supports partial content natively. One needs to specifie range options. So Partial Content middlware appears not needed for S3. https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html#download-objects-parts https://stackoverflow.com/questions/36436057/s3-how-to-do-a-partial-read-seek-without-downloading-the-complete-file

So please forget AWS S3 (I don't know GCP though)

exoego avatar Oct 16 '24 07:10 exoego

app.get('/', async (c) => {
  const file = await c.env.BUCKET.get('file-name')
  return new Response(file?.body)
})

In this example of Cloudflare R2, the underlying file is fully-read, which is not ideal for Partial Content. Big use-case of partial content is video seeking, so full-read should be avoided.

exoego avatar Oct 16 '24 07:10 exoego

I've investigated. For Cloudflare R2, c.env.BUCKET.get can range access to the object. So, we may not need this middleware for it!

yusukebe avatar Oct 16 '24 07:10 yusukebe

Indeed, there may not be a use case.

BUT, I am personally concerned about Serve Static getting fat. That needs to be re-designed and discussed.

yusukebe avatar Oct 16 '24 07:10 yusukebe

@exoego @nakasyou @usualoma

Anyway, thank you for your comments!

I'll make this PR as a draft and prepare to discuss about Serve Static and others.

Now, we are implementing runtime-specific matters like file system access and WebSockets in adapters in hono package. This approach is good but it is getting fat to support many runtimes.

I think it's time to re-think the scope of the hono package and re-designed how to treat non-standard things.

Related issues and PRs:

https://github.com/honojs/hono/issues/3483 https://github.com/honojs/hono/pull/3461 https://github.com/honojs/hono/pull/3307

yusukebe avatar Oct 16 '24 07:10 yusukebe