core icon indicating copy to clipboard operation
core copied to clipboard

feat(blob): multipart upload

Open Teages opened this issue 1 year ago β€’ 14 comments

Resolves #66

This pull request added multipart upload api and its proxy route.

Also fixed a type error on proxyHubBlob, it breaks the type check

did some check in playground, ~~not check the proxy api yet~~.

Checked in remote proxy, works well except the contentType not included in the complete response (but it returns in dev server), maybe it takes time for r2 to determine the type

Teages avatar Apr 12 '24 13:04 Teages

Although that upload works fine, I have this wired error in terminal

Screenshot 2024-04-12 at 17 05 48

farnabaz avatar Apr 12 '24 15:04 farnabaz

Although that upload works fine, I have this wired error in terminal

Screenshot 2024-04-12 at 17 05 48

it is from the wroked, as their member said:

This message is not really an error, at least not one that's relevant to most people. This is another example where our logging is highlighting bugs meant for the workers runtime team to look into. In this case V8's API for mapping JITed memory in order to construct stack traces isn't working the way we expect. This has been firing occasionally for a long time. There's actually no impact for workerd users since this information is only used when generating crash dumps in our internal codebase.

so we can safity ignore this

Teages avatar Apr 12 '24 15:04 Teages

I was wondering how we can change API routes to remove query params.

Just an idea:

  • /blob/multipart/[...pathname].post.ts create a multip part upload
  • /blob/multipart/[...pathname]/[uploadId]/[partNumber].put.ts put a part
  • /blob/multipart/[...pathname]/[uploadId]/complete.post.ts complete upload task

sadily radix3 doesn't support this route

import { createRouter } from "radix3"

const router = createRouter()

router.insert('/api/blob/multipart/**:pathname/', { payload: 'not this' })
router.insert('/api/blob/multipart/**:pathname/:uploadId/:partNumber', { payload: 'found' })

const res = router.lookup('/api/blob/multipart/hello.txt/id/1')

console.log(res)

returns

{
  payload: "not this",
  params: {
    pathname: "hello.txt/id/1",
  },
}

so I just make create / complete in different route

Teages avatar Apr 12 '24 16:04 Teages

Thank you so much @Teages

I think to move forward, I would like to have a multipart: true option for hubBlob().put() as automatic multipart:

await hubBlob().put('movie.mp4', file, {
  multipart: true
})

As well as updating the documentation for it.

As a bonus, it would be nice to find a way to have a Vue composable as well:

useUpload(`/api/blob/multipart/${file.name}`).upload(file, { multipart: true })

atinux avatar Apr 17 '24 14:04 atinux

I think to move forward, I would like to have a multipart: true option for hubBlob().put() as automatic multipart:

await hubBlob().put('movie.mp4', file, {
  multipart: true
})

I think it maybe not not be very useful, hubBlob runs on nitro, which is running in cloudflare worker, it should have the most stable connection to cloudflare r2.

The limitation of direct upload seems to be 300mb, but woker only have 128mb memory. I can't see any need for workers to upload to r2 in multipart.

It should be the same for other platforms.

~~maybe use multi-threaded uploading can be faster? I don’t know~~

As a bonus, it would be nice to find a way to have a Vue composable as well:

useUpload(`/api/blob/multipart/${file.name}`).upload(file, { multipart: true })

Yes I think so, but it affects flexibility, I didn't see nuxt hub have provided any upload api except hub proxy.

Or we can provide a composable like this:

const { progress } = useMultipartUpload({
  file,
  partSize: 10 * 1024 * 1024, // 10MB
  throttle: 2, // 2 concurrent uploads
}, ({partNumber, body}) => $fetch())

Teages avatar Apr 17 '24 15:04 Teages

I think it maybe not not be very useful, hubBlob runs on nitro, which is running in cloudflare worker, it should have the most stable connection to cloudflare r2.

Makes sense, let's skip it.

What about:

const { progress, isComplete } = useMultipartUpload(file, {
  partSize: 10 * 1024 * 1024, // 10MB
  throttle: 2, // 2 concurrent uploads
}, ({ partNumber, chunkBody }) => $fetch())

It would be nice to have the second parameter as optional:

const { progress, isComplete } = useMultipartUpload(file, ({ partNumber, chunkBody }) => $fetch())

So it could use defaults we can define, wdyt?

atinux avatar Apr 19 '24 12:04 atinux

well I designed a composable creater:

export const useMultipartUpload = createMultipartUploader({
  create: (file) => $fetch(),
  upload: ({ partNumber, chunkBody }, { pathname, uploadId }) => $fetch(),
  complete: (parts, { pathname, uploadId }) => $fetch(),
  abort: ({ pathname, uploadId }) => $fetch(),
  partSize: 10 * 1024 * 1024, // 10MB
  concurrent: 2, // 2 concurrent uploads
  maxRetry: 3,
})

then upload file in just two lines:

const { completed, progress, cancel } = useMultipartUpload(file)
const result = await completed

Teages avatar Apr 21 '24 07:04 Teages

Thank you so much for your work, we are getting close!

I've been thinking, what about using the action as first part?

/api/blob/multipart/[action]/[...pathname].ts

action can be: create, upload, complete and abort

We could have an abstraction similar to this:

// server/api/blob/multipart/[action]/[...pathname].ts
export default eventHandler(async (event) => {
  await hubBlob().handleMultipartUpload(event, {
    // options...
    addRandomSuffix: true
  })
})

This will avoid creating 4 files to handle multipart upload, and it should also simplify the createMultipartUploader function, I would rename it to useMultipartUpload instead:

const upload = useMulipartUpload('/api/blob/multipart', {
  partSize: 10 * 1024 * 1024, // 10MB
  concurrent: 2, // 2 concurrent uploads
  maxRetry: 3,
})
const { progress, completed, abort } = upload(file)

Also, would you mind updating the progress to be between 0 and 100 similar to https://nuxt.com/docs/api/composables/use-loading-indicator#progress ?

atinux avatar Apr 25 '24 13:04 atinux

That would be great, what options do we need for handleMultipartUpload? I would like to provide some hooks like beforeCreate, afterCompleted and etc, but we can let developers do themself like

// server/api/blob/multipart/[action]/[...pathname].ts
export default eventHandler(async (event) => {
  const actions = getRouterParam(event, 'action')
  // do something before handle
  return await hubBlob().handleMultipartUpload(event)
})

and most upload need authentication, I want to allow overwrite ofetch options of useMulipartUpload,

interface UseMulipartUploadOptions {
  //... some else
  fetchOptions: Omit<FetchOptions, 'body' | 'parseResponse' | 'responseType'>
}

Teages avatar Apr 25 '24 14:04 Teages

I would like to provide some hooks like beforeCreate, afterCompleted and etc,

The goal of exposing a function is that they can do whatever they want before and after the function instead of exposing an API route :)

I like the fetchOptions property for useMultipartUpload πŸš€

atinux avatar Apr 25 '24 15:04 atinux

You are very fast @Teages πŸš€

Happy to resolve the conflicts? πŸ™

atinux avatar Apr 26 '24 15:04 atinux

oops seems I messed up, I need some time to solve

Teages avatar Apr 26 '24 16:04 Teages

ok it looks good now

Teages avatar Apr 26 '24 16:04 Teages

I noticed #99, everything looks fine except the way to resolve returning is different. In this pr I use respondWith to force the response to be the handled response, but in #99 it just return the response and lets the user return on their own.

Do I need to change to the same approach?

Teages avatar May 17 '24 14:05 Teages

cc @farnabaz

atinux avatar May 24 '24 13:05 atinux