aws-sdk-kotlin Provide conversions between ByteStream type and ktor.io ByteReadChannel

Please add a builder and a conversion extension between the ktor.io interface io.ktor.utils.io.ByteReadChannel and aws.smithy.kotlin.runtime.content.ByteStream.

Builder method (companion object): fun fromByteReadChannel(channel: ByteReadChannel): ByteStream. This builder would make it possible to stream files from a ktor.io client to S3.
Extension method: fun ByteStream.toByteReadChannel(): ByteReadChannel. This extension would make it possible to stream files from S3 to a ktor.io client.

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue, please leave a comment

Nov 01 '21 12:11 Declow0

Hi @Declow0 thank you for your feature request. So we can better understand, can you explain how adding these builder and extension functions will help you?

Nov 01 '21 16:11 kggilmer

Hi @kggilmer. The main point of this feature is to use a lazy data stream to reduce memory usage. Example of usage: I have a large file (100+ MB) in S3 storage and want to upload it to another application via a REST API. I can either download the whole byte array into the JVM and then send it through the client, or keep a reference to the channel and let the client consume it lazily. I suggest the ktor classes because they are already used in the S3 client under the hood and they can be used on native/JS/iOS platforms (a platform-independent interface).

And the reverse problem: download a file from a client and save it to S3. The ktor client supports a basic transformation of the body to ByteReadChannel. ktor also has builders for ByteReadChannel from JVM classes like java.nio.ByteBuffer and kotlin.ByteArray.

Nov 02 '21 09:11 Declow0

Thank you for the additional details, @Declow0. Your use cases make sense. Due to AWS policy, we have to be careful about which third-party types we use and expose to customers from the SDK. We'll evaluate your request to see where this kind of feature should best live. The issue will be updated accordingly once more info is available.

Nov 03 '21 16:11 kggilmer

Note you can get around this with a bit of work.

> I can download whole byte array into JVM and after send it into client or I can save reference to channel and use it in client by lazy.

Response streams are generally going to be a ByteStream.OneShotStream (or if you have SdkLogMode.LogResponseWithBody enabled it may end up a ByteStream.Buffer).

You can consume the stream in chunks manually and write them to your ktor channel. It might look something like:


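s3Client.getObject(request) { resp ->
    // Bind the body to a local val so the smart casts in the branches compile
    // (resp.body is a nullable property, so it cannot be smart-cast directly).
    val chan = when (val body = resp.body) {
        is ByteStream.OneShotStream -> body.readFrom()
        // bytes() is the accessor used for ByteStream.Buffer elsewhere in this thread
        is ByteStream.Buffer -> SdkByteReadChannel(body.bytes())
        else -> error("unexpected ByteStream body")
    }

    // ktorChan is assumed to be a ktor write channel created elsewhere
    while (!chan.isClosedForRead) {
        chan.awaitContent()

        // you can implement whatever kind of buffering/chunk strategy you want here...
        val chunk = ByteArray(chan.availableForRead)
        chan.readAvailable(chunk)

        ktorChan.writeFully(chunk)
    }
    ktorChan.close()
}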

> And reverse problem: download file from client and save it on s3

Same thing in reverse: you can implement SdkByteReadChannel for the ktor channel and pass it directly, or do some kind of proxying where you read chunks off the ktor channel and write them to another channel given to the SDK.

Just some ideas that may unblock your use case.

Nov 08 '21 15:11 aajtodd

@aajtodd Thank you for the idea. It was easier to work with java.nio.ByteBuffer on the JVM target. From S3 to ByteBuffer:

val byteBufferFromS3 = s3Client.getObject(request) { resp ->
    suspend fun SdkByteReadChannel.toByteBuffer(): ByteBuffer {
        val buffer = ByteBuffer.allocate(resp.contentLength.toInt())
        while (!isClosedForRead) {
            awaitContent()
            readAvailable(buffer)
        }
        return buffer
    }

    when (val body: ByteStream = resp.body) {
        is ByteStream.Buffer -> ByteBuffer.wrap(body.bytes())
        is ByteStream.OneShotStream -> body.readFrom().toByteBuffer()
        is ByteStream.ReplayableStream -> body.newReader().toByteBuffer()
        else -> error("unexpected ByteStream body")
    }
}

and pass it as the body to the ktor client: ByteReadChannel(byteBufferFromS3.rewind())

And from the ktor client to ByteBuffer:

class RestClient {
    suspend fun download(...): ByteReadChannel = ktorClient.request { ... }
}
...
val response: ByteReadChannel = restClient.download(...)
val contentLength: Int = ...
val byteBuffer = ByteBuffer.allocate(contentLength)
while (!response.isClosedForRead) {
    response.awaitContent()
    response.readAvailable(byteBuffer)
}

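and pass it to the S3 client:

body = object : ByteStream.OneShotStream() {
    // byteBuffer is assumed to be the buffer filled in the previous snippet
    override fun readFrom(): SdkByteReadChannel = SdkByteReadChannel(byteBuffer.rewind())
    // Use the buffer's capacity here: inside this object, a bare `contentLength`
    // would refer to this property itself rather than the outer variable.
    override val contentLength: Long = byteBuffer.capacity().toLong()
}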

Nov 09 '21 13:11 Declow0

That works, but it isn't going to use less memory, since both examples require consuming the whole thing into a ByteBuffer.
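
To make the memory point concrete, here is a minimal sketch of a copy loop whose footprint is bounded by a fixed chunk size rather than by the object size. ChunkSource, ChunkSink, ArraySource and CollectingSink are hypothetical stand-ins invented for illustration; they are not SDK or ktor types. Real channel reads are suspending, and the stand-ins are blocking only so the sketch runs with the standard library alone.

```kotlin
// Hypothetical stand-ins, NOT the SDK or ktor types. Real channel reads are
// suspending; these are blocking only so the sketch needs nothing but the stdlib.
interface ChunkSource {
    /** Reads up to dst.size bytes into dst; returns the count, or -1 at end of stream. */
    fun read(dst: ByteArray): Int
}

interface ChunkSink {
    /** Writes the first `length` bytes of src. */
    fun write(src: ByteArray, length: Int)
}

/** Copies everything from source to sink through one reusable buffer of chunkSize bytes. */
fun copyInChunks(source: ChunkSource, sink: ChunkSink, chunkSize: Int = 8 * 1024): Long {
    require(chunkSize > 0) { "chunkSize must be positive" }
    val buffer = ByteArray(chunkSize) // the only buffer allocated, whatever the payload size
    var total = 0L
    while (true) {
        val n = source.read(buffer)
        if (n < 0) break
        sink.write(buffer, n)
        total += n
    }
    return total
}

/** In-memory source used only to exercise the loop. */
class ArraySource(private val data: ByteArray) : ChunkSource {
    private var pos = 0
    override fun read(dst: ByteArray): Int {
        if (pos >= data.size) return -1
        val n = minOf(dst.size, data.size - pos)
        data.copyInto(dst, destinationOffset = 0, startIndex = pos, endIndex = pos + n)
        pos += n
        return n
    }
}

/** In-memory sink that also counts how many writes it received. */
class CollectingSink : ChunkSink {
    private val out = java.io.ByteArrayOutputStream()
    var writes = 0
        private set

    override fun write(src: ByteArray, length: Int) {
        out.write(src, 0, length)
        writes++
    }

    fun toByteArray(): ByteArray = out.toByteArray()
}
```

In a real adaptation, the source would wrap the channel read from one library and the sink would wrap the channel written to the other, so only chunkSize bytes are held at any time instead of a ByteBuffer sized to the whole object.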

Nov 09 '21 19:11 aajtodd

Closing, as we have no plans right now to provide this conversion directly in the SDK. If you have any further issues adapting ktor-io (or any other I/O library, for that matter) to the SDK, please open a new issue or discussion.

Sep 13 '23 14:09 aajtodd

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sep 13 '23 14:09 github-actions[bot]
