aws-sdk-kotlin InputStream adapter for ByteStream

Describe the feature

A method on ByteStream to convert it to an InputStream, or a utility function to do so.

Is your Feature Request related to a problem?

Sometimes you need to connect S3 calls to code that accept a blocking java.io.InputStream. It would be helpful if the SDK included some way to convert a ByteStream to an InputStream. This avoids a lot of in-memory buffering.

Proposed Solution

An input adapter similar to

https://github.com/ktorio/ktor/blob/c68d889b7088fca0e9a75f299b58ce7f55572a56/ktor-io/jvm/src/io/ktor/utils/io/jvm/javaio/Blocking.kt#L28

or, an adapter that could convert SdkByteReadChannel back to a ktor ByteReadChannel, then I could use ByteReadChannel.toInputStream

Describe alternative solutions or features you've considered

As a workaround I could try something like this (untested and lacks proper error handling). This is however less efficient since it involves copying data unnecessarily.

@OptIn(InternalApi::class)
suspend fun ByteStream.toInputStream(): InputStream {
    val ktorChan = ByteChannel()
    val chan = when (this) {
        is ByteStream.OneShotStream -> readFrom()
        is ByteStream.Buffer -> SdkByteReadChannel(this.bytes())
        is ByteStream.ReplayableStream -> newReader()
        else -> error("unexpected ByteStream body")
    }
    GlobalScope.async {
        while (!chan.isClosedForRead) {
            chan.awaitContent()
            val chunk = ByteArray(chan.availableForRead)
            chan.readAvailable(chunk)
            ktorChan.writeFully(chunk)
        }
        ktorChan.close()
    }
    return ktorChan.toInputStream()
}

Acknowledge

[X] I may be able to implement this feature request

AWS Kotlin SDK version used

0.16.0

Platform (JVM/JS/Native)

jvm

Operating System and version

macOS Monterey 12.3.1

May 29 '22 10:05 rrva

Thanks for the feature request. Indeed this would be useful. The code you have provided is one possible workaround in the meantime.

Jun 01 '22 14:06 aajtodd

Thanks for the feature request. Indeed this would be useful. The code you have provided is one possible workaround in the meantime.

Just a heads up, this doesn't actually work in practice. If you have a resource that is roughly 36k worth of JSON for example and run it through a jackson mapper to convert to a data class, it breaks due to having only reached about 4100 bytes of data before it was considered done and the transform breaks. Seems to work fine for anything under 4000 bytes but breaks consistently with anything over 12k bytes in size (didn't have any sample data sets between 4k and 12k off hand).

This is probably not as good but it does at least allow me to hack around a solution that does retrieve all of the datapoints in general. But a good, clean, built-in solution would be desired.

    suspend fun ByteStream.toInputStream(): InputStream = this.toHttpBody()
        .readAll()
        ?.inputStream()
        ?: "".byteInputStream() // TODO This is probably worse than breaking???

The downside of this approach so far as I can tell is that the entire contents of the bytestream have to be loaded into memory. Thus if you have a large resource such as a movie, you are going to fail hard.

Aug 17 '22 17:08 sepatel

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sep 05 '23 15:09 github-actions[bot]

aws-sdk-kotlin aws-sdk-kotlin copied to clipboard

InputStream adapter for ByteStream

Describe the feature

Is your Feature Request related to a problem?

Proposed Solution

Describe alternative solutions or features you've considered

Acknowledge

AWS Kotlin SDK version used

Platform (JVM/JS/Native)

Operating System and version

⚠️COMMENT VISIBILITY WARNING⚠️

aws-sdk-kotlin
aws-sdk-kotlin copied to clipboard