stream-applications

s3-sink java.io.File casting

Open borg1310 opened this issue 2 years ago • 6 comments

Hi, we want to use the s3-sink application to write files to S3 storage. For this we use the File-Source and S3-Sink applications. In the file source, the mode is set to "ref" (a java.io.File should be returned). When writing to the S3 sink, error [0] occurs. While debugging, we noticed that what arrives in the S3MessageHandler (method upload, line 306) is not a file but a byte array containing the path to the file. IMHO, the problem is that the path is not converted to a java.io.File object. Am I doing something wrong, or is there an additional setting for this (especially for the keyExpression property)?

Thanks in advance. Best regards, juergen

[0] Caused by: java.lang.IllegalStateException: Specify a 'keyExpression' for non-java.io.File payloads
    at org.springframework.integration.aws.outbound.S3MessageHandler.upload(S3MessageHandler.java:390)
    at org.springframework.integration.aws.outbound.S3MessageHandler.handleRequestMessage(S3MessageHandler.java:277)
    at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:136)
    at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:56)
    ... 39 more

borg1310 avatar Apr 27 '23 06:04 borg1310

java.io.File is not a suitable abstraction to transfer over the network. Even if we convert it to a file path and serialize that string properly when sending to the binder, and even if the consumer side deserializes that path back into a java.io.File, there is no guarantee that such a file exists on the target file system for the S3 upload to read from.
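A minimal sketch of this point, using only the JDK. The class and method names are hypothetical and are not part of the file source or the S3 sink; the sketch only assumes, as reported above, that the path travels as a byte array.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration; not code from the file source or the S3 sink.
class FileRefOverBinder {

    // Assumption: in "ref" mode the wire payload is the file path as bytes.
    static byte[] serialize(File file) {
        return file.getAbsolutePath().getBytes(StandardCharsets.UTF_8);
    }

    // The consumer can only rebuild a File handle from the path text.
    static File deserialize(byte[] wirePayload) {
        return new File(new String(wirePayload, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        File local = Files.createTempFile("s3-demo", ".txt").toFile();
        Object received = serialize(local);

        // The receiving handler sees a byte[] payload, not a java.io.File.
        System.out.println("is File: " + (received instanceof File));

        // Rebuilding the handle works, but the file behind it is only
        // there when producer and consumer share a file system.
        File rebuilt = deserialize((byte[]) received);
        System.out.println("exists here: " + rebuilt.exists());

        local.delete();
    }
}
```

Run on a single machine, the rebuilt handle resolves because the file is local. On a consumer with a different file system, the same path would point at nothing.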

For now, your best option is to transfer the file content as byte[] from the File-Source.

We may think about something like a payload-to-file=true|false option for this S3-Sink, if the end user is sure that both apps operate against the same file system. But why would one then place a binder in between?

I'm not fully familiar with SCDF, but I believe that there has to be an option to co-locate apps with in-memory interaction.

/CC @corneil , @onobc

artembilan avatar Apr 27 '23 14:04 artembilan

> but I believe that there has to be an option to co-locate apps with in-memory interaction.

Apps can be "co-located" via Function Composition.

onobc avatar Apr 27 '23 20:04 onobc

Thanks, Chris, but it doesn't look like that doc shows how to do that. It talks about functions, but what we have here are apps, which are self-contained things and, yeah, tied to a specific binder by their packaging. Plus, I doubt users are interested in programming-style composition for out-of-the-box apps. Moreover, it is not clear whether it's possible to compose a Source with a Sink. I guess we can brainstorm another day.

artembilan avatar Apr 27 '23 21:04 artembilan

Good points @artembilan

Yes, the user would have to create a custom stream application that chains the functions together into a single app. That is the only way I know to do that in SCDF. Using that technique, I do think it would be possible to chain the file source to the s3 sink (e.g. spring.cloud.function.definition=file|s3). But it would require the user to create a custom stream app.
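To illustrate why in-process chaining avoids the problem, here is a rough sketch with plain java.util.function types. It is not the Spring Cloud Function composition mechanism, and `resolveKey`, `compose`, and the sample path are hypothetical stand-ins; the point is only that nothing is serialized between the two steps, so the payload stays a java.io.File.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration of source-to-sink chaining inside one JVM.
class InProcessComposition {

    // Stand-in for an S3-style handler: a File payload carries its own
    // name, anything else would need an explicit key expression.
    static String resolveKey(Object payload) {
        if (payload instanceof File) {
            return ((File) payload).getName();
        }
        throw new IllegalStateException(
                "Specify a 'keyExpression' for non-java.io.File payloads");
    }

    // Chains a source to a sink; the object is handed over by reference.
    static <T> Runnable compose(Supplier<T> source, Consumer<? super T> sink) {
        return () -> sink.accept(source.get());
    }

    public static void main(String[] args) {
        List<String> uploadedKeys = new ArrayList<>();
        Supplier<File> fileSource = () -> new File("/data/in/report.csv");
        Consumer<Object> s3Sink = payload -> uploadedKeys.add(resolveKey(payload));

        compose(fileSource, s3Sink).run();
        System.out.println(uploadedKeys);
    }
}
```

Passing a byte[] to `resolveKey` instead takes the exception branch, which mirrors the situation in the original report.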

onobc avatar Apr 28 '23 14:04 onobc

Oleg: We could use extra parameters (~sub-types) on the application content type to carry extra info about the payload (byte[]), e.g. classname, filepath, etc.

We can leverage Spring MimeType to help w/ this.
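A sketch of what such content-type parameters might look like, written with the JDK only so it runs without Spring on the classpath. The parameter names `classname` and `filepath` come from the comment above; the helper names and the naive parsing are assumptions for illustration. In a real implementation, Spring's MimeType (which exposes parameters via `getParameter`) would presumably replace the hand-rolled parsing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of carrying payload hints as content-type parameters.
class ContentTypeHints {

    // Appends each hint as ";name=value" to the base content type.
    static String withHints(String baseType, Map<String, String> hints) {
        StringBuilder sb = new StringBuilder(baseType);
        hints.forEach((name, value) -> sb.append(';').append(name).append('=').append(value));
        return sb.toString();
    }

    // Naive parser: splits on ';' and the first '='. Values that contain
    // ';' would need quoting, which this sketch does not handle.
    static Map<String, String> parseHints(String contentType) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].trim().split("=", 2);
            if (pair.length == 2) {
                result.put(pair[0], pair[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> hints = new LinkedHashMap<>();
        hints.put("classname", "java.io.File");
        hints.put("filepath", "/data/in/a.txt");

        String contentType = withHints("application/octet-stream", hints);
        System.out.println(contentType);
        System.out.println(parseHints(contentType).get("filepath"));
    }
}
```

A sink could then read the `filepath` hint to derive a key for a byte[] payload. Whether the file itself is reachable from the sink remains the open question raised earlier in the thread.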

onobc avatar May 31 '23 14:05 onobc

Moving out to 2024.1.x as we did not have cycles to get to this in the 2024.0.0 timeline.

onobc avatar May 31 '24 19:05 onobc