aws-sdk-java-v2 icon indicating copy to clipboard operation
aws-sdk-java-v2 copied to clipboard

Duplicate all eventstream event shapes + add new legacy event modes

Open alextwoods opened this issue 8 months ago • 7 comments

Fixes codegeneration issues for eventstreams with shared event shapes: Duplicate and rename all eventstream event shapes with a unique name + add new disableUniqueEventStreamShapePreprocessing customization config.

Motivation and Context

When an event shape is shared between multiple eventstreams, it causes SDK generation/compilation failures. The top level shape POJO implements the event stream interface for each stream and the return type of the sdkEventType method conflicts.

Modifications

This PR does two things:

  1. Adds a new disableUniqueEventStreamShapePreprocessing customization config.
  2. If both useLegacyEventGenerationScheme and disableUniqueEventStreamShapePreprocessing is unset a for an event, the event shape will be duplicated and given a unique name: <ShapeName><EventStreamName>.

Note: There are a large number of changes in the codegen test to add the new customization mode to existing test eventstreams.

Open Questions

What naming scheme should we use to produce unique names?

This PR currently uses <EventStreamName><ShapeName>, for example an event named Payload in eventstream StreamA would get the name: StreamAPayload. We could reverse this (PayloadStreamA).

The downside of this naming scheme is that it can produce long names. For example the existing HeartbeatEvent from LexRuntimeV2's StartConversationResponseEventStream would be (without the legacy customization) renamed to: HeartbeatEventStartConversationResponseEventStream.

FAQ

Why do we always need to rename the event shape? Couldn't we do that only when there is a shared event?

We must ensure that shapes names are deterministic and do not depend on the shapes in the model because if a shared event shape is added in the future, we have no way of telling which was the existing shape.

Alternatives

For POCs of alternative approaches see:

  • Support through customization: https://github.com/aws/aws-sdk-java-v2/pull/6031
  • Remove eventstream interface from generic POJOs: https://github.com/alextwoods/aws-sdk-java-v2/pull/2/files
  • add SDKEventType interface: https://github.com/alextwoods/aws-sdk-java-v2/pull/3

User Experience

There are no changes for users for any existing Eventstreams/events. New eventstreams or new events added to existing evenstreams will have unique, longer names, but the fluid visitor and builder methods will continue to use the member names. For example, assuming we have Payload event (with member name "PayloadEvent") in eventstream StreamA, the handler and builder methods will still be onPayloadEvent and payloadEventBuilder.

// Handling events
StreamAResponseHandler.builder()
    .subscriber(
        StreamAResponseHandler.Visitor.builder()
            .onPayloadEvent(event -> { // NOTE: onPayload is still used because the member name is "PayloadEvent"
                System.out.println("Payload event: " + event);
            })
            .build()
    );

// sending events
subscriber.onNext(
    StreamA.payloadEventBuilder() // NOTE: We still use `payloadEventBuilder` because the member name is "Payload"
        .payload("text/plain;charset=utf-8")
        .build()
);

Note: For users using switch/case or if/else on sdkEventType or checking instance of, when casting the events they will need to use the longer shape names. Using the example from above:

subsciber(event -> {
    if (event instanceof PayloadStreamA) { // NOTE: we need to check the rename/unique shape name here
        handlePayloadEvent((PayloadStreamA)event);
    }
});

subscriber(event -> {
    switch(event.sdkEventType()) {
        case PAYLOAD_EVENT: // the sdkEventType will still use the member name
            handlePayloadEvent((PayloadStreamA)event); // Note: Need to cast to the renamed/uniqe shape name
            break;
    }
})

Testing

New Unit tests - see shared event stream cases (model was modified to add new streams that share an event)

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)

Checklist

  • [x] I have read the CONTRIBUTING document
  • [x] Local run of mvn install succeeds
  • [x] My code follows the code style of this project
  • [x] I have added tests to cover my changes
  • [x] All new and existing tests passed
  • [x] I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.

License

  • [x] I confirm that this pull request can be released under the Apache 2 licens

alextwoods avatar Apr 22 '25 23:04 alextwoods

The downside of this naming scheme is that it can produce long names. For example the existing HeartbeatEvent from LexRuntimeV2's StartConversationResponseEventStream would be (without the legacy customization) renamed to: HeartbeatEventStartConversationResponseEventStream.

The current non-legacy way is there's a package per eventstream, e.g. software.amazon.awssdk.services.lexruntimev2.model.startconversationresponseeventstream that contains the event stream specific event classes. Could we do something similar in this case?

dagnir avatar Apr 25 '25 23:04 dagnir

The current non-legacy way is there's a package per eventstream, e.g. software.amazon.awssdk.services.lexruntimev2.model.startconversationresponseeventstream that contains the event stream specific event classes. Could we do something similar in this case?

We could, but it would require fairly large changes in multiple parts of the code generation and require maintaining 3 separate paths through codegen (for each of the cases) including marshallers, builders and others.

The benefit of renaming the shape up front is that all other parts of the codegen are able to remain the same, but it doesn't let us change the packaging/namespace of the generated classes.

alextwoods avatar Apr 29 '25 18:04 alextwoods

The current non-legacy way is there's a package per eventstream, e.g. software.amazon.awssdk.services.lexruntimev2.model.startconversationresponseeventstream that contains the event stream specific event classes. Could we do something similar in this case?

I did try to put together a POC for this but It would require a much more significant refactoring to get working. There are a few issues/sharp edges:

  1. In some cases we need both the model.<Event> and model.<EventStream>.<Event> POJOs (an event in a legacy ES + in a new, non-legacy ES).
  2. There are package private model classes that are required by the event pojos. We would need to make these public to allow access/usage in the eventstream model package).
  3. Fairly extensive updates to code generation for marshallers, async client (usage of marshallers), Builders, Handlers to ensure the correct types are used (each place needs to switch behavior on legacy modes).

alextwoods avatar Apr 30 '25 20:04 alextwoods

@zoewangg Can you take a look as well please?

If we haven't done so already, can we do a before and after check with the services with existing eventstream to make sure there aren't any changes?

dagnir avatar May 28 '25 15:05 dagnir

If we haven't done so already, can we do a before and after check with the services with existing eventstream to make sure there aren't any changes?

Note - I've marked this PR as in draft to ensure we don't merge it before the CR's to internal service teams are merged. I've got dry run builds w/ this change + all of my internal CRs running to ensure we've got everything. I'll only mark this PR as ready for review once those all pass.

Edit to clarify - this PR is in a final state and ready for review, it is only marked as draft to prevent accidental early merge :-)

alextwoods avatar May 28 '25 16:05 alextwoods