aws icon indicating copy to clipboard operation
aws copied to clipboard

Better handling of unknown enum values in responses

Open stof opened this issue 5 months ago • 8 comments

Currently, our generator code says that our getter returns FooEnum::*, but the code parsing results assigns the string value returned by the backend without checking whether it is a known value. This means that the getter can actually return any string in case AWS adds new values in those endpoints. Psalm actually detects this mismatch (that's 90% of the content of our baseline). phpstan does not report it because it is a rule checked at level 7 while we run it only at level 6 currently (when trying to enable level 7, it also detects it).

The AWS guidelines for their SDK require that SDK don't fail at runtime when receiving such values. Most official SDKs (maybe even all) implement this by generating an extra case in the enum (sdkUnknown in Kotlin, $unknown in JS, UNKNOWN_TO_SDK in Java, etc...) that is used by parsers when populating results if the case is unknown. This approach forces projects to decide how to handle that case when they do a match on the enum (maybe they decide to fail at runtime with an exception asking to upgrade the SDK, maybe they have a graceful fallback, depending on what is appropriate).

stof avatar Jul 22 '25 15:07 stof

Note that this also applies to union shapes (see #1900)

stof avatar Jul 22 '25 15:07 stof

IMHO, for PHP returning UNKNOWN does not provide any value than returning the original string. At the end, users have to check the value.

I get that for strictly typed programming languages, they need to convert the original value to a well-known fallback, but it is not the case for PHP.

Adding a check on the value, to provide a fallback, would consume CPU for nothing in 99.99% of case.

I think we should do what javascript sdk does and adding a |string. on the related DocBlocs

https://github.com/aws/aws-sdk-js/blob/657d6feb00447c8be1d65158a0ecc0585b70ed60/clients/s3.d.ts#L1133

This is what I'm doing in #1923

jderusse avatar Jul 24 '25 10:07 jderusse

@jderusse the JS SDK you linked is the old one.

And match in PHP requires exhaustiveness, which is impacted by this.

Also, the suggestion done in #1923 basically uses string as type for all enum fields in practices. And #1078 discusses using native PHP enums, which are strictly typed.

stof avatar Jul 24 '25 11:07 stof

sdk-v3 uses strict lists of values

ie

/**
 * @public
 * @enum
 */
export const SchemaStatus = {
  Active: "ACTIVE",
  Deleting: "DELETING",
  Failed: "FAILED",
  NotApplicable: "NOT_APPLICABLE",
  Processing: "PROCESSING",
  Success: "SUCCESS",
} as const;

/**
 * @public
 */
export type SchemaStatus = (typeof SchemaStatus)[keyof typeof SchemaStatus];

/**
 * @public
 */
export interface GetSchemaCreationStatusResponse {
  /**
   * <p>The current state of the schema (PROCESSING, FAILED, SUCCESS, or NOT_APPLICABLE). When
   *          the schema is in the ACTIVE state, you can add data.</p>
   * @public
   */
  status?: SchemaStatus | undefined;
}

https://github.com/aws/aws-sdk-js-v3/blob/dd06bc124bf2cce24109f57e6ead7d404e2ec15d/clients/client-appsync/src/models/models_0.ts#L4803-L4837

IMHO, we should not convert that value to a generic unknown value. People might have implemented the logic with the new value and don't care if async-aws is not-yet ready. Also that would require AsyncAws to check the value of each enum => (small) performance impact to handle a 0.001% edge-case.

If one day we switch to Enum, it will be an issue that should be addressed. But this not-yet the case.

So, just wondering if all that is not just noise for strict madness.

If we don't provide a 'unknown' case, the question for #1923: Should we reflect the truth in the DocBloc and tell people to handle not-yet-known cases? => this make linter less powerful (no typo detection)

Or just let people deal with it like we did and aws-sdk-js does...

@GrahamCampbell @Nyholm any opinion on this?

jderusse avatar Jul 24 '25 12:07 jderusse

Very interesting question.

It seams like the promise AWS is doing is to return a value for X, Y and Z. But they reserve the ability to add more values in the future. I dont think we should make a promise that is more strict nor should we let everything else fallback to a hard coded "unknown".

I think https://github.com/async-aws/aws/pull/1923 is the best option. RuntimeName::*|string.

Yes, it is not great for type safety but that is what the API promise. Doing a type hint of RuntimeName::*|string would also tell the user handle any string input which is exactly what we would like.

Nyholm avatar Aug 11 '25 07:08 Nyholm

Note that this means we will never be able to use PHP enums then.

stof avatar Aug 18 '25 10:08 stof

Note that this means we will never be able to use PHP enums then.

Yes, that is true. That sounds like an additional benefit =)

Nyholm avatar Aug 18 '25 12:08 Nyholm

Indeed, their JS SDK is inconsistent:

  • for string-based enums, the deserialization only expects a string without extra validation, meaning that they return unknown values directly (breaking their type declaration): https://github.com/aws/aws-sdk-js-v3/blob/dd06bc124bf2cce24109f57e6ead7d404e2ec15d/clients/client-appsync/src/protocols/Aws_restJson1.ts#L2668-L2671. They use any in their deserialize logic to avoid issues reported by the type checker about those violations (which are equivalent to the ones we have in our psalm baseline)
  • for union shapes, they represent unknown types with a dedicated $unknown property storing the property name and its value instead of returning the object as is.

stof avatar Aug 19 '25 10:08 stof