aws-sdk-java

Adding Extension Points in DynamoDBMapper to allow transformation of Query/Scan Conditions

Open deepvyas opened this issue 4 years ago • 2 comments

We are looking to implement an encryption scheme for our services in which developers can easily tell us that they want a field encrypted, using a custom annotation along the lines of:

@Getter
@Setter
@DynamoDBTable(tableName = "student")
public class Student extends ParentClass 
{
    @DynamoDBHashKey(attributeName = "studentid")
    private String studentid;

    @DynamoDBAttribute(attributeName = "name")
    @Encrypted
    @DynamoDBIndexHashKey(globalSecondaryIndexName = "GSI-name")
    private String name;

    @DynamoDBAttribute(attributeName = "universityName")
    private String universityName;

    @DynamoDBAttribute(attributeName = "course")
    private String course;
}

We chose not to go with aws-dynamodb-encryption-java due to several factors described under Additional Context.

Describing the Feature

We have been able to encrypt and decrypt these attributes the way we want (context on our approach is below) using the AttributeTransformer, which makes the process transparent to our service developers and easy to use.

We would like the same transparency for Query type operations (including Get, Scan, Load type operations) and are wondering whether there is appetite for a similar hook on the final criteria sent over the wire, so that we could swap out the actual AttributeValues there as well (given that the attribute involved in the comparison is included).

We use several abstractions over the DynamoDBMapper in our environments (such as spring-data and a custom-written development framework), and hence are looking to solve this at the SDK layer so that the solution is common and easy to plug into all of these frameworks.

Additional Context

We chose a custom implementation of encryption over aws-dynamodb-encryption-java primarily due to the following factors:

  • We have other data stores in our environment, and our general principle is opt-in encryption of fields rather than the opt-out model the SDK uses (it requires marking fields as @DoNotTouch).
  • We use AES/GCM to encrypt sensitive fields in our databases (a tech risk mandate). Since it generates a random IV each time, we also store an HMAC-based hash alongside the encrypted value to get a deterministic, searchable value (the SDK does not use this approach).
  • In cases where we have document-based NoSQL stores (for example Mongo), we transform the field into a JSON document containing
{ "encryptedValue": "a binary blob", "hashValue": "a binary blob" }
  • Since DynamoDB does not allow GSI/LSI on nested attributes, we are looking to split the encrypted attribute into two separate top-level attributes and set up indexes on the hash attribute (if needed). The AttributeTransformer helps us achieve this easily.
  • However, we are not able to transform the AttributeValues in conditions on Query type operations (including Get, Scan, Load type operations). This forces our business application code to be aware of the transformation, which defeats the purpose of doing it transparently in the first place.
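
The encrypt-plus-hash scheme described in the list above could look roughly like the following. This is only a minimal sketch using the JDK's javax.crypto classes: the class name SearchableEncryption, the IV-prefixed blob layout, the 12-byte IV, and the choice of HmacSHA256 are assumptions for illustration, not details taken from this issue.

```java
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Hypothetical helper; names and blob layout are illustrative assumptions.
public class SearchableEncryption {
    private static final int IV_BYTES = 12;   // assumed IV size for GCM
    private static final int TAG_BITS = 128;  // assumed authentication tag size
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encrypts with AES/GCM under a fresh random IV and returns IV || ciphertext. */
    public static byte[] encrypt(byte[] aesKey, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(aesKey, "AES"),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reverses encrypt(): reads the IV from the front of the blob, then decrypts the rest. */
    public static String decrypt(byte[] aesKey, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(aesKey, "AES"),
                    new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    /** Deterministic keyed hash of the plaintext, usable as an index or query key. */
    public static byte[] hash(byte[] hmacKey, String plaintext) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(hmacKey, "HmacSHA256"));
            return mac.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("hashing failed", e);
        }
    }
}
```

Two calls to encrypt with the same input produce different blobs because of the random IV, while hash returns the same bytes each time, which is why only the hash attribute is suitable for an index or an equality condition.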

Proposed Solution

We are looking to see if hooks similar to AttributeTransformer can be exposed to transform the final Conditions formed in Query type operations (including Get, Scan, Load type operations).
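
To make the shape of such a hook concrete, here is a rough, self-contained sketch. Nothing in it is an existing SDK API: the ConditionTransformer interface, the hashing factory, and the "_hash" attribute suffix are all hypothetical, and plain strings stand in for AttributeValue so that the example has no SDK dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch only; this is not part of DynamoDBMapper or any AWS SDK.
public class ConditionHookSketch {

    /**
     * Hypothetical hook: receives the attribute-name -> value pairs of a
     * query/scan condition and returns the pairs that should go on the wire.
     */
    public interface ConditionTransformer {
        Map<String, String> transform(Map<String, String> conditionValues);
    }

    /**
     * Builds a transformer that redirects conditions on encrypted attributes
     * to an assumed "<name>_hash" attribute and replaces the plain-text value
     * with its hash. Conditions on other attributes pass through unchanged.
     */
    public static ConditionTransformer hashing(Set<String> encryptedAttributes,
                                               UnaryOperator<String> hashFn) {
        return conditions -> {
            Map<String, String> out = new LinkedHashMap<>();
            conditions.forEach((name, value) -> {
                if (encryptedAttributes.contains(name)) {
                    out.put(name + "_hash", hashFn.apply(value));
                } else {
                    out.put(name, value);
                }
            });
            return out;
        };
    }
}
```

With a hook like this, application code could keep writing a condition such as name = "Alice", and the mapper would apply the transformer just before building the request, so the condition would reach DynamoDB as an equality check on the hash attribute.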

If there is appetite for this and it seems like a good feature to have, we would be happy to groom the implementation and contribute it. We know the AWS team is focused on the v2 SDK, but because of certain frameworks we use, migrating to the enhanced client is going to be a larger effort for us, since it changes the format of interaction. Hence it would be great if such a feature could be included in both this SDK and the v2 SDK.

Also happy to discuss if there is a better way to model this, one that allows creating a GSI/LSI on the hash part of the encrypted value so that we could fire query type operations (including Get, Load, etc.) on these attributes if required.

Alternatives Tried

  • We also explored whether we could extend the DynamoDBMapperTableModel used by the DynamoDBMapper to scan the beans annotated with our annotations and create a model where the encrypted attribute (name in the example above) translates to two different fields (the encryptedValue and hashValue). That way we could integrate smoothly with the rest of the functions in the mapper, which uses the table model to iterate over attributes/conditions and convert them before sending them down the wire.

If DynamoDBMapperTableModel could be made extensible (or transformable, for example by allowing one to be built from an existing instance), that could also help achieve a similar end goal.

Environment

  • AWS Java SDK version used: 1.11.792
  • JDK version used: 8
  • Operating System and version: Rhel 7/8

deepvyas, Apr 16 '21 17:04

Hi @deepvyas thank you for the detailed report, it's an interesting use case. I believe there's no way to add custom extensions to query operations today in DynamoDB Mapper.

As you mentioned, we don't have plans to add extensive features in v1. The Java v2 DynamoDB EnhancedClient supports extensions through DynamoDbEnhancedClientExtension; have you looked into it? It provides two hooks: one called just before a record is written to the database, and one called just after a record is read from the database. As an example, optimistic locking is implemented as an extension in the VersionedRecordExtension class.

Maybe DynamoDbEnhancedClientExtension fits your case? You can use SDK v1 and v2 side by side.

debora-ito, Apr 24 '21 03:04

Hey @debora-ito, looking at DynamoDbEnhancedClientExtension, it seems pretty much like the AttributeTransformer of the v1 SDK, which is super cool and which we are already using for encryption/decryption of fields.

The ask here was to see if we could get a similar extension for QueryExpressions or load queries, so that the application layer could specify the plain-text value and we could transparently convert it to its HMAC hash (since the table actually stores the HMAC hash, not the plain text).

One more possible option was to allow overriding the generated DynamoDBMapperTableModel so that we could add a converter to go from plain text to hash and vice versa for fields marked as encrypted.

deepvyas, Apr 28 '21 10:04