
Add multi table transactions

Open purkhusid opened this issue 1 year ago • 13 comments

What?

Changes the transaction API to allow multi table transactions

How?

Instead of letting TransactWrite accept items with a specific record type, I added a TransactWrite member to the table context.

This TransactWrite member has the Put/Update/Delete/Check operations and replaces the operations on the TransactWrite module that previously existed.
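A rough sketch of how such a call might read (purely illustrative; the member signatures, `ordersTable`, `usersTable`, and the record types are assumptions, not the PR's actual API):

```fsharp
// Purely illustrative sketch: the exact member names and signatures in the PR may differ.
let example (ordersTable: TableContext<Order>) (usersTable: TableContext<User>) =
    // Any TableContext can start the transaction; items may target other tables
    ordersTable.TransactWriteItems
        [ Put(ordersTable, newOrder)                        // write to one table
          Update(usersTable, userKey, incrementOrderCount)  // update another table atomically
          Check(usersTable, userKey, userExistsCondition) ] // with a cross-table precondition
```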

purkhusid avatar Aug 28 '24 10:08 purkhusid

@samritchie @bartelink The changes currently expose the TransactWriteItem that is part of the AWS SDK. Would you prefer not to expose it, and instead have a wrapper type that holds the AWS SDK type as an internal member?

purkhusid avatar Aug 28 '24 10:08 purkhusid

The tests are also not done. Just wanted feedback on the approach before spending more time on this.

purkhusid avatar Aug 28 '24 10:08 purkhusid

Hi Daniel. Firstly, thanks for contributing; I hope to get some time to look at it properly later.

The main thing is you have tests

I punted on doing multi-table originally, and didn't have any particular ideas on how it should be shaped as such

Breaking changes for the Transaction support isn't a problem from my point of view - I'll be able to do an update to Equinox.DynamoStore pretty quickly if anything needs to be moved (as opposed to bending over backwards and leaving a mess in the name of binary compatibility)

bartelink avatar Aug 28 '24 10:08 bartelink

There is a test that shows the new API. This is of course a breaking change, but I figured it might be acceptable since the transaction API in this library wasn't covering the full functionality.

A similar approach could be used for the BatchWriteItem as well.

purkhusid avatar Aug 29 '24 09:08 purkhusid

It would be great to support this functionality. @bartelink probably has more to contribute here as he did the TransactWrite work.

Probably my main concern with the API is it being an instance method hanging off an arbitrary table (which hopefully has the same account/credentials as the items). I’m not sure how to resolve this without introducing a top level DB type though.

samritchie avatar Aug 29 '24 11:08 samritchie

This was the least intrusive way I found without having to add a new type that only holds a DynamoDB client - that would also make the UX worse, since you will already have a TableContext with a DynamoDB client.

purkhusid avatar Aug 29 '24 11:08 purkhusid

You will already have multiple TableContexts with their own DynamoDB client, I’m wondering if it’s neater/more consistent/more obvious to just have an extra non-generic MultiTableContext that implements the Transact (& multi-table batch?) methods.
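A minimal sketch of what such a non-generic context might look like (hypothetical shape only; the type and member names here are assumptions, not part of the library):

```fsharp
// Hypothetical shape only; not part of FSharp.AWS.DynamoDB.
type MultiTableContext(client: IAmazonDynamoDB) =
    /// Executes a transaction whose items may span multiple tables
    member _.TransactWriteItems(items: TransactWriteItem seq) : Async<unit> =
        async { (* implementation elided in this sketch *) return () }
    /// Multi-table counterpart to the single-table batch write
    member _.BatchWriteItems(requests: (string * WriteRequest) seq) : Async<unit> =
        async { (* implementation elided in this sketch *) return () }
```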

samritchie avatar Aug 29 '24 12:08 samritchie

I'm not sure I like that approach, since it would just add one more thing that the end user has to create without any real value. The end user has already configured a context for each of their types, and I think it would be a step backwards if they had to configure one more thing to use transactions.

I still agree that it is kind of wonky that you can use any TableContext to do the actual transaction, but I think that is rooted in the fact that the DynamoDB client is request-based while this library is designed around a configured context.

Any concrete suggestions on how this API might work better are of course greatly appreciated.

purkhusid avatar Aug 29 '24 13:08 purkhusid

We could certainly add a convenience TableContext.MultiTableContext getter to minimise the ceremony for users who want to do ad hoc multi-table transactions, but creating an additional context would not be too big a deal for people already managing a larger number of tables.

Probably another alternative would be creating a TransactionBuilder type and threading it through a fluent (or piped) chain.

For batch operations I wouldn't want to lose the existing strongly typed single-call methods, so multi-table batch should be a new interface rather than a modification of the existing one. Mixing generic & non-generic methods on a single context is a bit redundant & likely to confuse.

samritchie avatar Aug 30 '24 04:08 samritchie

What about something like this:

namespace FSharp.AWS.DynamoDB

[<RequireQualifiedAccess>]
module TransactionBuilder =
    open Amazon.DynamoDBv2.Model
    open ExprCommon
    open Amazon.DynamoDBv2

    type Transaction = { RequestItems: (TransactWriteItem * IAmazonDynamoDB) seq }

    let createTransaction () : Transaction = { RequestItems = [] }

    let put
        (tableContext: TableContext<'TRecord>)
        (item: 'TRecord)
        (precondition: option<ConditionExpression<'TRecord>>)
        (transaction: Transaction)
        : Transaction =
        let req = Put(TableName = tableContext.TableName, Item = tableContext.Template.ToAttributeValues item)
        precondition
        |> Option.iter (fun cond ->
            let writer = AttributeWriter(req.ExpressionAttributeNames, req.ExpressionAttributeValues)
            req.ConditionExpression <- cond.Conditional.Write writer)
        { transaction with
            RequestItems = (Seq.append [ (TransactWriteItem(Put = req), tableContext.Client) ] transaction.RequestItems) }

    let check
        (tableContext: TableContext<'TRecord>)
        (key: TableKey)
        (condition: ConditionExpression<'TRecord>)
        (transaction: Transaction)
        : Transaction =
        let req = ConditionCheck(TableName = tableContext.TableName, Key = tableContext.Template.ToAttributeValues key)
        let writer = AttributeWriter(req.ExpressionAttributeNames, req.ExpressionAttributeValues)
        req.ConditionExpression <- condition.Conditional.Write writer
        { transaction with
            RequestItems = (Seq.append [ (TransactWriteItem(ConditionCheck = req), tableContext.Client) ] transaction.RequestItems) }

    let update
        (tableContext: TableContext<'TRecord>)
        (key: TableKey)
        (precondition: option<ConditionExpression<'TRecord>>)
        (updater: UpdateExpression<'TRecord>)
        (transaction: Transaction)
        : Transaction =
        let req = Update(TableName = tableContext.TableName, Key = tableContext.Template.ToAttributeValues key)
        let writer = AttributeWriter(req.ExpressionAttributeNames, req.ExpressionAttributeValues)
        req.UpdateExpression <- updater.UpdateOps.Write(writer)
        precondition |> Option.iter (fun cond -> req.ConditionExpression <- cond.Conditional.Write writer)
        { transaction with
            RequestItems = (Seq.append [ (TransactWriteItem(Update = req), tableContext.Client) ] transaction.RequestItems) }

    let delete
        (tableContext: TableContext<'TRecord>)
        (key: TableKey)
        (precondition: option<ConditionExpression<'TRecord>>)
        (transaction: Transaction)
        : Transaction =
        let req = Delete(TableName = tableContext.TableName, Key = tableContext.Template.ToAttributeValues key)
        precondition
        |> Option.iter (fun cond ->
            let writer = AttributeWriter(req.ExpressionAttributeNames, req.ExpressionAttributeValues)
            req.ConditionExpression <- cond.Conditional.Write writer)
        { transaction with
            RequestItems = (Seq.append [ (TransactWriteItem(Delete = req), tableContext.Client) ] transaction.RequestItems) }

    let writeItems (clientRequestToken: string option) (transaction: Transaction) : Async<unit> = async {
        let itemCount = Seq.length transaction.RequestItems
        if itemCount = 0 || itemCount > 100 then
            raise
            <| System.ArgumentOutOfRangeException(nameof transaction.RequestItems, "must be between 1 and 100 items.")
        // NB consumed-capacity metrics reporting is elided in this sketch
        let req = TransactWriteItemsRequest(TransactItems = (transaction.RequestItems |> Seq.map fst |> ResizeArray))
        clientRequestToken |> Option.iter (fun x -> req.ClientRequestToken <- x)
        let! ct = Async.CancellationToken
        // All items are assumed to share credentials; use the client attached to the first item
        let! response = (snd (Seq.head transaction.RequestItems)).TransactWriteItemsAsync(req, ct) |> Async.AwaitTaskCorrect
        if response.HttpStatusCode <> System.Net.HttpStatusCode.OK then
            failwithf "TransactWriteItems request returned error %O" response.HttpStatusCode
    }

module Usage =
    let request =
        TransactionBuilder.createTransaction ()
        |> TransactionBuilder.put ctx1 { Name = "John"; Age = 30 } None
        |> TransactionBuilder.put ctx1 { Name = "Jane"; Age = 25 } None
        |> TransactionBuilder.put ctx2 { SomethingElse = "Hello" } None
        |> TransactionBuilder.writeItems None
        |> Async.RunSynchronously

purkhusid avatar Aug 30 '24 09:08 purkhusid

I think TransactionBuilder is definitely better than methods on the context. My main concern here is the rest of the public API is method-based rather than modules & functions - there’s value in keeping it consistent/predictable.

samritchie avatar Sep 04 '24 02:09 samritchie

Yeah, I figured that using modules and functions was not really consistent with how the rest of the library operates. I'll create a new class that has methods instead and the class can keep the state internally.

purkhusid avatar Sep 04 '24 08:09 purkhusid

@samritchie I changed the API to use a class instead of a module. Please take a look when you have the time.

purkhusid avatar Sep 05 '24 14:09 purkhusid

@samritchie @bartelink Friendly reminder, would appreciate a review if you have some spare time for it.

purkhusid avatar Oct 07 '24 11:10 purkhusid

Thanks @purkhusid for your efforts here.

I’m happy with this design, just a bit unsure about naming (not wanting to start an endless 'naming things' discussion but would like to look at options). And yes, I know I suggested the name originally - it’s relevant that I felt this was the best name to communicate the concept.

However in .NET I'm expecting a type {Something}Builder to have a Build() method at the end of the chain (even though it's unnecessary here). In F# I'm expecting a type {Something}Builder to be a computation expression (CE).

A couple of other options off the top of my head:

    Transaction()
        .AddCheck(table1, key, existsConditionTable1)
        .AddPut(table2, item2) // or .WithCheck(), .WithPut()?
        .WriteItems()
    TransactWriteItems()
        .AppendCheck(table1, key, existsConditionTable1)
        .AppendPut(table2, item2)
        .Write()
    table1.BeginTransaction()
        .Check(table1, key, existsConditionTable1)
        .Put(table2, item2)
        .WriteTransaction()

I may be overthinking it, open to any sort of feedback here.

samritchie avatar Oct 09 '24 13:10 samritchie

I think just renaming the builder should be enough. The methods on the builder all follow the DynamoDB client naming, which I think is preferable. So I like it like this:

    table1.Transaction()
        .Check(table1, key, existsConditionTable1)
        .Put(table2, item2)
        .TransactWriteItems()

What do you think?

purkhusid avatar Oct 09 '24 15:10 purkhusid

@purkhusid yes - that looks good.

samritchie avatar Oct 09 '24 23:10 samritchie

@samritchie Updated the API

purkhusid avatar Oct 14 '24 10:10 purkhusid

@samritchie Added docs and addressed your comments.

purkhusid avatar Oct 15 '24 12:10 purkhusid

@bartelink Thanks for the review. These changes sound good to me if @samritchie likes them as well.

purkhusid avatar Oct 24 '24 11:10 purkhusid

Yes, that seems okay; we don't have particularly rapid uptake of the new version. Re: metricsCollector, I'd like to see this go eventually and expose the same data via OpenTelemetry - it was always a bit of a stop-gap solution.

samritchie avatar Oct 25 '24 07:10 samritchie

Yeah, I've been hiding from the need to go OT in Equinox too - the core and the MessageDb piece are wired, but the Cosmos, Dynamo and ESDB drivers are not yet. Using that does nicely sidestep my earlier non-proposal of having something to wrap the client and collector.

bartelink avatar Oct 25 '24 07:10 bartelink

There’s AWS SDK OpenTelemetry instrumentation in beta, but I’ve had a look and it’s not very useful; one of the perils of a codegen SDK. Having something implemented here would be more useful to more people. I’ll have a look & see if any other dynamo clients have good naming conventions to follow.

samritchie avatar Oct 25 '24 08:10 samritchie