soto-core icon indicating copy to clipboard operation
soto-core copied to clipboard

Performance issues with `DictionaryDecoder`

Open stsandro opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? Please describe. With my Swift Lambda functions, I could not get to the same performance level as with my Python Lambda functions. So I did some digging. And I discovered that decoding the output is where the most time is lost.

https://github.com/soto-project/soto-core/blob/87d80c199e90de0413b4dd99fbd23c4eae57b88f/Sources/SotoCore/Message/AWSResponse.swift#L164

I looked at the code and saw that this DictionaryDecoder is based on the code from Swift Foundation's JSONDecoder from 2019. Since then there has been a huge performance improvement to that code. See: https://github.com/swift-server/swift-aws-lambda-runtime/issues/153#issuecomment-797476001

I did some very quick and dirty tests by replacing:

if let data = buffer.getData(at: buffer.readerIndex, length: buffer.readableBytes, byteTransferStrategy: .noCopy) {
    // if required apply hypertext application language transform to body
    if self.isHypertextApplicationLanguage {
        outputDict = try self.getHypertextApplicationLanguageDictionary()
    } else {
        outputDict = try JSONSerialization.jsonObject(with: data, options: []) as? [String: Any] ?? [:]
    }
    // if payload path is set then the decode will expect the payload to decode to the relevant member variable
    if let payloadKey = payloadKey {
        outputDict = [payloadKey: outputDict]
    }
    decoder.dateDecodingStrategy = .secondsSince1970
}

with:

let jsonDecoder = JSONDecoder()
jsonDecoder.dateDecodingStrategy = .secondsSince1970

let out = try jsonDecoder.decode(Output.self, from: buffer)

return out

It worked, at least for my simple use case of fetching some data from a DynamoDB, and it brought me to the same or better performance level as my Python code.

Describe the solution you'd like I would really like to use Foundation's JSONDecoder because it would mean less code to maintain. I know there is also a desire to reduce the dependency on Foundation, but that goal seems not very high on the priority list.

Describe alternatives you've considered We could also port the code from the new and faster JSONDecoder over to DictionaryDecoder.

Additional context I want to give some context on the possible performance gains here. I created a Lambda function just to test this. This Lambda function only fetches a few objects from DynamoDB with a batchGetItem() request. Nothing else. When the Lambda function is warm, a request takes ~115ms vs. ~15ms. This brings my Swift lambdas into the same performance realm as my Python functions, maybe even slightly better. To achieve the same level of performance per request I would have to more than double the amount of memory for the Lambda and therefore double the cost. IMHO this is worth some effort because it would increase the performance and cost-effectiveness of many Lambda functions and probably other code too.

With all that said, I would like to thank @adam-fowler and all the others that contribute to this project for their effort. I appreciate your hard work 🙏 I would be willing to implement a solution for this if someone would assist me to lead me in the right direction. This has to be rock solid, as JSON decoding is very common and many projects that use Swift on AWS use soto.

stsandro avatar Sep 20 '22 10:09 stsandro

We can't use JSONDecoder because we need to edit the decoded dictionary before decoding to response shape. Response shapes can have data coming from both the body and header values of a response. If a response shape requires a header value, the decode of the response shape would fail if it didn't find it in the JSON body.

To get around this I use JSONSerialization to get a dictionary and then I augment the dictionary with the header values and then use the DictionaryDecoder to decode this to a response shape. So it isn't a case of just replacing the decoder.

adam-fowler avatar Sep 22 '22 14:09 adam-fowler

A possible solution is to copy across the latest implement of JSONDecoder to SotoCore and break it up so we can load a JSON file into a dictionary format that we can augment with header values and then run the decode on that dictionary

adam-fowler avatar Sep 26 '22 08:09 adam-fowler

A possible solution is to copy across the latest implement of JSONDecoder to SotoCore and break it up so we can load a JSON file into a dictionary format that we can augment with header values and then run the decode on that dictionary

What do you mean by copy across the latest implement of JSONDecoder? Do you suggest bringing the performance improvements to DictionaryDecoder by copying the code from JSONDecoder? That was what I had in mind as an alternative solution. Maybe I could set aside some time this week to start working on this.

stsandro avatar Sep 26 '22 08:09 stsandro

By copying across the latest implementation of JSONDecoder I mean copy the code from https://github.com/apple/swift-corelibs-foundation/blob/main/Sources/Foundation/JSONDecoder.swift. You'd also need JSONSerialization.swift and JSONSerialization+Parser.swift. Hopefully there aren't any other requirements.

We would need to break up the function func decode<T: Decodable>(_ type: T.Type, from data: Data) so we can access the parsed json data before it is sent to the decoder. Or as an alternative we could add a little extra code in that function that augments the json data with header values passed in via an additional argument. eg

func decode<T: Decodable>(_ type: T.Type, from data: Data, headers: HTTPHeaders) {
        do {
            var parser = JSONParser(bytes: Array(data))
            var json = try parser.parse()
            addHeadersTo(json: &json, headers: headers)
            return try JSONDecoderImpl(userInfo: self.userInfo, from: json, codingPath: [], options: self.options).unwrap(as: T.self)
            ...
}

adam-fowler avatar Sep 26 '22 17:09 adam-fowler

@stsandro if you'd like to work on this that'd be great.

adam-fowler avatar Sep 26 '22 17:09 adam-fowler

I'll look into this. Hopefully, I have some time that I can spend on this by the end of the week. It would be really nice to get some performance gains here. Thanks for pointing me in the right direction.

stsandro avatar Sep 26 '22 22:09 stsandro

I have started working on v7.x and the 7.x.x branch has now removed DictionaryDecoder. It uses the standard JSONDecoder, the raw AWSResponse has been added as userInfo and decoding of header and payload has been pushed to the init(from decoder functions in the service files

adam-fowler avatar Jul 19 '23 13:07 adam-fowler