fluent-kit icon indicating copy to clipboard operation
fluent-kit copied to clipboard

Optimisations introduced in 1.35.0 cause SQLQueryEncoder to crash

Open michal-tomlein opened this issue 2 years ago • 8 comments

Describe the bug

The performance and behavior improvements for encoding/decoding Fluent relations seem to cause SQLQueryEncoder to crash when encoding even the simplest of models.

To Reproduce

Steps to reproduce the behavior:

  1. Add package with configuration:
// swift-tools-version:5.5

import PackageDescription

let package = Package(
    name: "SQLQueryEncoderCrash",
    platforms: [
        .macOS(.v12),
    ],
    dependencies: [
        .package(url: "https://github.com/vapor/fluent.git", from: "4.0.0"),
        .package(url: "https://github.com/vapor/fluent-kit.git", "1.0.0"..."1.35.0"),
//        .package(url: "https://github.com/vapor/fluent-kit.git", "1.0.0"..<"1.35.0"),
        .package(url: "https://github.com/vapor/sql-kit.git", from: "3.5.0"),
    ],
    targets: [
        .executableTarget(name: "App", dependencies: [
            .product(name: "Fluent", package: "fluent"),
            .product(name: "FluentSQL", package: "fluent-kit"),
            .product(name: "SQLKit", package: "sql-kit"),
        ]),
    ]
)
  1. Add the following to main.swift and run (should crash):
import Fluent
import SQLKit

final class File: Model {
    static let schema = "files"

    @ID(custom: .id)
    var id: Int?

    @OptionalField(key: "fileName")
    var fileName: String?

    init() {}
}

let file = File()
let row = try SQLQueryEncoder().encode(file)

print(row)
  1. Uncomment the alternative package dependency and re-run (will not crash).

Expected behavior

Unless I’m somehow using this API wrong, this should not crash.

Environment

  • Vapor Framework version: 4.65.2
  • OS version: macOS 12.6

michal-tomlein avatar Oct 04 '22 08:10 michal-tomlein

Btw, is there an alternative approach to get row data (keys and a SQLExpression for each value) that might work? My current workaround is pinning to the previous version of FluentKit. Thank you.

michal-tomlein avatar Oct 26 '22 12:10 michal-tomlein

While this crash should not be happening, this is not expected usage for SQLKit. What use case is revealing this behavior?

gwynne avatar Oct 28 '22 16:10 gwynne

I have a SQLUpsert struct similar to SQLKit’s SQLInsert. I use SQLQueryEncoder to fill its columns and values properties.

michal-tomlein avatar Oct 28 '22 16:10 michal-tomlein

I just noticed there’s SQLConflictResolutionStrategy, which might work for me (but doesn’t remove the need for SQLQueryEncoder). Can you elaborate on why this is not expected usage for SQLKit? I rely on SQLKit quite heavily and would like to understand how it’s meant to be used.

michal-tomlein avatar Oct 28 '22 16:10 michal-tomlein

@michal-tomlein Upserts have been fully supported via the .onConflict() method available on SQLInsertBuilder for some time now - it supports specifying most of the criteria allowed by PostgreSQL's conflict handling syntax, and correctly adapts its syntax for SQLite and MySQL (though with reduced functionality, especially in MySQL's case - in particular MySQL can't distinguish conflicts based on which constraint was involved). For example usage, have a look at the benchmark tests for upserts (sql-kit/Sources/SQLKitBenchmark/SQLBenchmark+Upsert.swift)

As regards SQLQueryEncoder, it is definitely broken when it comes to directly encoding Fluent models into SQLKit queries - the unexpected usage is in doing exactly that; the FluentSQL module provides overrides on the decoding side of things so that Fluent models can be directly decoded from SQLKit queries via FluentKit's own facilities rather than the more generic versions which decode any Decodable. There are, however, no equivalent overrides for encoding Fluent models into SQLKit queries - doing so is unexpected because it triggers Fluent's use of reflection, which effectively negates the majority of any performance boost otherwise gained via SQLKit (the same is true for decoding, but there have historically been many more use cases for that than for encoding). Aside from greater control over the generated query (such as the ability to specify upserts! 🙂), it's no better to use SQLKit to insert a Model this way than it is to do it with Fluent, so there's little focus on supporting it - as is obvious, there isn't even a simple unit test for it (though there will be shortly!!).

All this being said, the problem is fairly straightforward: When you encode a Fluent model through SQLKit, you end up (more or less) directly invoking the Encodable logic on Fields, which asks each individual property wrapper to encode itself via FluentKit's AnyCodableProperty protocol. @OptionalField's implementation (correctly) encodes itself into a singleValueContainer(), effectively deferring to the built-in conformance for Optional. And SQLQueryEncoder, having gone with the assumption that it will never have to deal with a superEncoder(forKey:)-style situation where multiple containers are requested at the same nesting level, just crashes. As it happens, I've just finished fixing what boils down to the same issue in both PostgresDataEncoder and MySQLDataEncoder, and this definitely needs to be fixed as well. I'll try to get to it as soon as I can.

In the meantime, the only workaround I can suggest offhand is to use a DTO - either copy your data into a plain structure before encoding via SQLKit, or ideally, skip the use of a Fluent model altogether in favor of structs with Codable conformance. Considering the amount this can benefit the performance of your SQLKit code, I'd tend to recommend doing it that way in any event (don't worry, this issue will get fixed regardless 🙂).

gwynne avatar Oct 30 '22 09:10 gwynne

Thank you for taking the time to share this insight! This is very helpful.

I’m not looking to gain any performance benefit by using SQLKit directly. My upsert is part of a more complex WITH query, which I don’t think is possible with SQLInsertBuilder (and probably shouldn’t be).

I’ll definitely look into using DTOs. I started out writing Fluent models, but for better or worse ended up with a SQLKit-heavy codebase over the past few years (which was a royal pain to migrate from Vapor 3 to 4 😅). So all the reflection is probably hurting me more than helping.

michal-tomlein avatar Oct 30 '22 09:10 michal-tomlein

On the contrary! I would love to support CTEs on all DQL/DML query builders (even MySQL supports them in 8.0) - and you can achieve the effect fairly easily already:

struct SQLCommonTableExpression: SQLExpression {
    var name: SQLExpression
    var query: SQLExpression
    func serialize(to serializer: inout SQLSerializer) {
        serializer.statement {
            $0.append(self.name)
            $0.append("AS")
            $0.append(self.query)
        }
    }
}
struct SQLWithClause: SQLExpression {
    var ctes: [SQLExpression]
    func serialize(to serializer: inout SQLSerializer) {
        serializer.statement {
            $0.append("WITH")
            $0.append(SQLList(ctes))
        }
    }
}

From there I'd probably define an SQLCTEBuilder based on SQLUnionBuilder's API... the next obvious thing to do would be to add an optional withClause property to the SQLQueryBuilder protocol, but that doesn't restrict the types of queries that can use CTEs, so I'd probably end up defining an SQLWithClauseBuilder protocol for the appropriate subset of builders to conform to...

Okay, now you've just got me brainstorming an implementation 😅 Unfortunately I have to fix the encoder first 🤣

gwynne avatar Oct 30 '22 15:10 gwynne

Haha 😅 That code is very similar to what I have now. I didn’t write a builder, but it would definitely be a great addition to the API.

michal-tomlein avatar Nov 01 '22 15:11 michal-tomlein