SQLite.swift icon indicating copy to clipboard operation
SQLite.swift copied to clipboard

Add FTS pattern sanitizer

Open CSchwang opened this issue 8 years ago • 10 comments

An invalid FTS match query leads to a crash. E.g. trying to execute X.match("hello (something") gives:

fatal error: 'try!' expression unexpectedly raised an error: malformed MATCH expression: [hello (adsfasdf]: file /Library/Caches/com.apple.xbs/Sources/swiftlang/swiftlang-700.1.101.15/src/swift/stdlib/public/core/ErrorType.swift, line 50

Sample playground showing the behavior: https://gist.github.com/CSchwang/937f9ae9fcbf8e6e0152

It would be nice to be able to handle the error instead of crashing

CSchwang avatar Dec 27 '15 17:12 CSchwang

Merging PR #290 would address this..

ryanholden8 avatar Dec 31 '15 17:12 ryanholden8

Thanks for the gist, that makes it so much easier!

This still fails even after PR #290, due to this code in Connection.swift:

    public func pluck(query: QueryType) -> Row? {
        return try! prepare(query.limit(1, query.clauses.limit?.offset)).generate().next()
    }

Now that db.prepare can throw, I suspect pluck should be able to also. However perhaps this is a philosophical issue for @stephencelis to weigh in on.


And, fyi, in case the gist disappears, here is the repro code, slightly tweaked to allow for #290 change:

import SQLite

let db = try! Connection()

db.trace { print($0) }

let emails = VirtualTable("emails")

let subject = Expression<String?>("subject")
let body = Expression<String?>("body")

try! db.run(emails.create(.FTS4(subject, body)))

try! db.run(emails.insert(
    subject <- "Hello, world!",
    body <- "This is a hello world message."
    ))

let workingRow = db.pluck(emails.match("hello"))

// crashes here
let failingRow = db.pluck(emails.match("hello (adsfasdf"))

mikemee avatar Jan 09 '16 22:01 mikemee

@mikemee is right in wondering if there is a philosophical reason for pluck causing a fatal error and prepare throwing. From the discussion:

  1. On the one hand, recovering from errors is important when you're dealing with validation and other recoverable failures.
  2. On the other hand, SQL statements that read from the database generally shouldn't fail unless the statement is invalid (e.g., a syntax error, which is usually a developer issue). Because of this, I'm wondering if we should only promote prepare to be throws, but let the others remain fatal. With this change, developers that need the flexibility of error recovery can use db.prepare, but the rest can generally use the simpler, non-throwing APIs (scalar, pluck, etc.).

I'm open to discussion on these points, but I'd like to minimize the need for explicit error handling (and noise) wherever possible. At the moment I'm inclined to not consider this a bug (but it might require further documentation/reasoning).

stephencelis avatar Jan 10 '16 23:01 stephencelis

Not to be contrary, but I can easily imagine how the argument to match could be a user generated string, say. If so, how can this be tested beforehand to avoid the crash? I guess we're saying everyone should use prepare first, just in case?

Darn error handling!

mikemee avatar Jan 11 '16 04:01 mikemee

Note that #259 is also having a problem with db.pluck crashing (though seemingly for multi-threading and/or database contention problems).

It does really suck when errors can't be caught ... I fear there is little choice here but to eventually remove all the try! invocations and allow the errors to propagate. Philosophically, the strong typing already takes care of many errors, but imnsho database code can always fail for strange reasons...

mikemee avatar Jan 11 '16 04:01 mikemee

We just got caught by this. Our queries are user generated. For now, we're stripping non-alphanumerics, but it'd be great to match on other characters. Can we fix it at least for the match method, or does it require deeper plumbing?

tomtaylor avatar Jun 08 '16 17:06 tomtaylor

I just get caught by this, too. If I put a parenthesis into the query, it would crash. I can work around by not allowing user to enter this character, but this not seem to be a proper solution

trongcuong1710 avatar Oct 27 '16 09:10 trongcuong1710

You can look at the way GRDB.swift handles search pattern sanitization (FTS3/4, FTS5).

groue avatar Oct 27 '16 09:10 groue

I just wanted to provide a FTS3Pattern I implemented based on @groue's solution from GRDB.swift translated to SQLite.swift.

import SQLite

/// A full text pattern that can query FTS3 and FTS4 virtual tables.
public struct FTS3Pattern: RawRepresentable {

  /// The raw pattern string. Guaranteed to be a valid FTS3/4 pattern.
  public let rawValue: String

  /// Attempts to create a pattern from a raw pattern string.
  ///
  /// This method checks to see if the provided `rawValue` is valid FTS pattern syntax. If it is valid then the value
  /// is stored, unmodified, in `rawValue`. If it is not, then `nil` is returned from the initializer.
  ///
  /// - Important: The check is done by creating an empty in memory virtual table and then performing the query against
  /// that virtual table. Any raised error indicates that the value provided is not valid.
  /// - SeeAlso: https://github.com/groue/GRDB.swift/blob/7af96a69a02a1a10073af0a0e6d66fd24d2b7df7/GRDB/FTS/FTS3Pattern.swift
  /// - Parameter rawValue: The potential FTS pattern to check.
  public init?(rawValue: String) {
    let sql = """
    CREATE VIRTUAL TABLE documents USING fts3();
    SELECT * FROM documents WHERE (content MATCH '\(rawValue)');
    """

    do {
      // Create empty in memory virtual table and perform sample query
      try Connection(.inMemory).execute(sql)
    } catch {
      return nil
    }

    self.rawValue = rawValue
  }

}

Additionally, here is an extension to VirtualTable that allows it to be used as a drop in replacement for a String.

extension VirtualTable {
  /// Builds a copy of the query with a `WHERE … MATCH` clause.
  ///
  ///     let emails = VirtualTable("emails")
  ///
  ///     emails.match("Hello")
  ///     // SELECT * FROM "emails" WHERE "emails" MATCH 'Hello'
  ///
  /// - Parameter pattern: An already validated FTS3Pattern.
  /// - Returns: A query with the given `WHERE … MATCH` clause applied.
  func match(_ pattern: FTS3Pattern) -> QueryType {
    self.match(pattern.rawValue)
  }
}

RLovelett avatar Oct 12 '20 20:10 RLovelett

@RLovelett, thanks for the mention of GRDB. Beware that the code you provide embeds a raw string right into SQL, which should always be avoided:

SELECT * FROM documents WHERE (content MATCH '\(rawValue)');

groue avatar Oct 13 '20 05:10 groue