norm icon indicating copy to clipboard operation
norm copied to clipboard

[FeatureRequest] Add a public "fromRow"-like proc that allows handing a raw SQL statement and only do conversion to any object type

Open PhilippMDoerner opened this issue 2 years ago • 9 comments

Heyho!

As part of the web application that I am writing I am utilizing sqlite's full text search feature FTS5. It's really useful and provides basic search utilities which means you can do somewhat elaborate searches without needing to add something heavy like elasticsearch. However, that table looks in DDL something like this:

CREATE VIRTUAL TABLE search_article_content USING fts5( title, title_rev, body, body_rev, table_name, record_id, campaign_id, guid )

And I want to query content on that table with an SQL Query like this

            SELECT 
                table_name,
                record_id,
                title,
                body,
                guid,
                bm25(search_article_content, 15) as search_score -- 15 states that a match in the title is 15 times as valuable as a match in the body.
            FROM search_article_content 
            WHERE
                campaign_id = %s
                AND (
                    title MATCH %s
                    OR title_rev MATCH %s
                    OR body MATCH %s
                    OR body_rev MATCH %s
                )
            ORDER BY search_score
            LIMIT %s;

All I want from norm at this point, is the functionality of taking my very custom SQL statement, make a db query with it and transforming the data it receives into an object whose type I give it. Not even an object that inherits from Model really, since I don't want to infer anything from the model, it's job should solely be to be the transformation target as everything else should be determined by my raw sql query.

The proc I imagine could have a signature like this:

rawSelect(con: DbConn, query: SqlQuery, targetEntries: seq[T])

T in this scenario explicitly only needs to be an object, nothing more.

Would such a feature even be desirable for norm?

PhilippMDoerner avatar Jan 23 '22 19:01 PhilippMDoerner

Correct me if I'm wrong @moigagoo but given how tightly norm interacts with models, I think conversion of Row to Model is only possible when norm generated the query, correct? So I don't think there's any of the code where norm converts from Row to a Model that could be used in the scenario where the user themselves writes the entire SQL Query.

Thus I'm getting the feeling that, if we want such a proc, it would need its own "convert to object" implementation instead of hooking into anything norm has already written.

For my own needs, I have currently solved this by using nisane, which tries to do exactly that (albeit some wonky behaviour and its very macro heavy). For something with nisane's API, I can even provide some procs on how I work with them:

import std/[db_sqlite]
import nisane 
import genericNisaneRepository

export nisane
export sequtils

type NotFoundError* = object of KeyError
type InvalidQueryError* = object of KeyError

proc parseRow[T: object](row: Row, containerType: typedesc[T]): T =
  row.to(result, nil)

proc parseRow[T: ref object](row: Row, containerType: typedesc[T]): T =
  result = new(T)
  row.to(result, nil)

proc rawSelect*[T](connection: DbConn, containerType: typedesc[T], sqlQueryString: static string, queryParams: varargs[string]): seq[T] =
    const sqlQuery = sql(sqlQueryString)
    var rows: seq[Row] = connection.getAllRows(sqlQuery, queryParams)
    result = rows.map(row => parseRow(row, T))

  proc rawSingleSelect*[T](connection: DbConn, containerType: typedesc[T], sqlQueryString: static string, queryParams: varargs[string]): T =
    let queryResult = connection.rawSelect(T, sqlQueryString, queryParams)

    case(queryResult.len()):
    of 0:
      raise newException(NotFoundError, "Could not find any rows for query")
    of 1:
      result = queryResult[0]
    else:
      raise newException(InvalidQueryError, fmt"Found more than 1 row for query that expected to find a single row.")

Example usage:


import norm/[model, sqlite]
import constructor/defaults

type Creature* {.defaults.} = ref object of Model
    ##[TableModel of the table of creatures ]##
    name*: string = ""

implDefaults(Creature)
proc newModel*(T: typedesc[Creature]): Creature = newCreature()

let con = open(":memory:", "", "", "")
con.createTables(newModel(Creature))

var c1 = Creature(name: "lula")
var c2 = Creature(name: "lula")
con.createEntry(c1)
con.createEntry(c2)

const query = "SELECT * FROM Creature"
type CreatureObj* = object
  name*: string
  id*: int64

let x = con.rawSelect(CreatureObj, query) # If you try out rawSingleSelect with this query, this will raise an Exception as one would anticipate
echo x.repr

This is NOT an implementation suggestion! For norm we'd naturally need to implement an adjusted version to this, as nisane relies on Row to be defined as Row = seq[string] while norm has ndb which defines Row as Row = seq[DbValue].

However, given nisane's existence I see a way how this all could work. It's mostly a mattter of: Do we want it? In my personal opinion, this would be convenient for users in order to also deal with/execute complex SQL (Recursive queries for example, or the search query as above).

PhilippMDoerner avatar Jul 17 '22 14:07 PhilippMDoerner

@PhilippMDoerner I've just looked at the fromRow implementation and it's pretty simple: https://github.com/moigagoo/norm/blob/28121e0c334551f4b69e8f18f0f6bd997dc8aeaf/src/norm/private/sqlite/rowutils.nim#L17

The complexity comes from the needing to populate nested models.

I think it shouldn't be hard to implement an dumber version that would just concert every row item to the field type of a provided object. The same conversion rules defined in dbtypes would apply.

moigagoo avatar Jul 18 '22 20:07 moigagoo

And given that you can actually get a raw Row with just plain ndb select, you could achieve your goal.

One thing that I have concerns about is that this functionality feels alien to the concept of Norm. It's more of a general-purpose util.

moigagoo avatar Jul 18 '22 20:07 moigagoo

I agree and I think it's a matter of weighing options. Django for example provides that functionality, but it also has the goal of always being your only gateway to the database, since it's "signal" concept and various other things hinge on the idea that you don't bypass it .

It's a question of what scope is desired for norm, what the desired vision is, that's why I included the question on whether such a feature would be desired (implement in sqlite/postgres modules), tolerated (e.g. by having such a proc in a general utils module that must be imported separately), or not wanted due to being out of scope.

I'm not sure what the fully desired scope of norm is, thus I'm asking.

PhilippMDoerner avatar Jul 19 '22 15:07 PhilippMDoerner

@moigagoo Could you tell me whether you'd want such a feature?

I have my own solution for this already with nisane, so I wouldn't want to go through the effort of figuring out how to implement this only for us to reach the conclusion that this is out of scope for norm.

PhilippMDoerner avatar Jul 22 '22 18:07 PhilippMDoerner

@PhilippMDoerner I think we already have it?

Here's an example:

nim> import norm/model
nim> import norm/sqlite
nim> import norm/private/sqlite/rowutils
nim> type
....   M = ref object of Model
....     name: string
....
nim> let r = @[?"foo", ?1]
nim> var m = M()
nim> m.fromRow(r)
nim> echo m[]
(name: "foo", id: 1)
nim> let rr: Row = m.toRow(force = true)
nim> echo rr
@['foo', 1]

moigagoo avatar Jul 23 '22 10:07 moigagoo

Nice that we have part of the problem somewhat covered! I likely should write up some docs for that or sth. However, there are 2 points I would want to bring up:

First, this is only the conversion part. Ideally I'd want to have an entire convenience proc with roughly this signature proc rawSelect[T](connection: DbConn, sqlQuery: string, objectType: typedesc[T], params: varargs[DbValue, dbValue] so that I can use such a proc for example to do more complicated count statements. (Alternatively, instead of objectType it could also be an instantiated object for consistency with other norm-procs, which would allow us once again to side-step the issue of initialization).

Here an example for a more complicated count query that currently can't be executed by norm (but it really is just an example, you can also imagine me aggregating over a join of 5 tables with some sort of Grouping in there, SQL can get complicated).

const sqlStmt = """
  SELECT COUNT(*)
  FROM wikientries_creature creature
  INNER JOIN wikientries_campaign campaign ON creature.campaign_id = campaign.id
  WHERE campaign.name LIKE ?
"""
let campaignName = "SomeCampaign"
let campaignCreatureCount = connection.rawSelect(sqlStmt, int64, campaignName.dbValue())

Or do some obscure joins over 5 tables to aggregate some data in complicated SQL fashion in a single query. fromRow is the lions share of the work in that proc, but a rawSelect proc (or whatever its name) would still need to be written (which executes the sql query directly and passes received rows to fromRow and returns the output).

Second, the current fromRow comes with some limitations that I would want to bypass somehow. These are:

  1. The type you want to convert to must be a model.
  2. Deriving from that the type must be a model, it also means that whatever Row instance you feed fromRow it must have an "id" column.
  3. Deriving from that the type must be a model, it also means that you can not convert a Row into a value-type object should you want to.

PhilippMDoerner avatar Jul 23 '22 14:07 PhilippMDoerner

I've got a first draft going that currently can't deal with nested objects because hot damn do I not get why it doesn't properly work. I think there's a copy happening somewhere in my other versions that leads to data loss somewhere.

All proc names below are placeholders, I think if we were to include them it might make more sense to just keep calling them fromRow and add a "not Model" in the type definition or something.


func isObject*[T: object | ref object](val: T): bool = true
func isObject*[T: object | ref object](val: Option[T]): bool = true
func isObject*[T: not object and not ref](val: Option[T]): bool = false
func isObject*[T: not object and not ref](val: T): bool = false

func toObjectOptional*[T: object | ref object](val: T): Option[T] = some val
func toObjectOptional*[T: object | ref object](val: Option[T]): Option[T] = val
func toObjectOptional*[T: not object and not ref object](val: Option[T]): Option[T] = none(T)
func toObjectOptional*[T: not object and not ref object](val: T): Option[T] = none(T)

#func to[T: object | ref object, M](obj: T, t: typedesc[M]): M = 

template fieldPairsR(t: untyped) =
  when t is object:
    t.fieldPairs
  else:
    t[].fieldPairs

proc fromRawRowPos[T: object | ref object](obj: var T, row: Row, pos: var Natural, skip: bool = false) =
  for field, value in fieldPairsR(obj):
    echo "chief"
    if isObject(value): # value type is object, so maybe must fill that # object, string, int, float
      var valueOptional = value.toObjectOptional()
      if valueOptional.isSome(): # value has object to fill, try to do so
        # parse row into object
        var valueRef = valueOptional.get()

        # row has no data to give
        let columnIsNull = row[pos].kind == dvkNull
        if columnIsNull:
          #When value is an optional of object give it a null type
          when value is Option:
            when value.get() is object or value.get() is ref object:
              value = none typeof(valueRef)
            #When value is an optional of something else
          

      else: # value contains no object to fill, skip
        inc pos
    
    elif not skip:
      value = row[pos].to(typeof(value))
      inc pos
    
    else:
      inc pos

proc fromRawRow*[T: object | ref object](obj: var T, row: Row) =
  ##[ Populate `Model`_ instance from ``ndb.postgres.Row`` instance.

  Nested `Model`_ fields are populated from the same ``ndb.postgres.Row`` instance.
  ]##

  var pos: Natural = 0
  obj.fromRawRowPos(row, pos)

The above works with code that has NO nested objects. I've been obviously mostly stealing from the code that is already there and just renamed a couple things to make it more understandable for me. You might notice it's missing a section right after val = none typeof(subMod):

          inc pos
         subMod.fromRowPos(row, pos, skip = true)  ## Then we skip all the ``row`` values that should've gone into this submodel.

       else:                                       ## If ``row[pos]`` is not a ``NULL``,
         inc pos                                   ##
         subMod.fromRowPos(row, pos) 

I can't get it to work. I mean, it doesn't cause compile errors or anything, but it falsely gives you the assumption that nested objects will be correctly parsed, which they won't.

I tried with this code on a local sqlite db of mine:


type NestIds* = object
  record_id: Option[int64]
  guid: string

type Hit* = ref object
  title: string
  table_name: Option[string]
  record_id: Option[int64]
  guid: string

let con = open("mydb.sqlite3", "", "", "")
let rows = con.getAllRows(sql"SELECT title, table_name, record_id, guid FROM search_article_content LIMIT 10")

var objs: seq[Hit] = @[]
echo Hit().isObject()
for row in rows:
  var obj: Hit = Hit(title: "", table_name: some(""), record_id: some(-1.int64), guid: "")
  obj.fromRawRow(row)
  objs.add(obj)

 #If you don't want to use treeforms pretty print module, just use "echo objs.repr"
print objs

And found that NestedIds does not get filled properly. In fact, if NestIds is a ref object it actually causes compiler errors. I'll need to figure out why that is.

PhilippMDoerner avatar Aug 09 '22 06:08 PhilippMDoerner

@moigagoo I've settled on an implementation for the parsing of Row to arbitrary object. And I think it now comes down whether this should be in norm, some sort of addon for norm or its own package. I think trying to support object for this is a fool's errand, precisely as you stated because of nested objects. Thus I swapped to only supporting ref object types.

I then basically just copy pasted getRowPos and swapped out the bits that were model specific and moved some stuff to compile-time because I could. See the new PR for an implementation

The new approach is surprisingly fast btw. , it seems to be able to fulfill the very same job as the current fromRowPos proc. Comparing my second implementation here via benchy gave me around a 10% speed up.

PhilippMDoerner avatar Aug 10 '22 17:08 PhilippMDoerner

Through merging #164 this issue is no longer necessary. We can now fire raw SQL select queries and have the output be parsed into a custom ref object type of our choosing.

Therefore I'm closing the issue.

PhilippMDoerner avatar Sep 10 '22 17:09 PhilippMDoerner