py-retrosheet icon indicating copy to clipboard operation
py-retrosheet copied to clipboard

SQLite errors

Open pqnelson opened this issue 9 years ago • 0 comments

Hello,

So, I am using the latest version of py-retrosheet with SQLite3. There are two problems I've run into, namely:

  1. Queries of the form 'SELECT * FROM rosters WHERE year = %s AND player_id = %s AND team_tx = %s' are malformed, and
  2. SQLite's use of rowcount is botched

Malformed Queries. The initial run, it connects to the database, but when trying populating my database with 2014 data...it throws errors. I saw on line 58-59 of parse.py the queries look like

        sql = 'SELECT * FROM rosters WHERE year = %s AND player_id = %s AND team_tx = %s'
        res = conn.execute(sql, [row[0], row[1], row[6]])

But shouldn't the queries be:

        sql = 'SELECT * FROM rosters WHERE year = ? AND player_id = ? AND team_tx = ?'
        res = conn.execute(sql, [row[0], row[1], row[6]])

Note the change of %s to ?. This appears to be how SQLalchemy expects parameters (at least, when I made these changes locally, it worked for SQLite).

SQLite Cannot Count. This bug is really SQLite's fault.

The second run, I try download.py -y 2013, then parse.py -y 2013. It blows up. There's a problem with SQLalchemy's SQLite implementation for rowcount (c.f., this blog post). Consequently, parse.py thinks the Anaheim Angels are not in the teams table, so it attempts to insert them in...causing an error.

The solution is, according to SQLalchemy's documentation, we shouldn't be using results.rowcount() at all (the doc claims it makes sense only after an UPDATE or DELETE to call results.rowcount()). We should instead be using len(results.fetchall()), which may be dangerous in general (since we don't know how many rows may be fetched)...but since we're expecting at most 1 row, the change appears harmless.

pqnelson avatar Jul 12 '15 18:07 pqnelson