intelmq icon indicating copy to clipboard operation
intelmq copied to clipboard

Type error with SQL output bot's prepare_values returning list instead of tuple

Open creideiki opened this issue 3 years ago • 2 comments

After rebasing our local branch onto mainline for the first time in way too long, our SQL output bots started crashing when inserting data into the database, with the following backtrace:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/intelmq/lib/bot.py", line 319, in start
    self.process()
  File "/usr/local/lib/python3.6/site-packages/intelmq/bots/outputs/sql/output.py", line 56, in process
    if self.execute(query, values, rollback=True):
  File "/usr/local/lib/python3.6/site-packages/intelmq/lib/mixins/sql.py", line 118, in execute
    self.cur.execute(query, values)
  File "src/pymssql/_pymssql.pyx", line 460, in pymssql._pymssql.Cursor.execute
  File "src/pymssql/_mssql.pyx", line 1104, in pymssql._mssql.MSSQLConnection.execute_query
  File "src/pymssql/_mssql.pyx", line 1135, in pymssql._mssql.MSSQLConnection.execute_query
  File "src/pymssql/_mssql.pyx", line 1252, in pymssql._mssql.MSSQLConnection.format_and_run_query
  File "src/pymssql/_mssql.pyx", line 1274, in pymssql._mssql.MSSQLConnection.format_sql_command
  File "src/pymssql/_mssql.pyx", line 2038, in pymssql._mssql._substitute_params
ValueError: 'params' arg (<class 'list'>) can be only a tuple or a dictionary.

This seems to be because after #2223 all values going to the database now pass through the function prepare_values() in the SQL output bot: https://github.com/certtools/intelmq/blob/e411d4ff0150aeb7fc386895ee4e7cc32984b907/intelmq/bots/outputs/sql/output.py#L60-L67

prepare_values() takes parameters in the form of a tuple but returns them transformed in the form of a list, which is then passed directly as the parameters argument to the database API's Cursor.execute() method.

PEP-0249 is not explicit on the type of parameters, but pymssql accepts an atom or tuple, psycopg2 accepts a tuple or dictionary and sqlite3 is also not explicit, but defaults to a empty tuple.

Notably, none of the three database APIs supported by IntelMQ are documented to accept the list that prepare_values() returns, but all are documented to accept tuples.

I changed the existing

return list(values) 

to simply

return values

which is a tuple and works for me.

But I am unclear on why prepare_values() changes the data type of its input from a tuple to a list. Am I missing something? Do your tests pass with lists?

Also, I have only changed the else branch and tested with MSSQL. I don't have a PostgreSQL instance handy, and thus can't test that branch.

creideiki avatar Nov 08 '22 12:11 creideiki

What about something like this? (untested)

    def prepare_values(self, values: tuple) -> tuple:
        if self._engine_name == self.POSTGRESQL:
            # escape JSON-encoded NULL characters. JSON escapes them once, but we need to escape them twice,
            # so that Postgres does not encounter a NULL char while decoding it
            # https://github.com/certtools/intelmq/issues/2203
            return tuple(value.replace('\\u0000', '\\\\u0000') if isinstance(value, str) else value for value in values)
        else:
            return values

wagner-intevation avatar Nov 09 '22 08:11 wagner-intevation

What about something like this? (untested)

That looks like what I would try to do. But I also haven't tested it.

And I still don't understand why the type was changed to list in the first place, and how nobody but me has encountered that error.

creideiki avatar Nov 09 '22 09:11 creideiki