Support for eponymous virtual tables

Open coleifer opened this issue 10 years ago • 4 comments

An eponymous virtual table is defined as a virtual table whose xCreate is either the same pointer as xConnect or NULL (docs).
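For readers unfamiliar with the term, here is a minimal sketch of what using an eponymous virtual table looks like from SQL. It uses Python's standard `sqlite3` module rather than APSW, and it assumes the linked SQLite library is 3.16.0 or newer, where `pragma_table_info` is available as a built-in table-valued function. The table `t` is a hypothetical name chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

# pragma_table_info is queried under its module name directly:
# no CREATE VIRTUAL TABLE statement is issued first.
rows = conn.execute("SELECT name FROM pragma_table_info('t')").fetchall()
column_names = [row[0] for row in rows]
print(column_names)
```

The point is only that the module name doubles as the table name, which is the behavior the request below asks APSW to allow.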

From my reading of the apsw vtable source code, the xCreate and xConnect method pointers are hard-coded and thus it doesn't appear possible to me to implement such a virtual table using APSW.

I would suggest a change that allows VTModule.Create to be set to None, or to some other sentinel value, to signal that the virtual table is eponymous.

coleifer, Nov 30 '15 17:11

It is tricky based on what information is known when, but I'll see if I can figure something out.

rogerbinns, Nov 30 '15 17:11

Yes, for sure. Perhaps there could be a second apsw_vtable_module struct that does not declare an xCreate?

coleifer, Nov 30 '15 18:11

Interestingly, I'm running into a little trouble trying to test a virtual table implementation of my own. I've verified that the appropriate methods on the actual implementation are being called by using print statements. No results are ever returned, though.

Here is the wrapper class:

import apsw


class TableFunction(object):
    columns = None
    parameters = None
    name = None

    @classmethod
    def get_columns(cls):
        if cls.columns is None:
            raise ValueError('No columns defined.')
        return cls.columns

    @classmethod
    def get_parameters(cls):
        if cls.parameters is None:
            raise ValueError('No parameters defined.')
        return cls.parameters

    def initialize(self, **query):
        raise NotImplementedError

    def iterate(self, idx):
        raise NotImplementedError

    @classmethod
    def module(cls):
        class Cursor(object):
            def __init__(self, table_func):
                self.table_func = table_func
                self.current_row = None
                self.iterator = None
                self._idx = 0
                self._consumed = False

            def Close(self):
                pass

            def Column(self, idx):
                if not self.current_row:
                    raise ValueError('No current row.')
                if idx == -1:
                    return self._idx
                return self.current_row[idx - 1]

            def Eof(self):
                return self._consumed

            def Filter(self, idx_num, idx_name, constraint_args):
                query = {}
                params = idx_name.split(',')
                for idx, param in enumerate(params):
                    value = constraint_args[idx]
                    query[param] = value

                self.table_func.initialize(**query)
                self.Next()

            def Next(self):
                try:
                    self.current_row = self.table_func.iterate(self._idx)
                except StopIteration:
                    self._consumed = True
                    self.current_row = None
                else:
                    self._idx += 1

            def Rowid(self):
                return self._idx + 1

        class Table(object):
            def __init__(self, table_func, params):
                self.table_func = table_func
                self.params = params

            def Open(self):
                return Cursor(self.table_func())

            def Disconnect(self):
                pass

            Destroy = Disconnect

            def UpdateChangeRow(self, *args):
                raise ValueError('Cannot modify eponymous virtual table.')

            UpdateDeleteRow = UpdateInsertRow = UpdateChangeRow

            def BestIndex(self, constraints, order_bys):
                constraints_used = []
                columns = []
                for i, (column_idx, comparison) in enumerate(constraints):
                    if comparison != apsw.SQLITE_INDEX_CONSTRAINT_EQ:
                        continue

                    constraints_used.append(i)
                    columns.append(self.params[column_idx - 1])

                return [
                    constraints_used,
                    0,
                    ','.join(columns),
                    False,
                    1000,
                ]

        class BaseModule(object):
            def Connect(self, connection, module_name, db_name, table_name,
                        *args):
                columns = cls.get_columns()
                parameters = cls.get_parameters()

                columns = ','.join(
                    columns +
                    ['%s HIDDEN' % param for param in parameters])

                return 'CREATE TABLE x(%s)' % columns, Table(cls, parameters)

            Create = Connect

        module = type('%sModule' % cls.__name__, (BaseModule,), {})
        return module()

To test I've created a simple implementation that allows searching for regex results iteratively:

import re


class ReSearch(TableFunction):
    columns = ['grp']
    parameters = ['regex', 'search_string']

    def initialize(self, regex=None, search_string=None):
        self._regex = regex
        self._search_string = search_string
        self._iter = re.finditer(self._regex, self._search_string)

    def iterate(self, idx):
        return [int(next(self._iter).group(0))]

The following SQL should produce results, but does not for some reason:

CREATE VIRTUAL TABLE foo USING re_search();

SELECT * FROM sqlite_master;
-- Returns [(u'table', u'foo', u'foo', 0, u'CREATE VIRTUAL TABLE foo using re_search()')]

SELECT * FROM foo(?, ?) -- '[0-9]+', 'foo 003 test 302, bar 234 blah 1'
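As a sanity check that is independent of the virtual table plumbing, the iterator logic can be driven by hand. The sketch below reimplements only the `re.finditer` step from `ReSearch` as a standalone generator (`re_search_rows` is a hypothetical helper name, not part of the wrapper above) and shows the rows the query would be expected to yield for the sample arguments.

```python
import re


def re_search_rows(regex, search_string):
    """Yield one single-column row per regex match, converted to int."""
    for match in re.finditer(regex, search_string):
        yield [int(match.group(0))]


rows = list(re_search_rows('[0-9]+', 'foo 003 test 302, bar 234 blah 1'))
print(rows)  # [[3], [302], [234], [1]]
```

If this produces rows while the SELECT does not, the problem lies in the module, table or cursor layer rather than in the regex iteration.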

coleifer, Nov 30 '15 18:11

Hi @rogerbinns, just wanted to post a follow-up to my previous wall of code. I'm trying to encapsulate as much logic as possible, so that a table-valued function can be created with a minimum of code (i.e. just the input parameters and an iterator that generates output rows).

While testing the demo "regex search" table above, I noted that the Column(), Eof(), Next(), Filter(), etc. methods were all called correctly and returned without error. I even hopped into the C code for apsw and riddled it with print statements to try to catch any potential problem.

As far as I could tell, what happens is that the virtual table would be called as one would expect, returning appropriate values for each invocation of Column(). APSW was appropriately setting values on a context object. However, when the apsw cursor tried to execute the query, the result was not SQLITE_ROW but SQLITE_DONE. I have no idea why this is.
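One possibility worth ruling out, offered as an assumption and not as a confirmed diagnosis: if SQLite re-checks the equality constraints on the hidden columns itself, it will compare whatever Column() returns for those columns against the bound arguments and discard any row that does not match, which would end in SQLITE_DONE with no rows. The sketch below assumes that the first item returned by BestIndex is a sequence with one entry per constraint, where each entry is None (unused) or an `(argv_index, omit)` tuple, and that `omit=True` asks SQLite to skip its own re-check. The function name, the `n_visible` parameter and the constant are hypothetical and defined locally so the example runs without APSW.

```python
SQLITE_INDEX_CONSTRAINT_EQ = 2  # assumed equal to apsw.SQLITE_INDEX_CONSTRAINT_EQ


def best_index(constraints, params, n_visible=1):
    """Build a BestIndex-style result with one entry per incoming constraint.

    constraints: list of (column_idx, comparison) pairs.
    params: names of the hidden parameter columns, in declaration order.
    n_visible: number of ordinary columns declared before the hidden ones.
    """
    constraints_used = []
    names = []
    for column_idx, comparison in constraints:
        usable = (comparison == SQLITE_INDEX_CONSTRAINT_EQ
                  and column_idx >= n_visible)
        if not usable:
            constraints_used.append(None)  # keep the list aligned
            continue
        # (position in Filter's constraint_args, omit SQLite's re-check)
        constraints_used.append((len(names), True))
        names.append(params[column_idx - n_visible])
    return [constraints_used, 0, ','.join(names), False, 1000]


result = best_index([(1, 2), (2, 2)], ['regex', 'search_string'])
print(result)
```

Unlike the BestIndex in the wrapper class, this keeps one entry per constraint even when a constraint is skipped, so the argument positions passed to Filter() stay in step with the names encoded in the index string.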

coleifer, Dec 01 '15 18:12