apsw
apsw copied to clipboard
export colUsed to python BestIndex
if I know, what columns will be requested I can optimize virutal table retrieval - ie.; askim wmi interface only about needed columns, or not reading files, that are used to calculate output of abset columns.
straight forward implementation is easy - but it breadks backwards compatiblity - we need to extend BestIndex function by 1 argument coUsed and pass 64-bit integer through that.
If changing BestIndex arguments is acceptable I will send PR. If not - question is how to do it? Define another function BestIndex2 with different signature?
Picking up from the mailing list.. I’ll gladly refine my patch to include what’s listed. I tried to follow the existing patterns. The only call that (I think) allocates memory is the getting the python object, and you don’t generally wrap those with fault injection.
As for checking types (since we’re not returning anything from python in this case) there’s not much to be done.
Tests will have to be done, but given how well apsw is tested already (which has saved me tons of probably very painful troubleshooting), totally happy to do that as well. It’s also relatively simple.
That leaves the calling convention to BestIndex (and associated documentation).
@rogerbinns if you don’t mind breaking api, it’s definitely the simple route to just add a third arg. vtables are #ifdef’d experimental after all :) That said, wrapping it in a tuple is another option.
Though it’ll still break the api, it has the advantage of any future fields SQLite might add to the indexinfo struct can be rolled into the same tuple.
The work that is most valuable for someone not me to do is demo code actually using the proposed feature. I then essentially just have to make that work.
I think the simplest compatibility solution is to look for a bestindex2 and supply it with a dict of all the fields. That will then take care of future changes without breaking code.. Still thinking ...
Rather than adding a BestIndex2
could this instead be handled by adding a version specifier to VTModule
or VTTable
?
@rogerbinns I'm the main developer of Shillelagh, a library that wraps the virtual table functionality of apsw in a user- and developer-friendly way. The library is mostly used to allow querying APIs via SQL, and Apache Superset is a big user.
Recently someone started developing a Shillelagh adapter for Google Analytics. The GA API has a limitation on the number of columns that can be requested, so we can't fetch all the columns in a single request. If colUsed
is exposed we can request only the necessary columns or raise an exception informing the user to request fewer columns.
@betodealmeida You may find hidden columns useful for meeting your needs.
@betodealmeida You may find hidden columns useful for meeting your needs.
Sorry, but I don't understand how that would help?
Since the virtual table doesn't know which columns are being used in the prepared statement it still needs to request all the columns from the Google Analytics API, which requires performing and joining multiple network requests. I don't see how hidden columns would change that.
@betodealmeida hidden columns help in the scenario where people are using gui tools or similar, and so do SELECT * FROM TABLE
hence getting every column even if they aren't using them all. Hidden columns wont be included in the *
.
When Cursor.Column is called you will know a column is needed, and it will work even when BestIndex is not called (and updated per this issue). I understand that may be difficult in Shillelagh.
It is the approach I have taken in virtual tables - list the core ones normally and always fetch those, list the rest as hidden and fetch them on demand.
Will be part of #332