apsw icon indicating copy to clipboard operation
apsw copied to clipboard

export colUsed to python BestIndex

Open graf0 opened this issue 5 years ago • 8 comments

if I know, what columns will be requested I can optimize virutal table retrieval - ie.; askim wmi interface only about needed columns, or not reading files, that are used to calculate output of abset columns.

graf0 avatar Nov 28 '19 14:11 graf0

straight forward implementation is easy - but it breadks backwards compatiblity - we need to extend BestIndex function by 1 argument coUsed and pass 64-bit integer through that.

If changing BestIndex arguments is acceptable I will send PR. If not - question is how to do it? Define another function BestIndex2 with different signature?

graf0 avatar Nov 28 '19 15:11 graf0

Picking up from the mailing list.. I’ll gladly refine my patch to include what’s listed. I tried to follow the existing patterns. The only call that (I think) allocates memory is the getting the python object, and you don’t generally wrap those with fault injection.

As for checking types (since we’re not returning anything from python in this case) there’s not much to be done.

Tests will have to be done, but given how well apsw is tested already (which has saved me tons of probably very painful troubleshooting), totally happy to do that as well. It’s also relatively simple.

That leaves the calling convention to BestIndex (and associated documentation).

@rogerbinns if you don’t mind breaking api, it’s definitely the simple route to just add a third arg. vtables are #ifdef’d experimental after all :) That said, wrapping it in a tuple is another option.

Though it’ll still break the api, it has the advantage of any future fields SQLite might add to the indexinfo struct can be rolled into the same tuple.

abotsis avatar May 11 '20 23:05 abotsis

The work that is most valuable for someone not me to do is demo code actually using the proposed feature. I then essentially just have to make that work.

I think the simplest compatibility solution is to look for a bestindex2 and supply it with a dict of all the fields. That will then take care of future changes without breaking code.. Still thinking ...

rogerbinns avatar May 12 '20 00:05 rogerbinns

Rather than adding a BestIndex2 could this instead be handled by adding a version specifier to VTModule or VTTable?

cancan101 avatar Apr 06 '22 03:04 cancan101

@rogerbinns I'm the main developer of Shillelagh, a library that wraps the virtual table functionality of apsw in a user- and developer-friendly way. The library is mostly used to allow querying APIs via SQL, and Apache Superset is a big user.

Recently someone started developing a Shillelagh adapter for Google Analytics. The GA API has a limitation on the number of columns that can be requested, so we can't fetch all the columns in a single request. If colUsed is exposed we can request only the necessary columns or raise an exception informing the user to request fewer columns.

betodealmeida avatar Jul 25 '22 04:07 betodealmeida

@betodealmeida You may find hidden columns useful for meeting your needs.

rogerbinns avatar Jul 25 '22 14:07 rogerbinns

@betodealmeida You may find hidden columns useful for meeting your needs.

Sorry, but I don't understand how that would help?

Since the virtual table doesn't know which columns are being used in the prepared statement it still needs to request all the columns from the Google Analytics API, which requires performing and joining multiple network requests. I don't see how hidden columns would change that.

betodealmeida avatar Jul 25 '22 15:07 betodealmeida

@betodealmeida hidden columns help in the scenario where people are using gui tools or similar, and so do SELECT * FROM TABLE hence getting every column even if they aren't using them all. Hidden columns wont be included in the *.

When Cursor.Column is called you will know a column is needed, and it will work even when BestIndex is not called (and updated per this issue). I understand that may be difficult in Shillelagh.

It is the approach I have taken in virtual tables - list the core ones normally and always fetch those, list the rest as hidden and fetch them on demand.

rogerbinns avatar Jul 26 '22 22:07 rogerbinns

Will be part of #332

rogerbinns avatar Nov 29 '22 22:11 rogerbinns