peloton
peloton copied to clipboard
Decouple the DataTable pointer from the plan objects
We now have the storage::DataTable
pointer inside many plan objects, such as AbstractScanPlan
, AnalyzePlan
, DeletePlan
, InsertPlan
, and UpdatePlan
. This prevents us from shipping the plan objects around or using the "what-if" API on the brain side.
I'll take SeqScanPlan
for example. Instead of getting the table pointer and storing it in the plan object here:
https://github.com/cmu-db/peloton/blob/master/src/optimizer/plan_generator.cpp#L82
We should directly store the database oid
and table oid
. And then at the query execution/compilation time, we should get the TableCatalogObject
using the `oid's with this API:
https://github.com/cmu-db/peloton/blob/master/src/include/catalog/catalog.h#L224
Then we can get the table name, database name and the schema name.
Finally we can use this API to get the table object pointer:
https://github.com/cmu-db/peloton/blob/master/src/include/catalog/catalog.h#L203
Everything should go through the catalog.
I will assign this to @Zeninma once he joins the group on Github.
I am having a problem while doing the decoupling. It would be great if I can get some help on this.
My approach for decoupling
In ordert to do the decoupling, I substitute storage::DataTable target_table
in some plan objects with the corresponding database and table oid_t
. The storage::DataTale *GetTable()
is replaced with storage::DataTable *GetTable(concurrency::TransactionContext *txnt)
. The new GetTable
method uses the database and tabe oid_t
to first get DatabaseCatalogObject
and GetTableObject
via (https://github.com/cmu-db/peloton/blob/master/src/include/catalog/catalog.h#L218), (https://github.com/cmu-db/peloton/blob/master/src/include/catalog/catalog.h#L228) in Catalog. Then uses GetTableWithName
(https://github.com/cmu-db/peloton/blob/master/src/include/catalog/catalog.h#L207) to get the storage::DataTable
object.
My current Problem
The GetDatabaseObject
, GetTableObject
and GetTableWithName
all take in the concurrency::TransactionContext
as their arguments. However, under some circumstances, the transaction argument cannot be derived. Here is one such circumstance for instance:
In the PostgresProtocolHandler::ExecBindMessage
method it calls a plan object's SetParameterValues
function (https://github.com/cmu-db/peloton/blob/master/src/network/postgres_protocol_handler.cpp#L526). If the plan object is an IndexScanPlan
, it will result in calling GetTable
method(https://github.com/cmu-db/peloton/blob/master/src/planner/index_scan_plan.cpp#L88) and needs a concurrency::TransactionContext
argument. However, the transaction object cannot be derived. I am not sure how to bypass this issue.
The storage::DataTale *GetTable() is replaced with storage::DataTable *GetTable(concurrency::TransactionContext *txnt)
I don't think that the plan objects should support GetTable
anymore. That should be done from outside of the plan.
If the plan object is an IndexScanPlan, it will result in calling GetTable method and needs a concurrency::TransactionContext argument.
@tli2 It looks IndexScanPlan
just needs the value type for a particular column in the index:
https://github.com/cmu-db/peloton/blob/master/src/planner/index_scan_plan.cpp#L88
Can we modify the binder to record this when it constructs the IndexScanPlan
? The plan objects should never need to reference the catalog internally. All of that should be done on the outside.