penumbra
cometindex and pindexer: defining tables for exploring block data
Summary
Following the discussions and work from #4573, #4604, #4610, and #4611, Penumbra's built-in indexer should define and provide tables that allow efficient querying of the common data used by block explorers like cuiloa.
Example of the problem now
Using cuiloa as a motivating example, you can find all currently used queries in the respective route.ts file of a given endpoint in the api directory of cuiloa. For example, when someone looks up a specific IBC channel that has been created by an IBC client on Penumbra, this is the query that gets run to collect the relevant information.
Putting aside query optimizations, the default index schema used by cometbft requires many nested reads within a single table, in addition to filtering across several relations, to collect current information for e.g. a specific IBC identifier and any associated data (block, transactions, etc).
In short, it's not too bad to look up a single thing (does this block exist? Is there a nullifier with this specific value?) but the complexity and overhead of a query quickly escalate for each additional piece of information you want to find beyond that.
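To make the cost concrete, here is a minimal sketch of what rebuilding a single client record from key/value event rows looks like. It assumes a heavily simplified version of the cometbft relations (only the columns referenced in this issue), uses in-memory SQLite as a stand-in for PostgreSQL, and relies on made-up sample rows. It also assumes one client event per transaction, which may not hold in practice.

```python
import sqlite3

# Simplified stand-in for the cometbft indexer schema. The real schema lives
# in PostgreSQL and has more columns; every row below is made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE blocks (rowid INTEGER PRIMARY KEY, height INTEGER);
CREATE TABLE tx_results (rowid INTEGER PRIMARY KEY, block_id INTEGER, tx_hash TEXT);
CREATE TABLE event_attributes (
    block_id INTEGER, tx_id INTEGER, type TEXT,
    key TEXT, composite_key TEXT, value TEXT
);
INSERT INTO blocks VALUES (1, 100);
INSERT INTO tx_results VALUES (1, 1, 'ABCD');
INSERT INTO event_attributes VALUES
    (1, 1, 'create_client', 'client_id', 'create_client.client_id', '07-tendermint-0'),
    (1, 1, 'create_client', 'client_type', 'create_client.client_type', '07-tendermint'),
    (1, 1, 'create_client', 'consensus_height', 'create_client.consensus_height', '0-42');
""")

# Each attribute is its own row, so one logical record has to be pivoted
# back together and then joined to blocks and tx_results.
query = """
SELECT b.height, t.tx_hash,
       MAX(CASE WHEN ea.key = 'client_id' THEN ea.value END) AS client_id,
       MAX(CASE WHEN ea.key = 'client_type' THEN ea.value END) AS client_type
FROM event_attributes ea
JOIN blocks b ON b.rowid = ea.block_id
JOIN tx_results t ON t.rowid = ea.tx_id
WHERE ea.type = 'create_client'
GROUP BY ea.block_id, ea.tx_id, b.height, t.tx_hash
HAVING MAX(CASE WHEN ea.key = 'client_id' THEN ea.value END) = ?
"""
row = conn.execute(query, ("07-tendermint-0",)).fetchone()
print(row)
```

Every extra field (consensus height, counterparty IDs, and so on) adds another pivot expression, and following a client to its connections and channels repeats the same pattern per event type.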
Suggested Solution
The desired outcome for this feature request is for penumbra's indexer to provide tables that drastically improve the ergonomics and efficiency of querying data. By using Cuiloa's current queries as a baseline, we can define tables that achieve this outcome for both Cuiloa and future alternative block explorers and related tooling.
Immediately below the following schema outlines you will also find caveats and open questions that may be useful to skim over first.
- [ ] `clients(rowid, client_id, client_type, block_id, height, tx_id, tx_hash)` tracks existing light clients defined on Penumbra.
  - `rowid` table index.
  - `client_id` is the same value as the key `client_id` found for any `{create|update}_client` type event from `event_attributes` as defined in the cometbft indexer schema, e.g. `07-tendermint-0`.
  - `client_type` the type of the client. This is the same value in an `event_attributes` row with type of `{create|update}_client` and key `client_type`, e.g. `07-tendermint`.
  - `block_id` the id of either the creation block or the block with the last `update_client` event for the respective client. If a client has only been created with no updates, then it will be the former. If it has received updates, it will be the latter.
  - `height` is the height of the block identified with `block_id`.
  - `tx_id` is the id of the transaction of either the creation or last update of the client, following the same rule as `block_id`.
  - `tx_hash` is the hash of the transaction identified with `tx_id`.
- [ ] `connections(rowid, connection_id, client_id, client_idx, counterparty_client_id, counterparty_connection_id)` tracks existing IBC connections on Penumbra.
  - `rowid` table index.
  - `connection_id` is the same value found for any `connection_open_{init|ack}` type event from `event_attributes` with a key of `connection_id`.
  - `client_id` is the `clients.client_id` of the client that the connection belongs to.
  - `client_idx` is the `clients.rowid` of the client that the connection belongs to.
  - `counterparty_client_id` is the counterparty client id and the same value you will find for an `event_attributes` row with the composite_key `connection_open_{init|ack}.counterparty_client_id`.
  - `counterparty_connection_id` is the `connection_id` of the connection on the counterparty chain. This is the same value found in `event_attributes` with the composite_key `connection_open_{init|ack}.counterparty_connection_id`.
- [ ] `channels(rowid, channel_id, connection_id, connection_idx, client_id, client_idx, port_id, counterparty_channel_id, counterparty_port_id)` tracks existing IBC channels on Penumbra.
  - `rowid` table index.
  - `channel_id` is the same value found for any `channel_open_{init|ack}` type event from `event_attributes` with a key of `channel_id`.
  - `connection_id` is the `connections.connection_id` of the connection the channel belongs to. The same value found in any `channel_open_{init|ack}.connection_id` in `event_attributes`.
  - `connection_idx` is the `connections.rowid` of the connection the channel belongs to.
  - `client_id` is the `clients.client_id` of the client that the channel belongs to.
  - `client_idx` is the `clients.rowid` of the client that the channel belongs to.
  - `port_id` is the port of the channel and is the same value you will find for any `channel_open_{init|ack}.port_id` composite_key event in `event_attributes`.
  - `counterparty_channel_id` is the counterparty channel id and the same value you will find for an `event_attributes` row with the composite_key `channel_open_{init|ack}.counterparty_channel_id`.
  - `counterparty_port_id` is the `port_id` of the channel on the counterparty chain. This is the same value found in `event_attributes` with the composite_key `channel_open_{init|ack}.counterparty_port_id`.
- [ ] `client_events(rowid, client_id, client_idx, type, block_id, tx_id, header, consensus_height)` tracks all `create_client` and `update_client` events.
  - `rowid` table index.
  - `client_id` the `clients.client_id` of the client event.
  - `client_idx` the `clients.rowid` of the client event.
  - `type` the type of the client event, i.e. `create_client` or `update_client`. `update_client` ought to make up the overwhelming majority of these events.
  - `block_id` the `blocks.rowid` of the block this event occurred on.
  - `tx_id` the `tx_results.rowid` of the transaction.
  - `header` is the IBC Header segment when the `type` is `update_client`.
  - `consensus_height` the consensus height of the counterparty chain for the given client event. This is the same value for any row in `event_attributes` with the composite_key of `{create|update}_client.consensus_height`.
- [ ] `packet_events(rowid, type, client_id, client_idx, block_id, tx_id, connection_id, connection_idx, channel_id, channel_idx, port_id, ordering, sequence, counterparty_channel_id, counterparty_port_id, timeout_height, timeout_timestamp, data, data_hex)` tracks all `send_packet` and `recv_packet` events.
  - `rowid` table index.
  - `type` the type of the packet event, i.e. `send_packet` or `recv_packet`.
  - `client_id` the `clients.client_id` of the packet.
  - `client_idx` the `clients.rowid` of the packet.
  - `block_id` the `blocks.rowid` of the packet.
  - `tx_id` the `tx_results.rowid` of the packet.
  - `connection_id` is the `connections.connection_id` of the connection that the packet belongs to. The same value found in any row with the composite_key `{send|recv}_packet.packet_connection` in `event_attributes`.
  - `connection_idx` is the `connections.rowid` of the connection that the packet belongs to.
  - `channel_id` is the `channels.channel_id` of the channel that the packet belongs to.
    - for `recv_packet`s, this is the value found for events with the composite_key `recv_packet.packet_dst_channel`.
    - for `send_packet`s, this is the value found for events with the composite_key `send_packet.packet_src_channel`.
  - `channel_idx` is the `channels.rowid` of the channel that the packet belongs to.
  - `port_id` is the `channels.port_id` of the channel that the packet belongs to.
    - for `recv_packet`s, this is the `recv_packet.packet_dst_port`.
    - for `send_packet`s, this is the `send_packet.packet_src_port`.
  - `ordering` the ordering of the packet, the same value for the composite_key `{send|recv}_packet.packet_channel_ordering` in `event_attributes`.
  - `sequence` is the `packet_sequence` key for a given `{send|recv}_packet` type event.
  - `counterparty_channel_id` is the counterparty channel ID.
    - for `recv_packet`s, this is the value of the `packet_src_channel` key.
    - for `send_packet`s, this is the value of the `packet_dst_channel` key.
  - `counterparty_port_id` is the counterparty port ID.
    - for `recv_packet`s, this is the value of the `packet_src_port` key.
    - for `send_packet`s, this is the value of the `packet_dst_port` key.
  - `timeout_height` the packet timeout height. This is the value of the key `packet_timeout_height`.
  - `timeout_timestamp` the packet timeout timestamp. This is the value of the key `packet_timeout_timestamp`.
  - `data` is the packet data. This is the value of the key `packet_data`.
  - `data_hex` is the packet data in hex form. This is the value of the key `packet_data_hex`.
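The direction-dependent columns of `packet_events` are easy to get backwards, so here is a minimal sketch of the mapping described above. The function name `packet_event_row` and the sample attribute values are hypothetical; the attribute keys are the ones listed in this issue. Columns that need a lookup against other tables (`client_id`, `block_id`, `tx_id`, and the `*_idx` columns) are deliberately left out.

```python
from typing import Dict


def packet_event_row(event_type: str, attrs: Dict[str, str]) -> Dict[str, str]:
    """Map the attributes of one send_packet/recv_packet event to packet_events columns."""
    if event_type == "send_packet":
        # We are the sender: our side is the source, the counterparty is the destination.
        local, remote = "src", "dst"
    elif event_type == "recv_packet":
        # We are the receiver: our side is the destination.
        local, remote = "dst", "src"
    else:
        raise ValueError(f"unsupported packet event type: {event_type}")
    return {
        "type": event_type,
        "connection_id": attrs["packet_connection"],
        "channel_id": attrs[f"packet_{local}_channel"],
        "port_id": attrs[f"packet_{local}_port"],
        "counterparty_channel_id": attrs[f"packet_{remote}_channel"],
        "counterparty_port_id": attrs[f"packet_{remote}_port"],
        "ordering": attrs["packet_channel_ordering"],
        "sequence": attrs["packet_sequence"],
        "timeout_height": attrs["packet_timeout_height"],
        "timeout_timestamp": attrs["packet_timeout_timestamp"],
        "data": attrs["packet_data"],
        "data_hex": attrs["packet_data_hex"],
    }


# Made-up attributes for a packet received on our channel-0 from the
# counterparty's channel-7.
sample_attrs = {
    "packet_connection": "connection-0",
    "packet_src_channel": "channel-7",
    "packet_src_port": "transfer",
    "packet_dst_channel": "channel-0",
    "packet_dst_port": "transfer",
    "packet_channel_ordering": "ORDER_UNORDERED",
    "packet_sequence": "1",
    "packet_timeout_height": "0-1000",
    "packet_timeout_timestamp": "0",
    "packet_data": "{}",
    "packet_data_hex": "7b7d",
}
row = packet_event_row("recv_packet", sample_attrs)
print(row["channel_id"], row["counterparty_channel_id"])
```

Feeding the same attributes in as a `send_packet` swaps the two channel columns, which is the whole point of storing `channel_id` and `counterparty_channel_id` from the local chain's perspective.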
Caveats
- The table schemas trade normalization for quick and easy access to data, but how this data looks in the wild should probably heavily influence their definitions.
- I don't have a perfect grasp of the IBC spec and how Penumbra integrates. As an example, is it possible for an IBC-related event on Penumbra to be a block event (i.e. no transaction)? Similarly, what are the plans for supporting IBC operations like closing channels and connections?
Open questions
Building off of the caveats, there are several issues that I'm not entirely confident on and that the suggested schema may fail to address.
- Is keeping `clients` with an updated `block_id`, `tx_id`, etc a desirable property or should this be siloed entirely in `client_events`?
- In a similar vein, if it's not a bad idea to update `clients` for the most recent `block_id`, `tx_id`, etc, would it be reasonable to add this to both `connections` and `channels`?
  - Every `send_packet` and `recv_packet` type event would require a write here.
- `packet_events` will probably be a very big and wide table. How exactly this should be managed is not clear to me. Breaking it up by `type` so that there are two tables, `recv_packets` and `send_packets`, instead might not be as bad, but maybe tables that store data values like `packet_data` might be required?
- `write_acknowledgements` are a type of event that I have not addressed, but only because I haven't included them in the block explorer thus far. This is probably something worth acknowledging.
- As with `write_acknowledgements`, the same goes for events of type `action_delegate`. This would be a simple table, however, `action_delegate(block_id, tx_id, key, value)`, if it's something that the penumbra indexer should be tracking.
- I am not sure how useful the pattern of `foo_id` and `foo_idx` is for the above tables.
  - The idea is to save a possible join for each pair by keeping the ID for a given client, connection, channel on the table row for a given entity.
  - Similarly, indexing directly on IDs is probably(?) not possible. If I understand correctly, `client_id`s should always be unique to mitigate against replay attacks but I am not sure the same can be said for the semantics of `connection_id`s and `channel_id`s.
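The trade-off behind the `foo_id` / `foo_idx` pairs can be sketched as follows. This is only an illustration under stated assumptions: the three tables are abbreviated to the columns relevant to the question, in-memory SQLite stands in for PostgreSQL, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Abbreviated versions of the proposed tables (block/tx columns omitted).
CREATE TABLE clients (
    rowid INTEGER PRIMARY KEY, client_id TEXT NOT NULL, client_type TEXT
);
CREATE TABLE connections (
    rowid INTEGER PRIMARY KEY, connection_id TEXT NOT NULL,
    client_id TEXT NOT NULL, client_idx INTEGER REFERENCES clients(rowid)
);
CREATE TABLE channels (
    rowid INTEGER PRIMARY KEY, channel_id TEXT NOT NULL,
    connection_id TEXT NOT NULL, connection_idx INTEGER REFERENCES connections(rowid),
    client_id TEXT NOT NULL, client_idx INTEGER REFERENCES clients(rowid),
    port_id TEXT
);
INSERT INTO clients VALUES (1, '07-tendermint-0', '07-tendermint');
INSERT INTO connections VALUES (1, 'connection-0', '07-tendermint-0', 1);
INSERT INTO channels VALUES
    (1, 'channel-0', 'connection-0', 1, '07-tendermint-0', 1, 'transfer');
""")

# Denormalized read: the string IDs are already on the channel row.
flat = conn.execute(
    "SELECT channel_id, connection_id, client_id FROM channels WHERE channel_id = ?",
    ("channel-0",),
).fetchone()

# Normalized read: the same answer via the *_idx columns costs two joins.
joined = conn.execute(
    """
    SELECT ch.channel_id, co.connection_id, cl.client_id
    FROM channels ch
    JOIN connections co ON co.rowid = ch.connection_idx
    JOIN clients cl ON cl.rowid = ch.client_idx
    WHERE ch.channel_id = ?
    """,
    ("channel-0",),
).fetchone()
print(flat == joined)
```

Keeping both columns means the common "show me this channel" page is a single-table read, while the `*_idx` columns remain available for joins when the uniqueness of the string IDs cannot be relied upon.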