cardano-db-sync
cardano-db-sync copied to clipboard
Nomen clature for columns
Hi Team,
We were thinking of adding a CIP for normalising variable names (and types) used at query layers using existing providers as baseline. Since a lot of the tooling extracts information via mini-Protocol or dbsync, would be good to know if there is a naming convention (for object representations like block_no
, hash
) that is preferred to use as source or if there is flexibility/scope to make things more uniform where possible, couple of examples below:
- We have columns like
hash
for block and transaction itself, but then we havehash_raw
/view
for objects which often use bech32 representation (pool/stake_addresses), while foraddress
representation , we haveaddress_raw
/address
for hex/bech32 values. - References like
value
in (tx_out
collateral_tx_out
) vsamount
in account related tables (withdrawal
,reward
,reserves
,datum
) vsquantity
in asset representations (ma_tx_mint
/ma_tx_out
)
There are still patterns here, so these are probably in place based on some logic. Depending on preferences (updates to dbsync vs only restrict ourselves to query layers themselves), we might be able to prepare a better initial draft and add issue/suggestions against appropriate destination repositories.
Totally agree that all the columns named amount
and quantity
should be renamed value
.
Not so sure about the first point (or maybe I am missing the point).
Those were just examples, there are a lot more that we would want to perhaps drill down and document , but my question was really if there is a preferred source at the moment where the column names come from (for instance, if there is a relation to the names for dbsync objects being initially built from ledger/consensus code-as-spec), if not - we should be able to brainstorm as ecosystem together across orgs/projects to perhaps have a higher level of consistency for object representation.
About first point, I was referring to use of hash
vs hash_raw
- which are both similar representation, while using view
for bech32 representation compared to use of address
vs address_raw
for bech32 vs hex representation. One option could be calling all such objects to be of the format $obj_[hash]|[view] . Not proposing this as solution rightaway, our aim is to just start preparing a draft for review with the naming for all objects first that would be commonly used in queries and them see if we can adapt/align better across the ecosystem
Adding these as well: I'd like to suggest to rename tx_in_id
and tx_out_id
in tx_in
and collateral_tx_in
table to in_tx_id
and out_tx_id
- In the
tx_in
table, thetx_in_id
column may be mistaken as the primary key, but there's also aid
column, so that may cause some confusion. - In the
tx_in
,collateral_tx_in
andma_tx_out
table, there are the same column namestx_out_id
but they are referring to different things. Thetx_out_id
intx_in
andcollateral_tx_in
tables actually references the pk oftx
table, meanwhile thetx_out_id
inma_tx_out
is actually referencing the pk oftx_out
table. - Renaming it to
in_tx_id
andout_tx_id
enables us to immediately identify that the data in this column are in facttx_ids
, referencing thetx
table, but they are input or output transactions denoted by thein
andout
prepend. This is also in line with theregistered_tx_id
columns in other tables such asstake_address
where the prepend is further describing the transaction.
Many of these changes sound reasonable. However, they would be breaking changes for exisisting applications and would require integration.
IMO - a breaking change has been long pending, would be nice to reduce some of the backlog on these (will also allow for more freedom with tx_out table updates)?