tealer
Initial design for support of group-transactions analysis
First task of issue #83
Data Model
class Instruction:
    prev: List[Instruction]
    next: List[Instruction]
    line_num: int
    source_code_line: str
    comment: str
    comments_before_ins: List[str]
    tealer_comments: List[str]
    bb: Optional["BasicBlock"]
    supported_version: int
    supported_mode: Mode.Application | Mode.LogicSig

class BasicBlock:
    instructions: List[Instruction]
    prev: List[BasicBlock]
    next: List[BasicBlock]
    idx: int
    subroutine: Optional["Subroutine"]
    tealer_comments: List[str]

class Subroutine:
    subroutine_name: str
    entry: BasicBlock
    basicblocks: List[BasicBlock]
    exit_blocks: List[BasicBlock]
    contract: Optional["Teal"] = None

class Function:
    cfg: List[BasicBlock]
    function_name: str
    contract: Optional["Teal"]

class Teal:
    contract_name: str
    version: int
    execution_mode: ContractType
    instructions: List[Instruction]
    basicblocks: List[BasicBlock]
    main: Subroutine
    subroutines: Dict[str, Subroutine]
    functions: Dict[str, Function]

class ContractType(Enum):
    LogicSig
    ApprovalProgram
    ClearStateProgram
The CFG of the contract is divided into subroutines. Every subroutine is an independent CFG: the blocks of one subroutine are not connected to the blocks of any other subroutine. In the CFG, the callsub instruction is connected to the instruction immediately after it (the return point/address of the called subroutine).
The contract needs to be divided into "Functions/Operations" as well. Functions are equivalent to ARC-4 methods. Every function should have its own CFG for analysis.
The CFG containing the basic blocks that are not part of any subroutine is referred to as the "contract-entry" CFG. Execution always starts at the entry block of this CFG. If we consider every subroutine as an external contract/program, then the "contract-entry" CFG can be considered the CFG of the entire contract.
If the method/function dispatcher used by the contract does not touch any of the subroutines, each function can be represented by a CFG that consists of the basic blocks related to that particular function.
In common use cases, the method dispatcher is not part of any subroutine. The method dispatcher generated by PyTeal's Router class also does not touch the subroutines. Under this assumption, the "contract-entry" CFG can be divided into individual function CFGs, and functions can be analyzed independently.
A block might be part of multiple functions. In that case, every function should have its own copy of the block.
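The division into function CFGs could look roughly like the sketch below. It is only an illustration under assumptions: `Block` is a hypothetical stand-in for `BasicBlock` that stores successor indices, `function_cfg` is not an existing Tealer function, and keeping the dispatcher path inside the function's CFG is a design choice made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Block:
    """Hypothetical stand-in for BasicBlock: an index plus successor indices."""
    idx: int
    next: List[int] = field(default_factory=list)


def function_cfg(blocks: Dict[int, Block], execution_path: List[int]) -> Dict[int, Block]:
    """Build a function's CFG from the contract-entry CFG.

    The last block of the execution path is taken as the function's entry.
    The result holds fresh copies of the dispatcher path blocks and of every
    block reachable from the entry, so functions never share Block objects.
    """
    entry = execution_path[-1]
    reachable = set()
    stack = [entry]
    while stack:
        idx = stack.pop()
        if idx in reachable:
            continue
        reachable.add(idx)
        stack.extend(blocks[idx].next)
    keep = reachable | set(execution_path)
    # Copy each kept block and drop edges that lead to other functions.
    return {i: Block(i, [n for n in blocks[i].next if n in keep]) for i in sorted(keep)}


# B0 dispatches to B1 (function 1) or B2 (function 2); both end in shared block B3.
contract_entry = {0: Block(0, [1, 2]), 1: Block(1, [3]), 2: Block(2, [3]), 3: Block(3)}
f1 = function_cfg(contract_entry, [0, 1])
f2 = function_cfg(contract_entry, [0, 2])
```

Here `f1` and `f2` each contain their own copy of the shared block 3, so per-function annotations on that block would not interfere with each other.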
class Transaction:
    type: TransactionType
    logic_sig: Optional[Function]
    application: Optional[Function]
    logic_sig_group_context: Optional[Dict[Instruction, Transaction]]
    application_group_context: Optional[Dict[Instruction, Transaction]]

class TransactionType(Enum):
    Payment
    AssetConfig
    AssetTransfer
    AssetFreeze
    KeyReg
    ApplicationCall
A transaction may involve execution of both a LogicSig and an Application.
The group-context information contains a mapping from every group-related instruction of a function to a Transaction object.
Group-related instructions:
- gtxn t f, gtxna t f i, gtxns f, gtxnsa f i, gtxnas t f, gtxnsas f
- gaid t, gaids
- gload t i, gloads i, gloadss
For example, the group-context of "gtxn 1 RekeyTo" would point to the Transaction object representing the transaction at index 1.
The Transaction object for an instruction is specified for each execution of the function.
A group of transactions may execute the same function of a program in multiple transactions, or a block of code may be shared by two functions that are both part of the group. The group-context should therefore be provided per (Transaction, Function) pair, because a group-related instruction can refer to a different Transaction object in each execution.
At the same time, it is not possible to provide a single Transaction object for an instruction that is executed multiple times in a given execution.
For example, if a gtxns f instruction is used in a loop with different transaction indices, the Transaction object is different for each iteration. These kinds of instructions are ignored in the analysis.
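As an illustration of the mapping, the sketch below builds a group-context for the instructions whose transaction index is an immediate argument (gtxn, gtxna, gtxnas, gaid, gload). Instructions that take the index from the stack are skipped here because resolving them needs further analysis. The function name `group_context`, the use of line numbers as keys, and the use of transaction ids as values are assumptions made for the example, not part of the proposed data model.

```python
from typing import Dict, List

# Opcodes whose first immediate argument is the transaction index in the group.
IMMEDIATE_INDEX_OPS = {"gtxn", "gtxna", "gtxnas", "gaid", "gload"}


def group_context(instructions: List[str], txn_at_index: Dict[int, str]) -> Dict[int, str]:
    """Map the line number of each immediate-index group instruction to a transaction id.

    `txn_at_index` describes one execution: which transaction sits at which
    absolute index of the group. Instructions that refer to an index with no
    known transaction are left out of the mapping.
    """
    context: Dict[int, str] = {}
    for line_num, source_line in enumerate(instructions, start=1):
        parts = source_line.split()
        if not parts or parts[0] not in IMMEDIATE_INDEX_OPS:
            continue
        index = int(parts[1])
        if index in txn_at_index:
            context[line_num] = txn_at_index[index]
    return context


program = ["gtxn 1 RekeyTo", "global ZeroAddress", "==", "gtxns Amount"]
context = group_context(program, {0: "T1", 1: "T2"})
```

Calling `group_context` once per (Transaction, Function) pair would give each execution its own mapping, which matches the requirement described above.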
class GroupConfig:
    transactions: List[Transaction]
    tealer: Optional[Tealer] = None

class Tealer:
    contracts: List[Teal]
    group_configs: List[GroupConfig]
Detector API
class AbstractDetector:
    def __init__(self, tealer: Tealer):
        pass

    def detect(self):
        pass
The detectors can be classified into two classes:
- Type 1: Detectors that do not need any information about other contracts executed in the group to detect the bug.
  - Example: the "Inner Txn Fee must be zero" bug. The detector has to find all the inner transactions in a function and check whether the fee field is set to zero. Execution of other contracts does not invalidate the bug.
- Type 2: Detectors that require information about the group of transactions.
  - Example: RekeyTo, AssetCloseTo, ... Any bug that depends on the possible values of a transaction field. A different contract executed in the same group can access the transaction field and perform validations on it.
Type 1 detectors will have to use Tealer.contracts to access the individual contracts and run analysis on each of them.
Type 2 detectors will have to use Tealer.group_configs to access the transaction configs and run analysis on each group of transactions.
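A rough sketch of how the two detector types might sit on top of AbstractDetector is shown below. Everything in it is hypothetical: `Tealer` is reduced to lists of names, the classes `ContractDetector` and `GroupDetector` do not exist in Tealer, and the returned strings are placeholders because the output format is still undecided.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tealer:
    """Hypothetical stand-in: contracts and groups are reduced to plain names."""
    contracts: List[str] = field(default_factory=list)
    group_configs: List[List[str]] = field(default_factory=list)


class AbstractDetector:
    def __init__(self, tealer: Tealer):
        self.tealer = tealer

    def detect(self) -> List[str]:
        raise NotImplementedError


class ContractDetector(AbstractDetector):
    """Type 1: analyses each contract in isolation."""

    def detect(self) -> List[str]:
        return [f"checked contract {c}" for c in self.tealer.contracts]


class GroupDetector(AbstractDetector):
    """Type 2: analyses each configured transaction group as a whole."""

    def detect(self) -> List[str]:
        return [f"checked group {'+'.join(g)}" for g in self.tealer.group_configs]


tealer = Tealer(contracts=["pool"], group_configs=[["T1", "T2"]])
type1_results = ContractDetector(tealer).detect()
type2_results = GroupDetector(tealer).detect()
```

Both detector types receive the same Tealer object, so the split is only in which attribute they iterate over.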
Output format:
TBD
User Configuration
Contract = {
    /* Name of the contract, e.g. pool. Every contract should have a unique name. */
    "name": string;
    /* Filesystem path of the contract (relative path). */
    "path": string;
    /* Type of the contract: one of LogicSig, ApprovalProgram or ClearStateProgram. */
    "type": string;
    /* Contract's teal version. */
    "version": int;
    /* Names of subroutines present in the contract. */
    "subroutines": string[];
    /* Functions/User operations. */
    "functions": Function[];
}
Function = {
    /* Execution path to reach the function's entry block.
       The execution path is part of the method dispatcher CFG.
       The execution path is an array of strings. For example, ["B0", "B1", "B3", "B4"].
       The basic blocks "B0", "B1", "B3", "B4" are part of the method dispatcher. The code in these blocks checks for the function identifier and routes to the function accordingly.
       The block "B4" is the start of the function code.
    */
    "execution_path": string[];
    /* Name of the operation/function. Used as the identity in "group_configurations". Should be unique within a contract. */
    "function_name": string;
}
Transaction = {
    /* A unique id for this transaction. The id is only used to refer to this transaction from other transactions of the group configuration. Example: "T1" */
    "tx_id": string;
    /* Type of the transaction: one of "pay", "keyreg", "acfg", "axfer", "afrz", "appl" or "txn". "txn" can be used to represent any type of transaction. */
    "txn_type": string?;
    /* If the transaction is to be signed with a LogicSig, specify the contract name and the function name. */
    "logic_sig": {
        "contract": string;
        "function": string;
    }?;
    /* If the transaction is an application call, specify the contract and the function being called. */
    "application": {
        "contract": string;
        "function": string;
    }?;
    /* Transaction's index in the group. If the transaction MUST be present at a predefined index in the group and contracts in the group use an absolute index to access fields of this transaction, then specify that index in this field. If the transaction is always the first transaction in the group, then "absolute_index" should be `0`.
    */
    "absolute_index": int?;
    /* Relative indexes of other transactions in the group from this transaction. The relative indexes specified here are predefined and static. A relative index should not depend on any other runtime information, for example, on application arguments. Such runtime-dependent relative indexes are not completely supported: if the contracts in this transaction perform validations on that transaction, Tealer will not be able to consider these validations when analyzing that transaction.
    */
    "relative_indexes": [
        {
            /* Id specified in the "tx_id" field of the other Transaction. Need a better field name. */
            "other_tx_id": string;
            /* Relative index of the "other_tx_id" transaction from this transaction.
               For example, if the "other_tx_id" transaction must precede this transaction, then the relative index is `-1`.
               The contract executed in this transaction will access the "other_tx_id" transaction using "(Txn.GroupIndex) - 1". */
            "relative_index": int;
        }?;
    ]?;
}
GroupConfig = Transaction[]
UserConfig = {
    "contracts": Contract[],
    "group_configurations": GroupConfig[],
}
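To show how "absolute_index" and "relative_indexes" interact, the sketch below propagates known absolute positions through the relative offsets of a group. It assumes transactions are plain dictionaries shaped like the Transaction schema above; `resolve_indices` is a hypothetical helper, and whether Tealer should resolve indices this way is an open design question.

```python
from typing import Any, Dict, List


def resolve_indices(group: List[Dict[str, Any]]) -> Dict[str, int]:
    """Return the absolute group index of every transaction that can be resolved.

    Starts from transactions with an explicit "absolute_index" and repeatedly
    applies the rule: index(other) = index(this) + relative_index.
    """
    resolved = {
        txn["tx_id"]: txn["absolute_index"]
        for txn in group
        if txn.get("absolute_index") is not None
    }
    changed = True
    while changed:
        changed = False
        for txn in group:
            this_id = txn["tx_id"]
            for rel in txn.get("relative_indexes", []):
                other_id, offset = rel["other_tx_id"], rel["relative_index"]
                if this_id in resolved and other_id not in resolved:
                    resolved[other_id] = resolved[this_id] + offset
                    changed = True
                elif other_id in resolved and this_id not in resolved:
                    resolved[this_id] = resolved[other_id] - offset
                    changed = True
    return resolved


example_group = [
    {"tx_id": "T1", "absolute_index": 0},
    {"tx_id": "T2", "relative_indexes": [{"other_tx_id": "T1", "relative_index": -1}]},
]
positions = resolve_indices(example_group)
```

In this example T1 is pinned at index 0 and T2 expects T1 immediately before it, so T2 resolves to index 1. Transactions with no absolute anchor anywhere in their chain stay unresolved.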
- I would add an entry/exit_blocks in Function.
- Should we consider the possibility of a function/subroutine having more than one entry point? I haven't seen this happen on other smart contract platforms, but because we can have a custom CFG here, could it happen if a function is an irreducible loop? Hopefully not, as we should still have one entry point, but it is better to consider it in advance.
- For the user configuration, I think we should not use a dictionary, but a class. We should also load the information from YAML, or something similar.
Example config:
name: protocol name
contracts:
  - name: contract1
    path: contracts/contract1.teal
    type: LogicSig
    version: 6
    subroutines:
      - sub1
      - sub2
    functions:
      - name: function1
        execution_path: [B0, B1, B2]
        entry: B2
        exit: [B10, B11]
      - name: function2
        execution_path: [B0, B1, B3]
        entry: B3
        exit: [B12, B13]
  - name: contract2
    path: contracts/contract2.teal
    type: ApprovalProgram
    version: 6
    subroutines:
      - opt
      - delete
    functions:
      - name: init
        execution_path: [B0, B1]
        entry: B1
        exit: [B13]
      - name: clear
        execution_path: [B0, B2]
        entry: B2
        exit: [B13]
groups:
  - - txn_id: T1
      txn_type: pay
      logic_sig:
        contract: contract1
        function: function1
      absolute_index: 0
    - txn_id: T2
      txn_type: appl
      application:
        contract: contract2
        function: init
      absolute_index: 1
  - - txn_id: T1
      txn_type: axfer
    - txn_id: T2
      txn_type: appl
      logic_sig:
        contract: contract1
        function: function2
      application:
        contract: contract2
        function: clear
      relative_indexes:
        - other_txn_id: T1
          relative_index: -1
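The class-based configuration suggested above could be sketched with dataclasses as follows. The class names `FunctionConfig` and `ContractConfig` and the `from_dict` constructor are assumptions for illustration; field names follow the example config. The input is a plain dictionary, which is what a YAML loader such as PyYAML's `yaml.safe_load` would be expected to return for one entry of `contracts`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FunctionConfig:
    name: str
    execution_path: List[str]
    entry: str
    exit: List[str]


@dataclass
class ContractConfig:
    name: str
    path: str
    type: str
    version: int
    subroutines: List[str] = field(default_factory=list)
    functions: List[FunctionConfig] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ContractConfig":
        """Build a ContractConfig from one parsed entry of the `contracts` list."""
        return cls(
            name=raw["name"],
            path=raw["path"],
            type=raw["type"],
            version=raw["version"],
            subroutines=list(raw.get("subroutines", [])),
            functions=[FunctionConfig(**f) for f in raw.get("functions", [])],
        )


# Parsed form of the second contract in the example config.
raw_contract = {
    "name": "contract2",
    "path": "contracts/contract2.teal",
    "type": "ApprovalProgram",
    "version": 6,
    "subroutines": ["opt", "delete"],
    "functions": [
        {"name": "init", "execution_path": ["B0", "B1"], "entry": "B1", "exit": ["B13"]},
        {"name": "clear", "execution_path": ["B0", "B2"], "entry": "B2", "exit": ["B13"]},
    ],
}
contract = ContractConfig.from_dict(raw_contract)
```

Because `FunctionConfig(**f)` rejects unknown or missing keys with a TypeError, typos in the config surface at load time rather than during analysis.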