go-ethereum
Refactor state journalling system
The journal is our append-only tracker of state changes; it is what makes it possible to revert call scopes.
Originally, we did copy-on-write: whenever we entered a new call, we copied the entire statedb and all objects, and operated on the copy. This bit us during the Shanghai attacks, and we switched to a journal. Every time we make a modification, we add an entry recording the previous value, e.g.
0: account 0xA balancechange, was 1
1: account 0xA balancechange, was 2
2: account 0xA storagechange, key 0x123, was 0x00
3: account 0xA storagechange, key 0x123, was 0x01
At any point, we can revert by undoing the entries in reverse order.
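As a rough illustration of the idea (not the actual go-ethereum code), a linear journal can be sketched as below. The `account`, `entry`, `balanceChange`, `storageChange` and `journal` types are hypothetical and track a single account only, to keep the sketch small.

```go
package main

// account is a hypothetical, simplified stand-in for a state object.
type account struct {
	balance uint64
	storage map[string]string
}

// entry knows how to undo exactly one modification.
type entry interface {
	revert(a *account)
}

type balanceChange struct{ prev uint64 }
type storageChange struct{ key, prev string }

func (c balanceChange) revert(a *account) { a.balance = c.prev }
func (c storageChange) revert(a *account) { a.storage[c.key] = c.prev }

// journal is an append-only list of undo entries.
type journal struct{ entries []entry }

func (j *journal) append(e entry) { j.entries = append(j.entries, e) }
func (j *journal) length() int { return len(j.entries) }

// revertTo undoes every entry at index >= snapshot, newest first,
// and then drops those entries from the journal.
func (j *journal) revertTo(a *account, snapshot int) {
	for i := len(j.entries) - 1; i >= snapshot; i-- {
		j.entries[i].revert(a)
	}
	j.entries = j.entries[:snapshot]
}
```

The reverse order matters: if the same slot was written twice, undoing the newest entry first means the oldest recorded value is the one that ends up in state.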
The journal is not aware of different scopes; it is just aware of a long list. The statedb tracks which indexes correlate to scopes:
func (s *StateDB) Snapshot() int {
	id := s.nextRevisionId
	s.nextRevisionId++
	s.validRevisions = append(s.validRevisions, revision{id, s.journal.length()})
	return id
}
Also, the journal is very basic: the events are added directly by the state package:
func (s *stateObject) SetNonce(nonce uint64) {
	s.db.journal.append(nonceChange{
		account: &s.address,
		prev:    s.data.Nonce,
	})
	s.setNonce(nonce)
}
Changing the API of the journal
In order to make changes to the journal possible, a few changes should be introduced. First of all, instead of external callers just appending changes, they should invoke methods, such as:
type ChangeJournal interface {
	// Changes involving accounts
	JournalCreateObject(common.Address)
	JournalSelfDestruct(a common.Address, prev bool, prevBalance uint256.Int)
	JournalBalance(a common.Address, prev uint256.Int)
	JournalNonce(a common.Address, prev uint64)
	JournalStorage(a common.Address, key, prevValue common.Hash)
	JournalTStorage(a common.Address, key, prevValue common.Hash)
	JournalCode(a common.Address, prevCode, prevHash []byte)
	// Changes involving other state values
	JournalRefund(prev uint64)
	JournalLog(txHash common.Hash)
	JournalTouch(hash common.Hash)
	JournalAccessListAddAccount(address common.Address)
	JournalAccessListAddSlot(address common.Address, slot common.Hash)
}
(The method JournalReset has been left out; I'm thinking we'll merge https://github.com/ethereum/go-ethereum/pull/28666.)
Then external callers would do
func (s *stateObject) SetNonce(nonce uint64) {
	s.db.journal.JournalNonce(s.address, s.data.Nonce)
	s.setNonce(nonce)
}
By doing this, we leave it up to the journal internals exactly how to store changes.
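To illustrate why this matters, here is a small sketch in which the caller only sees a method, while the journal decides how to store the change. Everything here is hypothetical: `Address` stands in for `common.Address`, `nonceJournal` is a one-method slice of the interface above, and `linearJournal` is just one possible backing store.

```go
package main

// Address stands in for common.Address (assumption, to keep the sketch dependency-free).
type Address [20]byte

// nonceJournal is a hypothetical one-method subset of the proposed interface.
type nonceJournal interface {
	JournalNonce(a Address, prev uint64)
}

// nonceEntry is how this particular implementation chooses to store a change.
type nonceEntry struct {
	account Address
	prev    uint64
}

// linearJournal keeps a flat list of entries, like the current journal does.
type linearJournal struct {
	entries []nonceEntry
}

func (j *linearJournal) JournalNonce(a Address, prev uint64) {
	j.entries = append(j.entries, nonceEntry{account: a, prev: prev})
}

// stateObject is a simplified stand-in; it never constructs journal entries itself.
type stateObject struct {
	address Address
	nonce   uint64
	journal nonceJournal
}

func (s *stateObject) SetNonce(nonce uint64) {
	s.journal.JournalNonce(s.address, s.nonce) // record the old value first
	s.nonce = nonce
}
```

With this shape, swapping `linearJournal` for a set-based implementation would not require touching `SetNonce` at all.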
Marking scopes
Secondly, we should move the scope-awareness into the Journal.
// Marks that a new scope has started. This method returns an identifier,
// which can be used to revert the changes in this scope.
NewScope() int
// Marks that the scope has ended. An ended scope is either not reverted,
// or reverted in full when/if the parent scope reverts.
EndScope(int)
// RevertScope reverts the changes in the given scope.
RevertScope(*StateDB, int)
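One possible reading of these three methods, sketched on top of a plain linear journal. This is only an assumption about the semantics: `state`, `entry`, `scope`, `scopedJournal` and `JournalValue` are hypothetical names, and state is reduced to a map of counters.

```go
package main

// state is a hypothetical stand-in for *StateDB.
type state map[string]uint64

type entry struct {
	key  string
	prev uint64
}

type scope struct {
	id    int
	start int // journal length when the scope began
}

// scopedJournal owns both the entries and the scope bookkeeping.
type scopedJournal struct {
	entries []entry
	scopes  []scope
	nextID  int
}

func (j *scopedJournal) JournalValue(key string, prev uint64) {
	j.entries = append(j.entries, entry{key: key, prev: prev})
}

func (j *scopedJournal) NewScope() int {
	id := j.nextID
	j.nextID++
	j.scopes = append(j.scopes, scope{id: id, start: len(j.entries)})
	return id
}

// find returns the stack position of the scope with the given id.
func (j *scopedJournal) find(id int) int {
	for i := len(j.scopes) - 1; i >= 0; i-- {
		if j.scopes[i].id == id {
			return i
		}
	}
	panic("unknown scope id")
}

// EndScope forgets the scope marker but keeps its entries, so they are
// still undone if an enclosing scope reverts later.
func (j *scopedJournal) EndScope(id int) {
	j.scopes = j.scopes[:j.find(id)]
}

// RevertScope undoes everything journalled since the scope started.
func (j *scopedJournal) RevertScope(s state, id int) {
	idx := j.find(id)
	start := j.scopes[idx].start
	for i := len(j.entries) - 1; i >= start; i-- {
		s[j.entries[i].key] = j.entries[i].prev
	}
	j.entries = j.entries[:start]
	j.scopes = j.scopes[:idx]
}
```

The point of the sketch is that `nextRevisionId` and `validRevisions` no longer need to live in the statedb; the journal can keep them private.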
So the callers would change from
func (s *StateDB) Snapshot() int {
	id := s.nextRevisionId
	s.nextRevisionId++
	s.validRevisions = append(s.validRevisions, revision{id, s.journal.length()})
	return id
}

// RevertToSnapshot reverts all state changes made since the given revision.
func (s *StateDB) RevertToSnapshot(revid int) {
	// Find the snapshot in the stack of valid snapshots.
	idx := sort.Search(len(s.validRevisions), func(i int) bool {
		return s.validRevisions[i].id >= revid
	})
	if idx == len(s.validRevisions) || s.validRevisions[idx].id != revid {
		panic(fmt.Errorf("revision id %v cannot be reverted", revid))
	}
	snapshot := s.validRevisions[idx].journalIndex
	// Replay the journal to undo changes and remove invalidated snapshots
	s.journal.revert(s, snapshot)
	s.validRevisions = s.validRevisions[:idx]
}
into
// Snapshot returns an identifier for the current revision of the state.
func (s *StateDB) Snapshot() int {
	return s.journal.NewScope()
}

// RevertToSnapshot reverts all state changes made since the given revision.
func (s *StateDB) RevertToSnapshot(revid int) {
	s.journal.RevertScope(s, revid)
}
Using Sets
After these changes are in place, we can start collecting changesets per scope, instead of linearly. For example, a contract which re-uses a storage slot will produce several journal entries:
2: account 0xA storagechange, key 0x123, was 0x00
3: account 0xA storagechange, key 0x123, was 0x01
4: account 0xA storagechange, key 0x123, was 0x02
These can all be represented by a single journal entry, since only the oldest value (0x00) is needed to revert the scope. This can be done either naively by merging journal entries, or by using a more elaborate scope representation. For example:
type storageChanges map[common.Hash]common.Hash
type ScopeChanges struct {
	storageChanges map[common.Address]storageChanges
	nonceChanges   map[common.Address]uint64
	balanceChanges map[common.Address]uint256.Int
	...
}
These changes are possible as long as the changes do not interfere with each other. It does not matter whether a nonceChange is reverted before or after a balanceChange. Some care needs to be taken with the selfdestruct change in this respect.
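A minimal sketch of how such a set-based scope could behave, under the assumption that only the first (oldest) previous value per key is kept. `Address` and `Hash` stand in for `common.Address` and `common.Hash`, and `scopeChanges`, `newScopeChanges` and `revert` are hypothetical names; balances and selfdestruct are left out.

```go
package main

// Address and Hash stand in for common.Address and common.Hash (assumption).
type Address [20]byte
type Hash [32]byte

type storageChanges map[Hash]Hash

// scopeChanges holds, per scope, the value each item had before its first write.
type scopeChanges struct {
	storage map[Address]storageChanges
	nonces  map[Address]uint64
}

func newScopeChanges() *scopeChanges {
	return &scopeChanges{
		storage: make(map[Address]storageChanges),
		nonces:  make(map[Address]uint64),
	}
}

// JournalStorage records prev only for the first write to (a, key) in this
// scope; later writes to the same slot add nothing.
func (c *scopeChanges) JournalStorage(a Address, key, prev Hash) {
	slots, ok := c.storage[a]
	if !ok {
		slots = make(storageChanges)
		c.storage[a] = slots
	}
	if _, seen := slots[key]; !seen {
		slots[key] = prev
	}
}

func (c *scopeChanges) JournalNonce(a Address, prev uint64) {
	if _, seen := c.nonces[a]; !seen {
		c.nonces[a] = prev
	}
}

// revert restores the recorded values. Map iteration order is random in Go,
// which is fine here because the individual changes are independent.
func (c *scopeChanges) revert(storage map[Address]map[Hash]Hash, nonces map[Address]uint64) {
	for a, slots := range c.storage {
		if storage[a] == nil {
			storage[a] = make(map[Hash]Hash)
		}
		for key, prev := range slots {
			storage[a][key] = prev
		}
	}
	for a, prev := range c.nonces {
		nonces[a] = prev
	}
}
```

Three writes to the same slot then cost one map entry instead of three journal entries, at the price of a map lookup per write.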
Also, the case when a child scope finishes is a bit finicky. It can be "merged up" into the parent scope, which is possibly wasted work. However, if it is not "merged up", then the work performed after the call returns needs to be considered its own, new scope. For example:
1. sstore(0,1)
2. call(b) // might call sstore(0,2) on this same address
3. sstore(0,3)
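To make the three steps above concrete, here is a sketch of the "merge up" variant only. It assumes a scope is just a map from slot to the value before the first write; `slotChanges`, `journal`, `mergeUp` and `revert` are hypothetical names.

```go
package main

// slotChanges maps a storage slot to the value it held before the first
// write in a scope (hypothetical, single-account simplification).
type slotChanges map[uint64]uint64

func (c slotChanges) journal(slot, prev uint64) {
	if _, seen := c[slot]; !seen {
		c[slot] = prev
	}
}

// mergeUp folds a finished, non-reverted child scope into its parent.
// Where both scopes touched a slot, the parent's value is older and wins.
func mergeUp(parent, child slotChanges) {
	for slot, prev := range child {
		if _, seen := parent[slot]; !seen {
			parent[slot] = prev
		}
	}
}

func (c slotChanges) revert(storage map[uint64]uint64) {
	for slot, prev := range c {
		storage[slot] = prev
	}
}
```

After merging, step 3 lands in the same parent set as step 1, so no extra scope is needed for the work done after the call. The cost is the merge loop itself, which is wasted if the parent never reverts.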