Refactor state journalling system

Open holiman opened this issue 1 year ago • 0 comments

The journal is our append-only tracker for changes to state, which makes it possible to revert call scopes.

Originally, we did copy-on-write: whenever we entered a new call, we copied the entire statedb and all objects, and operated on the copy. This bit us back in the Shanghai attacks, and we switched to a journal. Every time we make a modification, we append an entry, e.g.:

0: account 0xA balancechange, was 1
1: account 0xA balancechange, was 2
2: account 0xA storagechange, key 0x123, was 0x00
3: account 0xA storagechange, key 0x123, was 0x01

At any point we can revert by undoing the entries in reverse order.
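A minimal sketch of that linear journal, to show why the order of reverting matters. The `account` type, the string-typed storage and `revertTo` are made up for illustration; they are not the actual go-ethereum types.

```go
package main

// account is a toy stand-in for a state object (hypothetical).
type account struct {
	balance uint64
	storage map[string]string
}

// entry knows how to undo exactly one change.
type entry interface {
	revert(a *account)
}

type balanceChange struct{ prev uint64 }
type storageChange struct{ key, prev string }

func (c balanceChange) revert(a *account) { a.balance = c.prev }
func (c storageChange) revert(a *account) { a.storage[c.key] = c.prev }

// journal is just a long list; it knows nothing about scopes.
type journal struct{ entries []entry }

func (j *journal) append(e entry) { j.entries = append(j.entries, e) }
func (j *journal) length() int    { return len(j.entries) }

// revertTo undoes entries newest-first until the journal is back at the
// given length. Walking oldest-first would leave a slot that was written
// twice at its intermediate value instead of its original one.
func (j *journal) revertTo(a *account, length int) {
	for i := len(j.entries) - 1; i >= length; i-- {
		j.entries[i].revert(a)
	}
	j.entries = j.entries[:length]
}
```

A caller records `j.length()` before entering a call and passes that index to `revertTo` if the call fails.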

The journal is not aware of different scopes; it only knows about one long list. The statedb tracks which journal indexes correspond to which scopes:

func (s *StateDB) Snapshot() int {
	id := s.nextRevisionId
	s.nextRevisionId++
	s.validRevisions = append(s.validRevisions, revision{id, s.journal.length()})
	return id
}

Also, the journal is very basic: the entries are constructed and appended by the state package itself:

func (s *stateObject) SetNonce(nonce uint64) {
	s.db.journal.append(nonceChange{
		account: &s.address,
		prev:    s.data.Nonce,
	})
	s.setNonce(nonce)
}

Changing the API of the journal

In order to make changes to the journal possible, a few changes should be introduced. First of all, instead of external callers just appending changes, they should invoke methods, such as:

type ChangeJournal interface {
	// Changes involving accounts
	JournalCreateObject(common.Address)
	JournalSelfDestruct(a common.Address, prev bool, prevBalance uint256.Int)
	JournalBalance(a common.Address, prev uint256.Int)
	JournalNonce(a common.Address, prev uint64)
	JournalStorage(a common.Address, key, prevValue common.Hash)
	JournalTStorage(a common.Address, key, prevValue common.Hash)
	JournalCode(a common.Address, key, prevCode, prevHash []byte)

	// Changes involving other state values
	JournalRefund(prev uint64)
	JournalLog(txHash common.Hash)
	JournalTouch(hash common.Hash)
	JournalAccessListAddAccount(address common.Address)
	JournalAccessListAddSlot(address common.Address, slot common.Hash)
}

(The method JournalReset has been left out; I'm thinking we'll merge https://github.com/ethereum/go-ethereum/pull/28666.)

Then external callers would do

func (s *stateObject) SetNonce(nonce uint64) {
	s.db.journal.JournalNonce(s.address, s.data.Nonce)
	s.setNonce(nonce)
}

By doing this, we leave it up to the journal internals exactly how to store changes.

Marking scopes

Secondly, we should move the scope-awareness into the Journal.

	// Marks that a new scope has started. This method returns an identifier,
	// which can be used to revert the changes in this scope
	NewScope() int
	// Marks that the scope has ended. An ended scope is either not reverted,
	// or reverted in full when/if the parent scope reverts.
	EndScope(int)
	// RevertScope reverts the changes in the given scope.
	RevertScope(*StateDB, int)

So the callers would change from

func (s *StateDB) Snapshot() int {
	id := s.nextRevisionId
	s.nextRevisionId++
	s.validRevisions = append(s.validRevisions, revision{id, s.journal.length()})
	return id
}

// RevertToSnapshot reverts all state changes made since the given revision.
func (s *StateDB) RevertToSnapshot(revid int) {
	// Find the snapshot in the stack of valid snapshots.
	idx := sort.Search(len(s.validRevisions), func(i int) bool {
		return s.validRevisions[i].id >= revid
	})
	if idx == len(s.validRevisions) || s.validRevisions[idx].id != revid {
		panic(fmt.Errorf("revision id %v cannot be reverted", revid))
	}
	snapshot := s.validRevisions[idx].journalIndex

	// Replay the journal to undo changes and remove invalidated snapshots
	s.journal.revert(s, snapshot)
	s.validRevisions = s.validRevisions[:idx]
}

into

// Snapshot returns an identifier for the current revision of the state.
func (s *StateDB) Snapshot() int {
	return s.journal.NewScope()
}

// RevertToSnapshot reverts all state changes made since the given revision.
func (s *StateDB) RevertToSnapshot(revid int) {
	s.journal.RevertScope(s, revid)
}
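A rough sketch of what a scope-aware journal could look like internally. It makes a few simplifying assumptions that are not part of the proposal: entries are stored as undo closures, so `RevertScope` does not need the `*StateDB` argument, the scope identifier is simply the nesting depth, and `journalNonce` is a hypothetical helper taking a pointer instead of an address.

```go
package main

// scopedJournal keeps the scope bookkeeping that currently lives in StateDB.
type scopedJournal struct {
	entries []func() // undo closures, oldest first (assumption)
	scopes  []int    // scopes[id] = journal length when scope id was opened
}

// NewScope marks that a new scope has started and returns its identifier.
func (j *scopedJournal) NewScope() int {
	j.scopes = append(j.scopes, len(j.entries))
	return len(j.scopes) - 1
}

// EndScope marks the scope as finished without reverting it. Its entries
// stay in the journal, so they are undone if the parent scope reverts.
func (j *scopedJournal) EndScope(id int) {
	j.scopes = j.scopes[:id]
}

// RevertScope undoes every change made since the scope was opened,
// newest first, and drops the scope together with any nested scopes.
func (j *scopedJournal) RevertScope(id int) {
	start := j.scopes[id]
	for i := len(j.entries) - 1; i >= start; i-- {
		j.entries[i]()
	}
	j.entries = j.entries[:start]
	j.scopes = j.scopes[:id]
}

// journalNonce records the current value so that it can be restored later.
func (j *scopedJournal) journalNonce(nonce *uint64) {
	prev := *nonce
	j.entries = append(j.entries, func() { *nonce = prev })
}
```

With this shape, `Snapshot` and `RevertToSnapshot` become one-line wrappers and the `validRevisions` search disappears from the statedb.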

Using Sets

After these changes are in place, we can start collecting changesets per scope instead of linearly. For example, a contract which re-uses a storage slot will produce several journal entries:

2: account 0xA storagechange, key 0x123, was 0x00
3: account 0xA storagechange, key 0x123, was 0x01
4: account 0xA storagechange, key 0x123, was 0x02

These can all be represented by a single journal entry, either naively by merging journal entries, or by using a more elaborate scope. For example:

type storageChanges map[common.Hash]common.Hash

type ScopeChanges struct {
	storageChanges map[common.Address]storageChanges
	nonceChanges   map[common.Address]uint64
	balanceChanges map[common.Address]uint256.Int
	// ...
}

These changes are possible as long as the individual changes do not interfere with each other: it does not matter whether a nonceChange is reverted before or after a balanceChange. Some care needs to be taken with the selfdestruct change in this respect.
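To make the idea concrete, here is a small sketch of a set-based scope that keeps only the value a slot had before its first write in the scope. The `address` and `hash` string types stand in for `common.Address` and `common.Hash`, and `world`, `journalStorage`, `journalNonce` and `revert` are hypothetical names used only for this illustration.

```go
package main

type address string // stand-in for common.Address
type hash string    // stand-in for common.Hash

type storageChanges map[hash]hash

// scopeChanges holds, per account, the values from before the scope started.
type scopeChanges struct {
	storage map[address]storageChanges
	nonces  map[address]uint64
}

func newScopeChanges() *scopeChanges {
	return &scopeChanges{
		storage: make(map[address]storageChanges),
		nonces:  make(map[address]uint64),
	}
}

// journalStorage records prev only the first time a slot is touched in this
// scope; later writes to the same slot add nothing.
func (s *scopeChanges) journalStorage(a address, key, prev hash) {
	slots, ok := s.storage[a]
	if !ok {
		slots = make(storageChanges)
		s.storage[a] = slots
	}
	if _, seen := slots[key]; !seen {
		slots[key] = prev
	}
}

// journalNonce records the nonce from before the first change in this scope.
func (s *scopeChanges) journalNonce(a address, prev uint64) {
	if _, seen := s.nonces[a]; !seen {
		s.nonces[a] = prev
	}
}

// world is a toy state to revert against (hypothetical).
type world struct {
	storage map[address]map[hash]hash
	nonces  map[address]uint64
}

// revert restores all recorded values. Map iteration order is random, which
// is fine here because the recorded changes are independent of each other.
func (s *scopeChanges) revert(w *world) {
	for a, slots := range s.storage {
		for key, prev := range slots {
			w.storage[a][key] = prev
		}
	}
	for a, prev := range s.nonces {
		w.nonces[a] = prev
	}
}
```

Three writes to slot 0x123 leave one entry in the set rather than three, and reverting goes straight back to the original value.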

Also, the case when a child scope finishes is a bit finicky. It can be "merged up" into the parent scope, which is possibly wasted work. However, if it is not "merged up", then the work performed after the call returns needs to be considered its own, new scope.

1. sstore(0,1)
2. call(b)  // might call sstore(0,2) on this same address
3. sstore(0,3)
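The "merge up" variant for this sequence could look roughly like the sketch below, restricted to the storage slots of a single account. `slotChanges`, `store`, `mergeUp` and `revert` are hypothetical names, and string keys and values are used only to keep the example short.

```go
package main

// slotChanges maps a slot to the value it had before the first write in a scope.
type slotChanges map[string]string

// store journals the previous value (once per scope) and then writes.
func store(scope slotChanges, state map[string]string, key, value string) {
	if _, seen := scope[key]; !seen {
		scope[key] = state[key]
	}
	state[key] = value
}

// mergeUp folds a finished, non-reverted child scope into its parent.
// If the parent already has an entry for a slot, that entry wins, because
// it holds the older value; otherwise the child's entry is adopted.
func mergeUp(parent, child slotChanges) {
	for key, prev := range child {
		if _, seen := parent[key]; !seen {
			parent[key] = prev
		}
	}
}

// revert restores every slot recorded in the scope.
func revert(scope slotChanges, state map[string]string) {
	for key, prev := range scope {
		state[key] = prev
	}
}
```

After merging, step 3 lands in the parent scope as a no-op for the journal, since slot 0 is already recorded there. The cost is the loop in `mergeUp` on every successful call return, which is the possibly wasted work mentioned above.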

holiman • Jan 25 '24 08:01