fix(audit): Remove Task data from aggregator after a task has been responded to or has expired.
This PR:
Removes Task data from the aggregator's response maps after a task has been responded to or has expired.
closes #977
# To Test
Set up the local devnet as per the guide:
```bash
make anvil_start_with_block_time
make aggregator_start
make operator_register_and_start
make batcher_start_local
```
To confirm, you can verify that the aggregator mappings for a specific task have been removed by adding:
```go
agg.AggregatorConfig.BaseConfig.Logger.Info(
	"Info deleted from Batcher:",
	"Task Index", blsAggServiceResp.TaskIndex,
	"Batch Identifier Hash", batchIdentifierHash,
	"agg.batchesIdxByIdentifierHash:", agg.batchesIdxByIdentifierHash[batchIdentifierHash],
	"agg.batchCreatedBlockByIdx:", agg.batchCreatedBlockByIdx[blsAggServiceResp.TaskIndex],
	"agg.batchIdentifierHashByIdx:", agg.batchesIdentifierHashByIdx[blsAggServiceResp.TaskIndex],
	"agg.batchDataByIdentifierHash:", agg.batchDataByIdentifierHash[batchIdentifierHash],
)
```
to line 324 in aggregator/internal/pkg/aggregator.go
and
```go
agg.AggregatorConfig.BaseConfig.Logger.Info(
	"Info deleted from Batcher:",
	"Task Index", batchIndex,
	"Batch Identifier Hash", batchIdentifierHash,
	"agg.batchesIdxByIdentifierHash:", agg.batchesIdxByIdentifierHash[batchIdentifierHash],
	"agg.batchCreatedBlockByIdx:", agg.batchCreatedBlockByIdx[batchIndex],
	"agg.batchIdentifierHashByIdx:", agg.batchesIdentifierHashByIdx[batchIndex],
	"agg.batchDataByIdentifierHash:", agg.batchDataByIdentifierHash[batchIdentifierHash],
)
```
to line 410 in aggregator/internal/pkg/aggregator.go.
I ran a local testnet with 3 operators. The aggregator did not experience any errors and successfully removed the task information from its maps after responding.
Changed this PR to draft since we should prioritize other issues first
What happens if a response arrives after the batch is closed? It should probably be tested with more operators.
When the aggregator receives a response, it checks that the batch indeed exists here (sketched after the logs below). Anyway, I tested it on my machine as follows:
- Registered four operators
- Killed one
- Sent a bunch of proofs
- Waited for the aggregator to respond to the task
- Woke the killed operator; when it got the latest task, it sent its response and the aggregator ignored it, logging the following message:
```
[2024-09-27 11:54:10.904 -03] INFO (pkg/server.go:63) - Locked Resources: Starting processing of Response
[2024-09-27 11:54:10.904 -03] INFO (pkg/server.go:67) - Unlocked Resources: Task not found in the internal map
```
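For reference, the check being hit there amounts to roughly the following. This is a minimal standalone sketch; the struct, field, and function names are illustrative assumptions, not the exact code in pkg/server.go:

```go
package main

import (
	"log"
	"sync"
)

// Minimal stand-in for the aggregator state; the real struct in
// aggregator/internal/pkg/aggregator.go holds more fields than this.
type aggregatorState struct {
	mu                         sync.Mutex
	batchesIdentifierHashByIdx map[uint32][32]byte
}

// handleResponse mimics the check above: if the task index has already been
// deleted from the maps, the late response is simply ignored.
func (a *aggregatorState) handleResponse(taskIndex uint32) {
	a.mu.Lock()
	defer a.mu.Unlock()

	if _, ok := a.batchesIdentifierHashByIdx[taskIndex]; !ok {
		log.Printf("Task %d not found in the internal map, ignoring late response", taskIndex)
		return
	}
	// ...otherwise continue aggregating the operator's signature as usual...
}

func main() {
	agg := &aggregatorState{batchesIdentifierHashByIdx: map[uint32][32]byte{}}
	agg.handleResponse(7) // task 7 is no longer tracked, so the response is ignored
}
```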
IMO, the deletion of the task from the maps should be delayed, so the aggregator will still accept responses from operators after the quorum is reached.
It can be done with a goroutine like:

```go
go func() {
	// Give late operators a grace period before cleaning up.
	time.Sleep(10 * time.Second)
	// Deletion logic
}()
```
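A slightly fuller sketch of that idea, assuming the deletion must run under the aggregator's lock since the goroutine runs concurrently with the RPC handlers. The mutex, the map value types, and the 10-second grace period are assumptions; only the four map names come from the logging snippets above:

```go
package aggregator

import (
	"sync"
	"time"
)

// scheduleTaskCleanup delays the deletion so operators that respond shortly
// after the quorum is reached are still recognized. The caller passes the
// aggregator's mutex and the four maps that this PR cleans up.
func scheduleTaskCleanup(
	mu *sync.Mutex,
	taskIndex uint32,
	batchIdentifierHash [32]byte,
	batchesIdxByIdentifierHash map[[32]byte]uint32,
	batchCreatedBlockByIdx map[uint32]uint64,
	batchesIdentifierHashByIdx map[uint32][32]byte,
	batchDataByIdentifierHash map[[32]byte][]byte,
) {
	go func() {
		time.Sleep(10 * time.Second) // grace period for late responses

		mu.Lock()
		defer mu.Unlock()
		delete(batchesIdxByIdentifierHash, batchIdentifierHash)
		delete(batchCreatedBlockByIdx, taskIndex)
		delete(batchesIdentifierHashByIdx, taskIndex)
		delete(batchDataByIdentifierHash, batchIdentifierHash)
	}()
}
```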
The deletion is done once the task has been verified in Ethereum (so there is already a window between when the quorum is reached and when the transaction is accepted in Ethereum). So I'm not sure whether we should add an extra delay before the deletion.
That sounds to me like what we really want is a kind of queue (jargon for "an array") and a periodic task that checks whether we're done with each batch.
EDIT: actually, I think it's simpler than that. We already have indices for each batch. Every once in a while (the appropriate time is probably around the time Ethereum takes to confirm our transactions) we check the oldest (lowest) index, find its Merkle root, and use that and the index to clean the dictionaries. This could be a long-lived goroutine started by main.
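A sketch of what that long-lived goroutine could look like. The struct fields, the interval, and the cleanup condition are assumptions; only the four map names are taken from this PR, and a real implementation would also confirm the oldest batch has actually been responded to before deleting it:

```go
package aggregator

import (
	"sync"
	"time"
)

// Minimal stand-in for the aggregator; the real struct has more fields.
type Aggregator struct {
	mu                         sync.Mutex
	batchesIdxByIdentifierHash map[[32]byte]uint32
	batchCreatedBlockByIdx     map[uint32]uint64
	batchesIdentifierHashByIdx map[uint32][32]byte
	batchDataByIdentifierHash  map[[32]byte][]byte
}

// startPeriodicCleanup implements the long-lived goroutine idea: roughly
// every Ethereum confirmation interval, take the oldest (lowest) tracked
// batch index, look up its identifier hash (Merkle root), and remove both
// entries from all four maps.
func (agg *Aggregator) startPeriodicCleanup(interval time.Duration, done <-chan struct{}) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-done:
				return
			case <-ticker.C:
				agg.mu.Lock()
				// Find the oldest (lowest) batch index still in the maps.
				oldest, found := uint32(0), false
				for idx := range agg.batchesIdentifierHashByIdx {
					if !found || idx < oldest {
						oldest, found = idx, true
					}
				}
				if found {
					hash := agg.batchesIdentifierHashByIdx[oldest]
					delete(agg.batchesIdxByIdentifierHash, hash)
					delete(agg.batchCreatedBlockByIdx, oldest)
					delete(agg.batchesIdentifierHashByIdx, oldest)
					delete(agg.batchDataByIdentifierHash, hash)
				}
				agg.mu.Unlock()
			}
		}
	}()
}
```

From main this would just be something like `agg.startPeriodicCleanup(15*time.Second, done)`, with the interval tuned to roughly the time Ethereum takes to confirm our transactions.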