calyx Tracker for realistic queue benchmarking harness

Open polybeandip opened this issue 1 year ago • 1 comments

At a high level, our Shared Testing Harness works by processing a workload of pushes and pops as quickly as possible.

Benefits of the current setup.

There's a simple way to verify correctness: simply check the hardware output matches that of the oracle.
Benchmarking our queues is straightforward:
1. run synthesis and compute cycle counts
2. compute

total_time = cycle_count / (1000/(7 - worst_slack))

to estimate the total time spent on our workload. A smaller total_time roughly means a faster queue!
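
As a sanity check on units, here is a minimal Python sketch of the estimate. It assumes that 7 is the target clock period in ns used for synthesis and that worst_slack is reported in ns, so that 1000/(7 - worst_slack) is the achieved frequency in MHz. Elapsed time is then cycles divided by frequency (equivalently, cycles times the achieved period), which comes out in microseconds. The function and parameter names are made up for illustration.

```python
def total_time_us(cycle_count: int, worst_slack_ns: float,
                  target_period_ns: float = 7.0) -> float:
    """Estimate workload time in microseconds.

    Assumption: target_period_ns is the clock constraint given to synthesis,
    and worst_slack_ns is the worst slack reported against that constraint.
    """
    achieved_period_ns = target_period_ns - worst_slack_ns
    if achieved_period_ns <= 0:
        raise ValueError("slack must be smaller than the target period")
    frequency_mhz = 1000.0 / achieved_period_ns
    # cycles / MHz = microseconds
    return cycle_count / frequency_mhz


if __name__ == "__main__":
    # 1000 cycles at a 5 ns achieved period (200 MHz)
    print(total_time_us(1000, worst_slack_ns=2.0))
```

With more slack the achieved period shrinks, so the same cycle count yields a smaller total_time.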

Drawbacks of the current setup.

This is an unrealistic depiction of the way switches process packets.
- IRL, our queues can't look into the future and know the entire workload of pushes and pops at the start.
- IRL, there may come times where our queue does nothing (when packets are in flight and it's not yet time to call pop). However, since our test harness tries to process all pushes and pops as fast as possible, our tests have no idle time!

We remedy this by making a benchmarking harness that more closely models actual PCAPs. Broadly, we wish to do the following:

Fix a specific clock period for our queues.
Determine the rate at which we call pop.
For each push in our workload, keep track of an "arrival time" for the associated packet.
Actually push a packet only once its arrival time has passed.
- the hardware can do this by counting cycles since we've fixed the clock period
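
The steps above can be sketched in software as follows. This is only a rough model, not the harness itself: it assumes one command per cycle, a pop attempt every `pop_interval` cycles, and a plain FIFO (`collections.deque`) standing in for the queue under test. Real pushes and pops take several cycles in hardware. All names here are hypothetical.

```python
import math
from collections import deque


def to_arrival_cycles(arrival_times_ns, clock_period_ns):
    """Convert arrival times to the first cycle at which each push is allowed."""
    return [math.ceil(t / clock_period_ns) for t in arrival_times_ns]


def run_harness(values, arrival_cycles, pop_interval, max_cycles):
    """Replay pushes gated by arrival cycle, popping at a fixed rate.

    Returns (popped_values, idle_cycles).
    """
    queue = deque()  # stand-in for the real queue
    popped = []
    next_push = 0
    idle_cycles = 0
    for cycle in range(max_cycles):
        if cycle > 0 and cycle % pop_interval == 0:
            # Time to pop; a pop on an empty queue is simply skipped here.
            if queue:
                popped.append(queue.popleft())
        elif next_push < len(values) and arrival_cycles[next_push] <= cycle:
            # The next packet has arrived, so we may push it now.
            queue.append(values[next_push])
            next_push += 1
        else:
            # Nothing has arrived and it is not time to pop.
            idle_cycles += 1
    return popped, idle_cycles


if __name__ == "__main__":
    cycles = to_arrival_cycles([0, 7, 70], clock_period_ns=7)
    print(cycles)
    print(run_harness([10, 20, 30], cycles, pop_interval=4, max_cycles=16))
```

Unlike the current harness, this loop spends cycles idle whenever the next packet is still in flight.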

Challenges with the new setup.

Benchmarking our queues becomes trickier: there's no longer a single number (total_time) we can use to compare designs. Instead, we might consider some subset of the following:
- generate graphs similar to those produced by our simulator
- keep track of how often overflow/underflow occurs
Perhaps we can qualitatively compare queues with the help of these statistics.
We can no longer use this setup to check the correctness of our hardware.
- the number of cycles spent to push and pop now influences the order in which packets are popped
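
For the overflow/underflow statistic, a small software model might look like the sketch below. It assumes a bounded FIFO with a fixed capacity; the class name and fields are hypothetical, and a hardware version would presumably keep these counters in registers or a memory instead.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CountingFifo:
    """Bounded FIFO that records how often it overflows or underflows."""
    capacity: int
    overflows: int = 0
    underflows: int = 0
    _items: deque = field(default_factory=deque)

    def push(self, value) -> bool:
        # A push onto a full queue is dropped and counted as an overflow.
        if len(self._items) >= self.capacity:
            self.overflows += 1
            return False
        self._items.append(value)
        return True

    def pop(self):
        # A pop from an empty queue returns None and counts as an underflow.
        if not self._items:
            self.underflows += 1
            return None
        return self._items.popleft()


if __name__ == "__main__":
    q = CountingFifo(capacity=2)
    q.pop()
    for v in (1, 2, 3):
        q.push(v)
    print(q.overflows, q.underflows)
```

Comparing these two counters across designs on the same workload gives one qualitative handle on how well each queue keeps up.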

[x] Write script to parse PCAPs and generate a .data file. The data file should include the following memories:
- commands, values, ans_mem as usual
- arrival_cycles, to keep track of the packet's arrival time for each push
- mac_addrs, to keep track of the packet's source for each push; we'll use this for flow inference
[ ] Make a Calyx component similar to queue_call.py to repeatedly invoke our queue.
[ ] Generate graphs for our queues in the style of Formal Abstractions and our simulator.
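
For reference, here is a rough sketch of what the PCAP-parsing step could look like. This is not the script from the first task. It assumes a classic pcap file with microsecond timestamps and Ethernet frames, an integer clock period in ns, and memory widths of 32 and 48 bits, all of which are guesses. It only builds arrival_cycles and mac_addrs; commands, values, and ans_mem are left out because their encoding isn't described here.

```python
import json
import struct


def parse_pcap(raw: bytes):
    """Yield (timestamp_ns, src_mac_as_int) for each packet in classic pcap bytes."""
    magic = raw[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(raw):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", raw, offset)
        offset += 16
        frame = raw[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < 12:
            continue  # too short to hold both Ethernet MAC addresses
        src_mac = int.from_bytes(frame[6:12], "big")  # bytes 6..11 are the source
        yield (ts_sec * 1_000_000 + ts_usec) * 1000, src_mac


def mem(values, width):
    """One memory entry in Calyx's JSON data format (width is an assumption)."""
    return {"data": values,
            "format": {"numeric_type": "bitnum", "is_signed": False, "width": width}}


def build_data(packets, clock_period_ns: int):
    """Build the arrival_cycles and mac_addrs memories, relative to the first packet."""
    packets = sorted(packets, key=lambda p: p[0])
    t0 = packets[0][0] if packets else 0
    arrival_cycles = [(t - t0) // clock_period_ns for t, _ in packets]
    mac_addrs = [mac for _, mac in packets]
    return {"arrival_cycles": mem(arrival_cycles, 32),
            "mac_addrs": mem(mac_addrs, 48)}


def write_data_file(packets, clock_period_ns: int, path: str) -> None:
    """Write the memories to `path` as JSON."""
    with open(path, "w") as f:
        json.dump(build_data(packets, clock_period_ns), f, indent=2)
```

Arrival cycles are measured from the first packet's timestamp, so the first push is always eligible at cycle 0.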

Oct 10 '24 21:10 polybeandip

Hooray, looks great! And yes, just bake in some simple flow inference for now.

Oct 14 '24 20:10 anshumanmohan