vtr-verilog-to-routing
vtr-verilog-to-routing copied to clipboard
[Route] Precision Issues in the Connection Router
I have found while working on the parallel connection router that there exists situations where the precision of the cost terms is too low and is causing problems.
I have found this problem while investigating a bug in the bitcoin_miner
circuit as part of the Titan benchmark suite. What appears to happen is that around iteration 68 of the router, the backward cost appears to grow to around 3e-3 / 4e-3 for some connections. What appears to be happening is that the edge costs are so small that, when added to the backward cost, it has no affect. This makes the circuit appear to be nothing but 0 cost edges towards the target node, which regresses the best-first search into basically a heuristically-driven breadth-first search.
This is technically not a problem for the serial router, it may just cause it to spend more time on convergence; however, this is a problem for the parallel connection router.
I have not tested this, but I think this may have even worst affect when the flat router is enabled. When the flat router is enabled, it will add more (low-cost) edges at the end of the wave expansion which are particularly susceptible to this precision issue. @amin1377 I wonder if this could cause the flat router to perform worst?
I am also wondering if this is the cause for the issues @duck2 is facing with the course-grained parallel router. It appears as though bitcoin_miner
is particularly problematic with these precision issues, and turning on flat routing may just make it much worst.
Proposed Solution
I propose that we change the backward and total cost terms stored in the router's data structures from float
to double
. This will give us more precision and should prevent this issue.
This is kicking the can down the road, since this will happen again for even larger circuits, but I cannot think of any other way of preventing this issue. We need some way of adding a very small number to a very large number without loss of precision, and I am not sure if we can exploit logs for numerical stability.
I have found, at least for the parallel connection router, that this fixed the issue I was facing; so this should likely work for circuits much larger than bitcoin_miner
and may be sufficient for the far future.