Wrong Optimized Network Contraction Order
I have encountered a problem when setting the optimized network contraction order in Cytnx version 0.9.7. The attached demo code reproduces the issue: show_problem.zip
The T tensor and its dagger (T.d) are rank-4 tensors with bond dimension chi on every leg:
```
        0                        2
        │                        │
    ┏━━━╳━━━┓                ┏━━━╳━━━┓
1 ──┨   T   ┠── 2        0 ──┨  T.d  ┠── 3
    ┗━━━┳━━━┛                ┗━━━┳━━━┛
        │                        │
        3                        1
```
Below, I use the network:
```
       ┌──────────101──────────┐
       │                       │
   ┏━━━╳━━━┓               ┏━━━╳━━━┓
0 ─┨   Tu  ┠──────102──────┨  Tu.d ┠─ 2
   ┗━━━┳━━━┛               ┗━━━┳━━━┛                ┏━━━━━━┓
      103                     104              0 ───┨      ┠─── 2
   ┏━━━┻━━━┓               ┏━━━┻━━━┓     -->        ┃ TOUT ┃
1 ─┨   Td  ┠──────105──────┨  Td.d ┠─ 3        1 ───┨      ┠─── 3
   ┗━━━┳━━━┛               ┗━━━┳━━━┛                ┗━━━━━━┛
       │                       │
       └──────────106──────────┘
```
or, equivalently, as a Network string:
```python
net = cytnx.Network()
net.FromString(["Tu  : 101,0,102,103",
                "Td  : 103,1,105,106",
                "Tu.d: 102,104,101,2",
                "Td.d: 105,106,104,3",
                "TOUT: 0,1;2,3"])
```
For this network, the optimal contraction order should be ((Tu, Tu.d), (Td, Td.d)):
- first step: (Tu, Tu.d) --> Tu2 and (Td, Td.d) --> Td2, CPU cost O(chi^6), memory O(chi^4);
- second step: (Tu2, Td2) --> TOUT, CPU cost O(chi^6), memory O(chi^4).
However, Network.setOrder(optimal=True) returns the order ((Tu, Td), (Tu.d, Td.d)):
- first step: (Tu, Td) --> Tud1 and (Tu.d, Td.d) --> Tud2, CPU cost O(chi^7), memory O(chi^6);
- second step: (Tud1, Tud2) --> TOUT, CPU cost O(chi^8), memory O(chi^4).
This order has strictly higher CPU cost and memory usage.
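The cost gap above can be cross-checked outside Cytnx. As an independent sanity check (not the Cytnx optimizer itself), NumPy's `einsum_path` finds the cheap order for the same network; the einsum subscripts below are my own relabeling of the bonds 101–106 and the open legs 0–3:

```python
import numpy as np

chi = 4
# Bond relabeling for einsum: 101->a, 102->b, 103->c, 104->f, 105->d, 106->e;
# open legs 0,1,2,3 -> i,j,k,l.
Tu  = np.random.rand(chi, chi, chi, chi)  # legs 101,0,102,103 -> a,i,b,c
Td  = np.random.rand(chi, chi, chi, chi)  # legs 103,1,105,106 -> c,j,d,e
Tud = np.random.rand(chi, chi, chi, chi)  # legs 102,104,101,2 -> b,f,a,k
Tdd = np.random.rand(chi, chi, chi, chi)  # legs 105,106,104,3 -> d,e,f,l

expr = "aibc,cjde,bfak,defl->ijkl"
path, info = np.einsum_path(expr, Tu, Td, Tud, Tdd, optimize="optimal")
print(info)  # prints the chosen contraction path and its cost summary
```

The printed summary shows the optimal path pairs up the tensors that share two bonds first, matching the O(chi^6) order described above.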
Additionally, a kernel crash occurs when manually setting the contraction order via Network.setOrder(optimal=False, contract_order=((Tu,Tu.d),(Td,Td.d))).
I believe the dynamic-programming network optimizer has a bug; @j9263178 has encountered a similar issue. I propose replacing it with the algorithm used in Uni10, described in Phys. Rev. E 90, 033315 (2014) (see also https://joss.theoj.org/papers/10.21105/joss.00753).
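For a network this small, the optimal order can even be verified by brute force. This is a minimal sketch of my own (not the netcon algorithm from the PRE paper, and not Cytnx code), under the assumption that every leg has dimension chi and a pairwise contraction costs chi^(number of distinct legs involved):

```python
from itertools import combinations

chi = 10
# Each tensor is just its set of leg labels, matching the Network string.
tensors = {
    "Tu":   frozenset({101, 0, 102, 103}),
    "Td":   frozenset({103, 1, 105, 106}),
    "Tu.d": frozenset({102, 104, 101, 2}),
    "Td.d": frozenset({105, 106, 104, 3}),
}

def contract(a, b):
    """Cost model: chi^(#distinct legs touched); shared legs are summed over."""
    cost = chi ** len(a | b)
    return a ^ b, cost  # symmetric difference = open legs of the result

def best_order(items):
    """Exhaustive search over all pairwise contraction sequences.
    items: dict name -> legs. Returns (min total cost, contraction tree)."""
    if len(items) == 1:
        return 0, next(iter(items))
    best = None
    for x, y in combinations(items, 2):
        legs, cost = contract(items[x], items[y])
        rest = {k: v for k, v in items.items() if k not in (x, y)}
        rest[f"({x},{y})"] = legs
        sub_cost, tree = best_order(rest)
        total = cost + sub_cost
        if best is None or total < best[0]:
            best = (total, tree)
    return best

cost, tree = best_order(tensors)
print(tree, cost)  # the cheapest tree is ((Tu,Tu.d),(Td,Td.d)) at 3*chi^6
```

The exhaustive search confirms that ((Tu,Tu.d),(Td,Td.d)) with total cost 3*chi^6 beats the ((Tu,Td),(Tu.d,Td.d)) order returned by setOrder(optimal=True).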
The algorithm is ported from Uni10, though.
Uni10 implemented the algorithm in the PRE paper.
Should be fixed now.