chakra
Can't get rf_id from the Kineto trace, so trace_linker.py can't find the matching cpu_op.
chakra_device_trace_loader.py:
kineto_rf_id_to_kineto_op_map = {op.rf_id: op for op in kineto_cpu_ops if op.rf_id is not None}
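To illustrate why this line matters, here is a minimal sketch of what the comprehension produces when no op carries an rf_id. `FakeKinetoOp` is a hypothetical stand-in for the real operator class, used only for this example; the three op names are taken from the trace excerpt in this issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeKinetoOp:
    """Hypothetical stand-in for a Kineto operator (not the Chakra class)."""
    name: str
    rf_id: Optional[int] = None  # stays None when the trace has no rf_id


# Ops as they would look if the trace carries no rf_id at all.
kineto_cpu_ops = [
    FakeKinetoOp("MseLossBackward0"),
    FakeKinetoOp("aten::mse_loss_backward"),
    FakeKinetoOp("aten::zeros_like"),
]

# Same comprehension as in chakra_device_trace_loader.py.
kineto_rf_id_to_kineto_op_map = {
    op.rf_id: op for op in kineto_cpu_ops if op.rf_id is not None
}

# Every op is filtered out, so the map is empty and nothing can be looked up.
print(len(kineto_rf_id_to_kineto_op_map))
```

With an empty map, any later lookup by rf_id has nothing to match against, which is consistent with the linking failure described here.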
kineto.json:
{
"ph": "X", "cat": "cpu_op", "name": "MseLossBackward0", "pid": 1182, "tid": 1331,
"ts": 1726134043823422, "dur": 158,
"args": {
"External id": 2562,"Sequence number": 4583, "Fwd thread id": 1, "Ev Idx": 1
}
},
{
"ph": "X", "cat": "cpu_op", "name": "aten::mse_loss_backward", "pid": 1182, "tid": 1331,
"ts": 1726134043823444, "dur": 135,
"args": {
"External id": 2563,"Ev Idx": 2
}
},
{
"ph": "X", "cat": "cpu_op", "name": "aten::zeros_like", "pid": 1182, "tid": 1331,
"ts": 1726134043823452, "dur": 90,
"args": {
"External id": 2564,"Ev Idx": 3
**There is no rf_id entry, which means trace_linker.py can't connect TE.json and kineto.json.**
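A quick way to confirm this on a full trace is to count how many `cpu_op` events carry an rf_id at all. The sketch below is only a diagnostic under stated assumptions: the key name `"Record function id"` is an assumption about how the rf_id appears in the event `args` (adjust it if your trace uses a different name), and `summarize_rf_ids` and `load_events` are hypothetical helpers, not part of Chakra.

```python
import json

# Assumption: the rf_id is stored under this key in each event's "args".
RF_ID_KEY = "Record function id"


def load_events(path):
    """Load events from a Kineto JSON file (dict with "traceEvents" or a bare list)."""
    with open(path) as f:
        data = json.load(f)
    return data["traceEvents"] if isinstance(data, dict) else data


def summarize_rf_ids(events, key=RF_ID_KEY):
    """Return (number of cpu_op events, number of those that have the rf_id key)."""
    cpu_ops = [e for e in events if e.get("cat") == "cpu_op"]
    with_id = [e for e in cpu_ops if key in e.get("args", {})]
    return len(cpu_ops), len(with_id)


# The three events from this issue, reduced to the fields that matter here.
sample = [
    {"cat": "cpu_op", "name": "MseLossBackward0",
     "args": {"External id": 2562, "Sequence number": 4583,
              "Fwd thread id": 1, "Ev Idx": 1}},
    {"cat": "cpu_op", "name": "aten::mse_loss_backward",
     "args": {"External id": 2563, "Ev Idx": 2}},
    {"cat": "cpu_op", "name": "aten::zeros_like",
     "args": {"External id": 2564, "Ev Idx": 3}},
]

total, linked = summarize_rf_ids(sample)
print(f"{linked} of {total} cpu_op events have '{RF_ID_KEY}'")
```

To run it against a real file, replace `sample` with `load_events("kineto.json")`. If the count of events with an rf_id is zero, the trace was most likely collected without that field.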
Maybe the PyTorch version is the key factor.
@profvjreddi @TaekyungHeo @JoongunPark @sanrise
How did you collect the trace? Did you use a recent PyTorch version?
@32HD can you please try https://github.com/mlcommons/chakra/pull/190 ? It may be related.