
Can't get rf_id in kineto trace; trace_linker.py can't find the matching cpu_op

Open 32HD opened this issue 1 year ago • 3 comments

chakra_device_trace_loader.py:

kineto_rf_id_to_kineto_op_map = {op.rf_id: op for op in kineto_cpu_ops if op.rf_id is not None}
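
To illustrate why this line matters here, below is a minimal sketch of the same comprehension run over ops that have no rf_id. `FakeKinetoOp` is a hypothetical stand-in written for this example, not the actual Chakra class; the op names are taken from the trace in this issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeKinetoOp:
    """Hypothetical stand-in for a Kineto operator (not the Chakra class)."""
    name: str
    rf_id: Optional[int] = None


# Ops modelled on the events in this issue: none of them carries an rf_id.
kineto_cpu_ops = [
    FakeKinetoOp("MseLossBackward0"),
    FakeKinetoOp("aten::mse_loss_backward"),
    FakeKinetoOp("aten::zeros_like"),
]

# Same comprehension as above: ops whose rf_id is None are filtered out.
kineto_rf_id_to_kineto_op_map = {op.rf_id: op for op in kineto_cpu_ops if op.rf_id is not None}

print(len(kineto_rf_id_to_kineto_op_map))  # 0
```

With every rf_id missing, the map is empty, so any later lookup by rf_id has nothing to match against.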


kineto.json:

{
    "ph": "X", "cat": "cpu_op", "name": "MseLossBackward0", "pid": 1182, "tid": 1331,
    "ts": 1726134043823422, "dur": 158,
    "args": {
      "External id": 2562,"Sequence number": 4583, "Fwd thread id": 1, "Ev Idx": 1
    }
  },
  {
    "ph": "X", "cat": "cpu_op", "name": "aten::mse_loss_backward", "pid": 1182, "tid": 1331,
    "ts": 1726134043823444, "dur": 135,
    "args": {
      "External id": 2563,"Ev Idx": 2
    }
  },
  {
    "ph": "X", "cat": "cpu_op", "name": "aten::zeros_like", "pid": 1182, "tid": 1331,
    "ts": 1726134043823452, "dur": 90,
    "args": {
      "External id": 2564,"Ev Idx": 3

**There is no rf_id entry in any of these events, which means trace_linker.py can't link the ops in TE.json to those in kineto.json.**
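
A quick way to confirm this on a whole trace is to count how many cpu_op events carry the id at all. The sketch below is only an assumption-laden check: the key name `"Record function id"` is my guess at how the id shows up under `args` and should be verified against your Chakra and PyTorch versions, and `count_rf_ids` is a hypothetical helper, not part of Chakra. The sample data is the first two complete events from the trace above.

```python
import json

# Assumption: key under which the record-function id appears in "args".
RF_ID_KEY = "Record function id"


def count_rf_ids(events, key=RF_ID_KEY):
    """Return (number of cpu_op events, number of those with `key` in args)."""
    cpu_ops = [e for e in events if e.get("cat") == "cpu_op"]
    with_id = [e for e in cpu_ops if key in e.get("args", {})]
    return len(cpu_ops), len(with_id)


# The first two complete events copied from the trace in this issue.
sample = json.loads("""
[
  {"ph": "X", "cat": "cpu_op", "name": "MseLossBackward0", "pid": 1182, "tid": 1331,
   "ts": 1726134043823422, "dur": 158,
   "args": {"External id": 2562, "Sequence number": 4583, "Fwd thread id": 1, "Ev Idx": 1}},
  {"ph": "X", "cat": "cpu_op", "name": "aten::mse_loss_backward", "pid": 1182, "tid": 1331,
   "ts": 1726134043823444, "dur": 135,
   "args": {"External id": 2563, "Ev Idx": 2}}
]
""")

total, linked = count_rf_ids(sample)
print(f"{linked} of {total} cpu_op events have '{RF_ID_KEY}'")
```

For a full trace file, the event list would typically be loaded from the top-level `traceEvents` array of the JSON and passed to the same function.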

Maybe the PyTorch version is the cause.

32HD, Sep 13 '24 06:09

@profvjreddi @TaekyungHeo @JoongunPark @sanrise

32HD, Sep 13 '24 06:09

How did you collect the trace? Have you tried a recent PyTorch version?

JoongunPark, Nov 26 '24 21:11

@32HD can you please try https://github.com/mlcommons/chakra/pull/190 ? It may be related.

theodorbadea, Apr 03 '25 12:04