
GPU traces fail to load when using PyTorch Lightning due to square brackets in traceName

Open agkphysics opened this issue 11 months ago • 2 comments

PyTorch Lightning saves traces with names of the form fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json, so the traceName value contains a closing square bracket ]. Because traceName is the final key in the JSON file, the trace fails to load in the TensorBoard viewer when GPU ops are present. The cause is this line, which cuts off the string and produces invalid JSON: https://github.com/pytorch/kineto/blob/8466a8b111b36dc725e6855d52a0b133d925a8e0/tb_plugin/torch_tb_profiler/run.py#L137

This results in the error Uncaught (in promise) SyntaxError: JSON.parse, ... in the web browser.

agkphysics avatar Mar 08 '24 13:03 agkphysics
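
The failure described above can be reproduced in isolation. The following is a minimal sketch, not the plugin's actual code: it assumes the linked line truncates the raw trace at the last `]` it finds, on the expectation that this bracket closes the event array. The function name `append_events_naive` and the sample event are hypothetical.

```python
import json


def append_events_naive(raw: bytes, extra: bytes) -> bytes:
    """Hypothetical stand-in for the appending step.

    Assumption: the last ']' in the file is taken to be the end of the
    traceEvents array, so everything after it is discarded and the file
    is closed again with ']}'.
    """
    head = raw[: raw.rfind(b"]")]
    return b"".join([head, b", ", extra, b"]}"])


# A tiny trace whose final key is traceName, as in the issue.
trace = json.dumps(
    {
        "traceEvents": [{"name": "op"}],
        "traceName": "fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json",
    }
).encode("utf-8")

broken = append_events_naive(trace, b'{"name": "gpu_metric"}')

try:
    json.loads(broken)
    parsed_ok = True
except json.JSONDecodeError:
    # The cut lands inside the traceName string, so the result is not JSON.
    parsed_ok = False

print(broken.decode("utf-8"))
print("parsed_ok =", parsed_ok)
```

Because the last `]` in the file belongs to the trace name rather than to the event array, the cut falls in the middle of a string literal and the browser's JSON.parse rejects the result.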

@agkphysics Would this commit catch the issue? https://github.com/pytorch/kineto/commit/51fd6e6a1caccbbbf0e23e5cf2b68cc6afd79ad4

briancoutinho avatar Mar 08 '24 18:03 briancoutinho

@briancoutinho No, I don't think this would help, because that commit only seems to replace forward slashes and remove newlines from JSON strings. The issue is in how the GPU stats are appended to the file before it is passed to the trace viewer.

agkphysics avatar Mar 08 '24 21:03 agkphysics
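
One way to make the appending step independent of the characters in traceName is to parse the trace, extend the event list, and serialize it again. This is only a sketch under stated assumptions, not a fix adopted by the project: the helper name `append_events_parsed` is hypothetical, and it assumes the trace is a JSON object with a `traceEvents` array and is small enough to parse in memory.

```python
import json


def append_events_parsed(raw: bytes, extra_events: list) -> bytes:
    """Hypothetical alternative: append events by parsing, not slicing.

    Assumption: the trace is a JSON object holding a 'traceEvents' list.
    """
    doc = json.loads(raw)
    doc.setdefault("traceEvents", []).extend(extra_events)
    return json.dumps(doc).encode("utf-8")


trace = json.dumps(
    {
        "traceEvents": [{"name": "op"}],
        "traceName": "fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json",
    }
).encode("utf-8")

fixed = append_events_parsed(trace, [{"name": "gpu_metric"}])
result = json.loads(fixed)  # parses cleanly despite the brackets in traceName
print(len(result["traceEvents"]), result["traceName"])
```

The trade-off is cost: parsing and re-serializing a large trace is slower and uses more memory than byte slicing, which may be why a slicing approach was chosen in the first place.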