Remove unused `profile.json` fields?
I enjoyed your FOSDEM 24 talk very much, and I have been exploring the profile.json format a little bit out of curiosity.
I noticed a bunch of things which seem specific to Firefox's internal profiles and are not relevant when samply profiles any random program.
In particular, there is a ton of fields in the profile.json which I believe do not make any sense outside of Firefox, but they are completely filled in with their default values:
- `funcTable.isJS` and `funcTable.relevantForJs` are all `false` and seem irrelevant
- `frameTable.innerWindowID`, `frameTable.implementation` and `frameTable.optimizations` are all `null` and I doubt they are relevant?
- `frameTable.line`, `frameTable.column`, `funcTable.fileName`, `funcTable.lineNumber` and `funcTable.columnNumber` seem to be all `null`, at least all the ones I was looking at. Though filenames are being displayed in the flame graph and opening the source definition works as well, and I have no idea how that even works? :-D
- `frameTable.inlineDepth` seems all `0`, at least the ones I looked at.
Then there is also frameTable.category, frameTable.subcategory, stackTable.category and stackTable.subcategory. I wonder if the entries in stackTable are duplicated, as the stackTable has a reference to the frameTable which also has the same info, or can the actual values diverge?
In either case, samply only ever outputs a single category (for now, I wish it would be possible to customize this and will open a separate issue for it), so I wonder if this can be compressed a bit better.
Another fun fact here is that the main thread defaults to the name GeckoMain :-D
These are all written using this SerializableSingleValueColumn((), len) helper, which just writes len nulls. @mstange is this required for the front end? Can we just write an empty array here, and have the front end understand undefined? inlineDepth gets written as len count of 0, so maybe some code to treat undefined as 0 there.
These are taking up a significant amount of space in the JSON for long multi-process profiles. Even better would be to change the format to make these properties fully optional. For comparison, a ~1 minute capture with 8000 tracks (threads) is around 180MB with all the extra nulls. If I make SerializableSingleValueColumn output an empty array instead, it goes down to 135MB, which is a very significant size reduction of about 25%!
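To make the idea concrete, here is a rough Python sketch of what "write an empty array and let the reader fill in the default" could look like. It is only an illustration: the toy table, `make_frame_table` and `read_column` are hypothetical names of mine and are not samply or front-end code, and the real tables have many more columns.

```python
import json


def make_frame_table(length, omit_defaults):
    """Build a toy frameTable with one real column and three all-null ones.

    Hypothetical helper for illustration; not how samply builds its tables.
    """
    null_column = [] if omit_defaults else [None] * length
    return {
        "length": length,
        "address": list(range(length)),
        "innerWindowID": list(null_column),
        "implementation": list(null_column),
        "line": list(null_column),
    }


def read_column(table, name, default=None):
    """Return a full-length column, padding a short or missing one with `default`."""
    values = table.get(name) or []
    return values + [default] * (table["length"] - len(values))


# Serialize both variants compactly and compare their sizes.
full = json.dumps(make_frame_table(1000, omit_defaults=False), separators=(",", ":"))
compact = json.dumps(make_frame_table(1000, omit_defaults=True), separators=(",", ":"))
saving = 1 - len(compact) / len(full)
print(f"{len(full)} bytes -> {len(compact)} bytes ({saving:.0%} smaller)")
```

A reader written like `read_column` would also cover the `inlineDepth` case by passing `default=0`, and it keeps working on profiles that still contain the fully written columns.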
> In particular, there is a ton of fields in the `profile.json` which I believe do not make any sense outside of firefox, but they are completely filled in with their default values:
>
> - `funcTable.isJS` and `funcTable.relevantForJs` are all `false` and seem irrelevant
These are now seeing some use for profiles with JIT frames. Having two columns is a bit overkill though; they could be combined into a single flags column.
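As a sketch of what a combined column might look like (the bit assignments and function names below are made up for illustration and are not part of the profile format):

```python
# Hypothetical bit assignments; the real format does not define these.
IS_JS = 1 << 0
RELEVANT_FOR_JS = 1 << 1


def pack_flags(is_js, relevant_for_js):
    """Merge two boolean columns into one integer flags column."""
    return [
        (IS_JS if js else 0) | (RELEVANT_FOR_JS if relevant else 0)
        for js, relevant in zip(is_js, relevant_for_js)
    ]


def unpack_flags(flags):
    """Recover the two boolean columns from a flags column."""
    is_js = [bool(f & IS_JS) for f in flags]
    relevant_for_js = [bool(f & RELEVANT_FOR_JS) for f in flags]
    return is_js, relevant_for_js


flags = pack_flags([False, True, False], [False, True, True])
print(flags, unpack_flags(flags))
```

For a profile without any JIT frames the flags column would be all zeros, so it could be treated like the other default-only columns.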
> - `frameTable.innerWindowID`, `frameTable.implementation` and `frameTable.optimizations` are all `null` and I doubt they are relevant?
Wait, optimizations is still there? It's no longer needed as of format version 45. So yes we should stop outputting that one immediately.
The removal of implementation is tracked in https://github.com/firefox-devtools/profiler/issues/3713 .
innerWindowID indeed is completely useless for us and could be made optional. I've filed https://github.com/firefox-devtools/profiler/issues/5006 about it.
> - `frameTable.line`, `frameTable.column`, `funcTable.fileName`, `funcTable.lineNumber` and `funcTable.columnNumber` seem to be all `null`, at least all the ones I was looking at. Though filenames are being displayed in the flame graph and opening the source definition works as well, and I have no idea how that even works? :-D
Yes they're null before symbolication and non-null after symbolication, and samply currently only emits unsymbolicated profiles. The filenames are displayed in the flamegraph because the front-end requests symbolication information asynchronously through an extra network request. Once we have a presymbolicate mode, these columns will be non-null in that mode even before the front-end consumes the JSON.
> - `frameTable.inlineDepth` seems all `0`, at least the ones I looked at.
Same as above - it is often non-zero after symbolication.
> Then there is also `frameTable.category`, `frameTable.subcategory`, `stackTable.category` and `stackTable.subcategory`. I wonder if the entries in `stackTable` are duplicated, as the `stackTable` has a reference to the `frameTable` which also has the same info, or can the actual values diverge?
The values in the stackTable are somewhat duplicated, yes. The reason for this is that the frameTable can have null categories or subcategories - they're always non-null in profiles generated by fxprof-processed-profile, but the format allows null. And then the stackTable always has non-null categories, by inheriting the nearest non-null category from a caller.
In profiles generated by Firefox, all C++ frames have a null category in the frameTable, and the actual categories like Graphics / DOM / JS are supplied by synthetic label frames.
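The inheritance rule can be sketched roughly like this in Python. This is a simplified illustration and not the front end's actual code: it assumes every stack's prefix (its caller) appears at a lower index than the stack itself, and `default_category` is a hypothetical fallback for root stacks whose frame has no category.

```python
def compute_stack_categories(frame_category, stack_frame, stack_prefix, default_category):
    """Derive a non-null category for every stack.

    A stack takes its frame's category if that is set; otherwise it inherits
    the already-resolved category of its caller (its prefix stack).
    Assumes prefix stacks come before the stacks that refer to them.
    """
    stack_category = []
    for frame, prefix in zip(stack_frame, stack_prefix):
        category = frame_category[frame]
        if category is None:
            category = default_category if prefix is None else stack_category[prefix]
        stack_category.append(category)
    return stack_category


# A single call chain: frame 0 (category 1) -> frame 1 (null) -> frame 2 (category 2) -> frame 3 (null)
categories = compute_stack_categories(
    frame_category=[1, None, 2, None],
    stack_frame=[0, 1, 2, 3],
    stack_prefix=[None, 0, 1, 2],
    default_category=0,
)
print(categories)
```

Here the two frames with a null category end up with the category of the nearest caller that has one, which is why the `stackTable` values can differ from the `frameTable` values.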
> In either case, `samply` only ever outputs a single category (for now, I wish it would be possible to customize this and will open a separate issue for it), so I wonder if this can be compressed a bit better.
It outputs different categories for User and Kernel on Linux and Windows now. And JIT frames have a different category on all platforms.
> Another fun fact here is that the main thread defaults to the name `GeckoMain` :-D
This was fixed in e74e5145c27fbbe672cd2d44e78bc9270210c466 - it had already been fixed at the time this issue was filed, the fix just hadn't made it into a release.
> > `frameTable.innerWindowID`, `frameTable.implementation` and `frameTable.optimizations` are all `null` and I doubt they are relevant?
>
> Wait, `optimizations` is still there? It's no longer needed as of format version 45. So yes we should stop outputting that one immediately.
I've made this change in #232.