Strip secrets from .binlog

Open KirillOsenkov opened this issue 7 years ago • 4 comments

KirillOsenkov avatar Aug 27 '18 18:08 KirillOsenkov

We are also interested in whether this is feasible, so that we can start sharing large, high-quality build logs (3,000+ projects) to help troubleshoot performance issues when viewing such logs. Data of this volume and quality is difficult to create synthetically.

To keep this from being just a "+1" post, here are several related issues (I almost opened a new one):

  • https://github.com/KirillOsenkov/MSBuildStructuredLog/issues/175 - asks for a similar ability, specifically for environment variables
  • https://github.com/microsoft/msbuild/issues/3432 - the MSBuild-side request for the above

I have started a checklist of areas of concern:

  • [ ] Environment Variables
  • [ ] Paths / File Names - (but we need to preserve the "shape" in case it's a long-path issue)
  • [ ] Messages - This is probably an impossible ask?
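To make the "preserve shape" point for paths concrete, here is a minimal sketch in Python. It is only an illustration, not part of MSBuildStructuredLog: `mask_segment`, `mask_path` and the `salt` parameter are hypothetical names. It assumes it is enough to keep every separator (`\`, `/`, `:`, `.`) in place and replace each run of other characters with the same number of hash-derived characters, so the total length and directory depth stay the same.

```python
import hashlib
import re


def mask_segment(segment: str, salt: str) -> str:
    """Return a deterministic hex string with the same length as `segment`."""
    digest = hashlib.sha256((salt + segment).encode("utf-8")).hexdigest()
    # Repeat the digest so that segments longer than 64 characters are covered.
    repeated = digest * (len(segment) // len(digest) + 1)
    return repeated[: len(segment)]


def mask_path(path: str, salt: str = "example-salt") -> str:
    """Mask every run of non-separator characters, keeping separators in place.

    Separators (backslash, slash, colon, dot) are left untouched, so the
    masked path has the same length and the same depth as the original.
    """
    return re.sub(r"[^\\/:.]+", lambda m: mask_segment(m.group(0), salt), path)


if __name__ == "__main__":
    original = r"C:\Users\SecretUser\src\App\App.csproj"
    masked = mask_path(original)
    print(masked)
    print(len(original) == len(masked))
```

Because the same segment always maps to the same token, repeated directories still line up across records. Very short segments (a drive letter, a two-character folder) are easy to guess back, so a real tool would probably need something stronger than this.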

aolszowka avatar Jan 03 '20 22:01 aolszowka

I think the most feasible approach for now is rewriting .binlogs via a command-line tool. The API to read and write binlogs is very easy to use; here's an example: http://msbuildlog.com/#api

I would imagine that you would write a tool that reads a binlog, iterates over all records, rewrites each record (or drops it entirely) and writes a stripped binlog. You could probably have an extensible set of rewriting rules.
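A rough sketch of what an extensible set of rewriting rules could look like, written in Python purely for illustration. Nothing here is the real binlog API: records are modeled as plain dictionaries, and `drop_kind`, `redact_text` and `rewrite` are hypothetical names. The assumption is that each rule takes a record and returns either a (possibly modified) record or `None` to drop it.

```python
from typing import Callable, Iterable, Iterator, Optional

Record = dict  # stand-in for a binlog record; the real type would differ
Rule = Callable[[Record], Optional[Record]]


def drop_kind(kind: str) -> Rule:
    """Rule that drops every record of the given (hypothetical) kind."""
    def rule(record: Record) -> Optional[Record]:
        return None if record.get("kind") == kind else record
    return rule


def redact_text(secret: str, placeholder: str = "***") -> Rule:
    """Rule that replaces a known secret inside the record's text."""
    def rule(record: Record) -> Optional[Record]:
        text = record.get("text")
        if isinstance(text, str) and secret in text:
            # Return a copy so the input record is left unmodified.
            return {**record, "text": text.replace(secret, placeholder)}
        return record
    return rule


def rewrite(records: Iterable[Record], rules: Iterable[Rule]) -> Iterator[Record]:
    """Apply the rules in order to each record; skip records a rule drops."""
    rules = list(rules)
    for record in records:
        for rule in rules:
            record = rule(record)
            if record is None:
                break  # dropped; do not run the remaining rules
        else:
            yield record


if __name__ == "__main__":
    log = [
        {"kind": "message", "text": "token=abc123"},
        {"kind": "environment", "text": "PATH=..."},
    ]
    for rec in rewrite(log, [drop_kind("environment"), redact_text("abc123")]):
        print(rec)
```

Keeping each rule as a small function means new rules can be added without touching the loop, which is roughly what "extensible" would mean here.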

For environment variables you could probably write an analyzer: for each variable, check if it is being used elsewhere in the build, and if not, delete it.
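As a sketch of that analyzer idea (in Python, for illustration only): it assumes "used elsewhere in the build" can be approximated by looking for `$(NAME)` or `$(NAME.Something)` references in whatever text the log contains, and that names compare case-insensitively as MSBuild properties do. The function names and the shape of the inputs are hypothetical.

```python
import re
from typing import Dict, Iterable, Set

# Matches the name at the start of a $(...) expression, e.g. $(TEMP) or $(Foo.Length).
_REFERENCE = re.compile(r"\$\(\s*([A-Za-z_][A-Za-z0-9_-]*)")


def used_variable_names(texts: Iterable[str]) -> Set[str]:
    """Collect lower-cased names referenced as $(NAME...) in the given texts."""
    used: Set[str] = set()
    for text in texts:
        for match in _REFERENCE.finditer(text):
            used.add(match.group(1).lower())
    return used


def strip_unused_environment(env: Dict[str, str], texts: Iterable[str]) -> Dict[str, str]:
    """Keep only the environment variables that are referenced somewhere."""
    used = used_variable_names(texts)
    return {name: value for name, value in env.items() if name.lower() in used}


if __name__ == "__main__":
    env = {"API_KEY": "s3cr3t", "Configuration": "Release", "TEMP": "/tmp"}
    texts = ["<OutDir>$(TEMP)/out/$(configuration)</OutDir>"]
    print(strip_unused_environment(env, texts))
```

This heuristic misses variables read in other ways (for example by a task or an external tool reading the process environment directly), and a kept variable can still hold a secret, so the result would need a manual review before sharing.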

KirillOsenkov avatar Jan 03 '20 23:01 KirillOsenkov

@KirillOsenkov If you can save all targets/props (and environment) content to disk from a binlog, I wonder if it would also be possible to "re-load" modified content back into the binlog.

That way you could use a regular offline text search-and-replace to strip sensitive content and prepare a binlog for analysis by someone else.

japj avatar Dec 20 '20 19:12 japj

I guess a binlog rewriter is simpler than that:

  1. read all records using https://github.com/KirillOsenkov/MSBuildStructuredLog/blob/901b4d84ff0b2d34e8711c0094fe12780b7a7dde/src/StructuredLogger/BinaryLogger/BinLogReader.cs#L33
  2. create a new instance of BinaryLogger: https://github.com/KirillOsenkov/MSBuildStructuredLog/blob/master/src/StructuredLogger/BinaryLogger/BinaryLogger.cs
  3. stream all event args from BinLogReader to BinaryLogger
  4. transform each event args arbitrarily on the way
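The four steps above have the shape of a streaming copy: events are read one at a time, transformed, and written out immediately, so memory use does not grow with the size of the log. A minimal Python sketch of that shape follows. It does not use BinLogReader or BinaryLogger; a JSON-lines stream stands in for the binlog format, and `read_events`, `copy_events` and the `transform` callback are hypothetical names.

```python
import io
import json
from typing import Callable, Iterator, Optional, TextIO

Event = dict  # stand-in for build event args


def read_events(source: TextIO) -> Iterator[Event]:
    """Lazily yield one event per non-empty line (stand-in for the reader)."""
    for line in source:
        if line.strip():
            yield json.loads(line)


def copy_events(
    source: TextIO,
    sink: TextIO,
    transform: Callable[[Event], Optional[Event]],
) -> int:
    """Stream events from source to sink, transforming each one on the way.

    A transform that returns None drops the event. Returns the number of
    events written.
    """
    written = 0
    for event in read_events(source):
        event = transform(event)
        if event is None:
            continue
        sink.write(json.dumps(event) + "\n")
        written += 1
    return written


if __name__ == "__main__":
    source = io.StringIO('{"message": "user=alice"}\n{"message": "ok"}\n')
    sink = io.StringIO()

    def mask_user(event: Event) -> Optional[Event]:
        return {**event, "message": event["message"].replace("alice", "***")}

    print(copy_events(source, sink, mask_user))
    print(sink.getvalue(), end="")
```

With the real classes, the reader would raise events and the logger would consume them, but the control flow (read, transform or drop, write) should look much the same.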

KirillOsenkov avatar Dec 21 '20 00:12 KirillOsenkov

A prototype was implemented by @JanKrivanek: https://github.com/KirillOsenkov/MSBuildStructuredLog/pull/711

KirillOsenkov avatar Nov 09 '23 01:11 KirillOsenkov

Let’s close this issue and file new ones for any remaining work.

KirillOsenkov avatar Nov 18 '23 04:11 KirillOsenkov