msbuild icon indicating copy to clipboard operation
msbuild copied to clipboard

Perf: Frozen environment variable caching

Open ccastanedaucf opened this issue 5 months ago • 0 comments
trafficstars

Fixes

CPU + allocations + memory usage by repeated construction of environment variables by returning a cached FrozenDictionary instance when no changes exist.

These are also currently held onto by ConfigCache -> BuildRequestConfiguration, so these will actually stay in the working set for the entire build if not de-duped.

Before:

image

image

After:

image

image

Context

On .NET Core, Environment.GetEnvironmentVariables() internally creates a new Hashtable instance and string allocations for each key-value pair. Since the rest of MSBuild expects a typed Dictionary<string, string> instance, we need to allocate yet another dictionary and copy all the results.

IDictionary vars = Environment.GetEnvironmentVariables();
Dictionary<string, string> table = new Dictionary<string, string>(vars.Count, StringComparer.OrdinalIgnoreCase);
foreach (var key in vars.Keys)
{
    // copy
}

On .NET Framework, we use the native Win32 APIs so we directly parse into a Dictionary instance, but we still end up allocating a new instance on every call and a bunch of additional strings.

Dictionary<string, string> table = new(200, StringComparer.OrdinalIgnoreCase);
for (int i = 0; i < stringBlockLength; i++)
{
    string key = new string(pEnvironmentBlock, startKey, i - startKey);
    string value = new string(pEnvironmentBlock, startValue, i - startValue);
}

The environment set only changes a handful of times throughout a build, so in reality we're just creating the same set over and over again.

Given that these are then handed to long lived objects, we also just end up with a ton of duplicated strings in memory. Here's a time slice of OrchardCode taken at the same point of the build graph, where ~100k strings less are being held on to in memory after this:

image

image)

Changes Made

This introduces a container for caching the previous environment state. If we read the environment variables and the count, keys, and values are an exact match, we skip the allocation altogether and return a FrozenDictionary instance.

private static EnvironmentState s_environmentState;

private sealed record class EnvironmentState(FrozenDictionary<string, string> EnvironmentVariables, string EnvironmentBlock = null);

This also changes .NET Core on Windows to use the native codepath. In this case, we can use the entire environment block string as our cache comparison, and utilize StringTools to avoid unnecessary extra string allocations if we need to build a new set.

ReadOnlySpan<char> stringBlock = new(pEnvironmentBlock, (int)stringBlockLength);
EnvironmentState lastState = s_environmentState;
if (lastState?.EnvironmentBlock.AsSpan().SequenceEqual(stringBlock) == true)
{
    return lastState.EnvironmentVariables;
}
string key = Strings.WeakIntern(new ReadOnlySpan<char>(pEnvironmentBlock + startKey, i - startKey));
string value = Strings.WeakIntern(new ReadOnlySpan<char>(pEnvironmentBlock + startValue, i - startValue));

On Unix, we still need to call Environment.GetEnvironmentVariables(), but we can still avoid the extra dictionary allocation and duplicated memory by doing a set comparison.

The TaskHost path is ifdef-ed to behave as before, due to the absence of System.Collections.Frozen.

Notes

My one main concern was whether the dictionary is exposed in some public API that expects a mutable dictionary. As far as I can tell, that isn't the case since the only place we expose it (in BuildParameters.BuildProcessEnvironment) already returns a ReadOnlyDictionary, and the rest of our uses are internal and already don't mutate

ccastanedaucf avatar Jun 13 '25 20:06 ccastanedaucf