Remove boxing allocations in Record writer
Describe the solution you'd like
While profiling the application for memory footprint, I noticed a lot of boxing allocations.
Note that the highlighted Guid.ToString() allocations will be removed once #133 is merged. As highlighted, there are many allocations for value types (Guid, Boolean, Int64, Int32). I think this is due to boxing that takes place in the dynamically compiled lambda GenerateGetValue. In a nutshell, it does the following:
static object GetValue(object instance, string memberName)
{
    var nameHash = memberName.GetHashCode();
    switch (nameHash)
    {
        case 123:
            return (object) ((User) instance).Id;
        case 456:
            return (object) ((User) instance).IsActive;
        // the rest of the properties
    }
}
So every value type returned as object is boxed, which is what the profiler shows. The code used for profiling:
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello World!");
        Fixture fixture = new Fixture();
        var data = fixture
            .Build<User>()
            .With(u => u.Offerings, fixture.CreateMany<Offering>(50).ToList)
            .CreateMany(1000).ToArray();
        var serialized = AvroConvert.Serialize(data);
        Console.WriteLine($"Serialized {serialized.Length}");
        Console.ReadLine();
    }
}
The User, Contact, and Offering classes do indeed have int, long, bool, and Guid properties.
It would be good to remove these boxing allocations and reduce the memory footprint of AvroConvert.
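To make the boxing cost concrete, here is a minimal sketch that is independent of AvroConvert. The class and method names (BoxingSketch, GetBoxed, GetTyped) are hypothetical and exist only for illustration. Returning an int as object allocates a new heap object on every call, which is the pattern the profiler flags.

```csharp
using System;

public static class BoxingSketch
{
    // Same shape as GetValue: the int is converted to object,
    // so the runtime allocates a new box on the heap for each call.
    public static object GetBoxed(int value) => (object)value;

    // Typed alternative: the value is returned as an int and no box is allocated.
    public static int GetTyped(int value) => value;
}
```

Calling BoxingSketch.GetBoxed(42) twice yields two distinct objects that compare equal by value but not by reference, whereas GetTyped never touches the heap.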
Describe implementation idea
The idea is to use a dynamically compiled lambda instead, one that writes each property value using the corresponding IWriter method.
The WriteRecordFields method would become:
private static void WriteRecordFields(object recordObj, WriteStep[] writers, IWriter encoder)
{
    if (recordObj is null)
    {
        encoder.WriteNull();
        return;
    }
    if (recordObj is ExpandoObject expando)
    {
        HandleExpando(writers, encoder, expando);
        return;
    }
    var type = recordObj.GetType();
    var lazyWriters = writersDictionary.GetOrAdd(type, Factory, writers);
    Action<object, IWriter> recordWriter = lazyWriters.Value;
    recordWriter.Invoke(recordObj, encoder);
}

private static Func<Type, WriteStep[], Lazy<Action<object, IWriter>>> Factory =>
    (type, writeSteps) => new Lazy<Action<object, IWriter>>(() => GetRecordWriter(type, writeSteps),
        LazyThreadSafetyMode.ExecutionAndPublication);
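The per-type caching used above can be sketched in isolation. This is only an illustration under assumptions: LazyCacheSketch, GetOrBuild, and BuildCount are hypothetical names, and a trivial Func<object, string> stands in for the compiled record writer. Wrapping the cached value in Lazy<T> with ExecutionAndPublication means the expensive build step runs once per type, even if several threads race on GetOrAdd.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class LazyCacheSketch
{
    // One lazily built delegate per runtime type.
    private static readonly ConcurrentDictionary<Type, Lazy<Func<object, string>>> Cache =
        new ConcurrentDictionary<Type, Lazy<Func<object, string>>>();

    // Counts how many times the (stand-in) expensive build step actually ran.
    public static int BuildCount;

    public static Func<object, string> GetOrBuild(Type type)
    {
        var lazy = Cache.GetOrAdd(type, t => new Lazy<Func<object, string>>(
            () =>
            {
                // Stand-in for compiling an expression tree.
                Interlocked.Increment(ref BuildCount);
                return value => t.Name + ":" + value;
            },
            LazyThreadSafetyMode.ExecutionAndPublication));

        // GetOrAdd may create several Lazy wrappers under contention,
        // but only the stored one is returned and ever evaluated.
        return lazy.Value;
    }
}
```

Requesting the delegate for the same type twice returns the same instance and leaves BuildCount at 1.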
Here Action<object, IWriter> is a compiled lambda that avoids the boxing allocations and corresponds roughly to:
static void WriteValues(object instance, IWriter encoder)
{
    var actualInstance = (User) instance;
    encoder.WriteInt(actualInstance.Id);
    encoder.WriteBoolean(actualInstance.IsActive);
    // rest of the properties
}
This could be achieved with an expression builder like the one below. (I did not handle every possible property type; I only made sure it covers the properties of the User, Contact, and Offering models.)
private static Action<object, IWriter> GetRecordWriter(Type type, WriteStep[] writeSteps)
{
    var namePropertyInfoMap =
        type.GetProperties(BindingFlags.Instance | BindingFlags.Public | BindingFlags.IgnoreCase |
                           BindingFlags.FlattenHierarchy)
            .ToDictionary(pi => pi.Name, pi => pi);
    var instance = Expression.Parameter(typeof(object), "instance");
    var writer = Expression.Parameter(typeof(IWriter), "writer");
    var actualInstance = Expression.Variable(type, "actualInstance");
    var expressions = new List<Expression>
    {
        Expression.Assign(actualInstance, Expression.Convert(instance, type))
    };
    for (var index = 0; index < writeSteps.Length; index++)
    {
        var writeStep = writeSteps[index];
        if (namePropertyInfoMap.TryGetValue(writeStep.FieldName, out var propInfo))
        {
            var propertyAccess = Expression.Property(actualInstance, propInfo);
            if (propInfo.PropertyType.IsValueType)
            {
                var methodCallExpression = GetMethodCall(propInfo.PropertyType, writer, propertyAccess);
                expressions.Add(methodCallExpression);
            }
            else
            {
                // Convert the property value to object, as WriteItem expects an object as the first parameter.
                var convertedPropertyAccess = Expression.Convert(propertyAccess, typeof(object));
                // Create the delegate invocation expression.
                var writeFieldDelegate = Expression.Constant(writeStep.WriteField, typeof(WriteItem));
                var delegateInvokeExpression = Expression.Invoke(writeFieldDelegate, convertedPropertyAccess, writer);
                expressions.Add(delegateInvokeExpression);
            }
        }
    }
    var block = Expression.Block(new[] { actualInstance }, expressions);
    return Expression.Lambda<Action<object, IWriter>>(block, instance, writer).Compile();

    static MethodCallExpression GetMethodCall(Type primitiveType, ParameterExpression writer, Expression propertyAccess)
    {
        if (primitiveType == typeof(int))
            return Expression.Call(writer, nameof(IWriter.WriteInt), Type.EmptyTypes, propertyAccess);
        if (primitiveType == typeof(Guid))
            return Expression.Call(writer, nameof(IWriter.WriteGuid), Type.EmptyTypes, propertyAccess);
        if (primitiveType == typeof(bool))
            return Expression.Call(writer, nameof(IWriter.WriteBoolean), Type.EmptyTypes, propertyAccess);
        if (primitiveType == typeof(long))
            return Expression.Call(writer, nameof(IWriter.WriteLong), Type.EmptyTypes, propertyAccess);
        if (primitiveType.IsEnum)
        {
            var enumAsInt = Expression.Convert(propertyAccess, typeof(int));
            return Expression.Call(writer, nameof(IWriter.WriteInt), Type.EmptyTypes, enumAsInt);
        }
        throw new NotImplementedException();
    }
}
Running the same profiling code with this approach eliminates the boxing allocations.
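For anyone who wants to try the technique outside the library, here is a reduced, self-contained sketch. Everything in it is a stand-in: ISketchWriter, RecordingWriter, SketchUser, and RecordWriterSketch are hypothetical names, and only int and bool properties are handled. It is not the AvroConvert implementation, just the same idea: compile one typed delegate per record type that calls the matching writer method directly, so no property value is converted to object.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical stand-in for the library's IWriter; only two methods are sketched.
public interface ISketchWriter
{
    void WriteInt(int value);
    void WriteBoolean(bool value);
}

// Records each call so the output of the compiled delegate can be inspected.
public sealed class RecordingWriter : ISketchWriter
{
    public List<string> Log { get; } = new List<string>();
    public void WriteInt(int value) => Log.Add("int:" + value);
    public void WriteBoolean(bool value) => Log.Add("bool:" + value);
}

// Hypothetical model with two value-type properties.
public sealed class SketchUser
{
    public int Id { get; set; }
    public bool IsActive { get; set; }
}

public static class RecordWriterSketch
{
    // fieldNames plays the role of the write steps: it fixes the order of the writes.
    public static Action<object, ISketchWriter> Build(Type type, string[] fieldNames)
    {
        var instance = Expression.Parameter(typeof(object), "instance");
        var writer = Expression.Parameter(typeof(ISketchWriter), "writer");
        var typed = Expression.Variable(type, "typed");

        // typed = (T) instance; a reference cast, so no boxing is involved.
        var body = new List<Expression>
        {
            Expression.Assign(typed, Expression.Convert(instance, type))
        };

        foreach (var name in fieldNames)
        {
            PropertyInfo property = type.GetProperty(name, BindingFlags.Instance | BindingFlags.Public);
            if (property == null) continue; // unknown fields are skipped in this sketch

            string methodName;
            if (property.PropertyType == typeof(int)) methodName = nameof(ISketchWriter.WriteInt);
            else if (property.PropertyType == typeof(bool)) methodName = nameof(ISketchWriter.WriteBoolean);
            else continue; // other property types are out of scope here

            // writer.WriteXxx(typed.Property): the value is passed with its own type.
            MethodInfo method = typeof(ISketchWriter).GetMethod(methodName);
            body.Add(Expression.Call(writer, method, Expression.Property(typed, property)));
        }

        var block = Expression.Block(new[] { typed }, body);
        return Expression.Lambda<Action<object, ISketchWriter>>(block, instance, writer).Compile();
    }
}
```

The delegate returned by RecordWriterSketch.Build(typeof(SketchUser), new[] { "Id", "IsActive" }) can be built once and then invoked for every SketchUser instance with any ISketchWriter, which is what makes caching it per type worthwhile.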
Additional Context
I appreciate that this is quite a refactoring and that it changes the core of the library; nevertheless, I wanted to share my findings.
That's a brilliant idea, Manvel, and actually one of the directions I was investigating some time ago.
I was trying to build the whole expression tree based on the resolvers' results and have it invoked at the very end of the read/write operation.
What is needed, in your opinion, to get to that point?
Ideally, your GetMethodCall() method would be replaced with a regular ResolveWriter() invocation, with the current paths still covered. Could you maybe create a branch with your code that we could work on together?
@AdrianStrugala I pushed the changes described in the issue to the perf/improve-record-resolver branch. As for the expression tree approach to resolving writers for objects, let me get back to you after I have spent some time thinking about it.