FastExpressionCompiler
Execution slow-down, or incorrect use?
Hi,
In the compiled delegates below (adding two ints), execution with FastExpressionCompiler is consistently slower than with standard BCL expressions: 16% slower when built from ParameterExpressionInfo, and 65% slower when CompileFast is applied to a regular expression. Any idea why, or am I using it incorrectly?
Thanks, Kristian
ETSumBCL: 164 ms
ETSumCompileFastInfo: 191 ms (+16%)
ETSumCompileFast: 271 ms (+65%)
// using Fast = FastExpressionCompiler;
// using FastPEI = FastExpressionCompiler.ParameterExpressionInfo;
Time("Expression", ETSum());
Time("Expression Compile Fast Info", ETSumCompileFastInfo());
Time("Expression Compile Fast", ETSumCompileFast());
Func<int, int, int> ETSum()
{
    var aExp = Expression.Parameter(typeof(int), "a");
    var bExp = Expression.Parameter(typeof(int), "b");
    return Expression.Lambda<Func<int, int, int>>(
        Expression.Add(aExp, bExp), aExp, bExp).Compile();
}
Func<int, int, int> ETSumCompileFastInfo()
{
    var aExp = FastPEI.Parameter(typeof(int), "a");
    var bExp = FastPEI.Parameter(typeof(int), "b");
    return Fast.ExpressionCompiler.CompileFast<Func<int, int, int>>(
        FastPEI.Lambda(Expression.Add(aExp, bExp), aExp, bExp)
    );
}
Func<int, int, int> ETSumCompileFast()
{
    var aExp = Expression.Parameter(typeof(int), "a");
    var bExp = Expression.Parameter(typeof(int), "b");
    return Fast.ExpressionCompiler.CompileFast<Func<int, int, int>>(
        Expression.Lambda(Expression.Add(aExp, bExp), aExp, bExp)
    );
}
private static void Time(String name, Func<Int32, Int32, Int32> fn)
{
    int result = 0;
    Stopwatch sw = new Stopwatch();
    sw.Start();
    for (Int32 index = 0; index <= 100000000; index++)
    {
        result += fn(index, 1);
    }
    sw.Stop();
    // The zero-length Substring keeps `result` in use without printing it.
    Console.WriteLine($"{name}: {sw.ElapsedMilliseconds} ms{result.ToString().Substring(0, 0)}");
}
Can you run the same benchmark with BenchmarkDotNet?
It provides a much more robust evaluation, with statistics across multiple platforms and JITs.
I am using it here: https://github.com/dadhi/FastExpressionCompiler/tree/master/test/FastExpressionCompiler.Benchmarks
I have checked with BDN and did not find any issue. I've also added a case with ExpressionInfo, which is the fastest (see the Scaled column). ExpressionInfo.Add was not available, so I've added it too. The results are below. ExpressionInfo.Add will be available in the next version:
The code for the benchmark:
using System;
using System.Linq.Expressions;
using BenchmarkDotNet.Attributes;
namespace FastExpressionCompiler.Benchmarks
{
    [MemoryDiagnoser]
    public class SimpleExpr_ParamPlusParam
    {
        private static Expression<Func<int, int, int>> CreateSumExpr()
        {
            var aExp = Expression.Parameter(typeof(int), "a");
            var bExp = Expression.Parameter(typeof(int), "b");
            return Expression.Lambda<Func<int, int, int>>(Expression.Add(aExp, bExp), aExp, bExp);
        }
        private static ExpressionInfo<Func<int, int, int>> CreateSumExprInfo()
        {
            var aExp = ExpressionInfo.Parameter(typeof(int), "a");
            var bExp = ExpressionInfo.Parameter(typeof(int), "b");
            return ExpressionInfo.Lambda<Func<int, int, int>>(
                ExpressionInfo.Add(aExp, bExp), aExp, bExp);
        }
        private static Expression<Func<int, int, int>> SumExpr = CreateSumExpr();
        private static ExpressionInfo<Func<int, int, int>> SumExprInfo = CreateSumExprInfo();
        [Benchmark]
        public object Expression_Compile() => SumExpr.Compile();
        [Benchmark]
        public object Expression_CompileFast() => SumExpr.CompileFast();
        [Benchmark(Baseline = true)]
        public object ExpressionInfo_CompileFast() => SumExprInfo.CompileFast();
    }
}
Oops, sorry. For some reason I misunderstood it as measuring compile time rather than execution time. Will re-check tomorrow.
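For reference, an execution-time variant of the benchmark could look roughly like the sketch below. It is only an illustration, not the benchmark from the repository: the class name `SumExecutionBenchmark`, the method names, and the `_a`/`_b` fields are made up here, and it assumes the BenchmarkDotNet and FastExpressionCompiler packages are referenced. The delegates are compiled once in the setup step, so only the invocation is measured.

```csharp
using System;
using System.Linq.Expressions;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using FastExpressionCompiler;

// Hypothetical benchmark class: measures invoking the delegates, not compiling them.
public class SumExecutionBenchmark
{
    private Func<int, int, int> _compiled;
    private Func<int, int, int> _compiledFast;

    // Non-readonly fields, so the arguments are not compile-time constants.
    private int _a = 1, _b = 2;

    [GlobalSetup]
    public void Setup()
    {
        var a = Expression.Parameter(typeof(int), "a");
        var b = Expression.Parameter(typeof(int), "b");
        var sum = Expression.Lambda<Func<int, int, int>>(Expression.Add(a, b), a, b);

        _compiled = sum.Compile();         // BCL compiler
        _compiledFast = sum.CompileFast(); // FastExpressionCompiler extension method
    }

    [Benchmark(Baseline = true)]
    public int Invoke_Compile() => _compiled(_a, _b);

    [Benchmark]
    public int Invoke_CompileFast() => _compiledFast(_a, _b);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<SumExecutionBenchmark>();
}
```

Returning the sum from each benchmark method lets BenchmarkDotNet consume the result, which plays the same role as the `result` accumulator in the Stopwatch loop from the original report.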
(NOTE: the benchmarks are done on .NET Core 2, because BDN keeps failing for me on Full CLR)
I am observing the slowdown in this case (a+b):
From debugging I see that the same IL is generated. So if anyone can help, I will be grateful.
Maybe this case is somehow optimized, because I don't see a slowdown in other cases.
Tested expression:
private static Expression<Func<bool>> Get_and_with_or_Expr()
{
    var x = 1;
    var s = "Test";
    return () => x == 1 && (s.Contains("S") || s.Contains("s"));
}
And there is still a big difference for nested lambdas:
Tested expression:
private static Expression<Func<X>> Get_expr_with_2_nested_lambdas()
{
    var a = new A();
    var b = new B();
    return () => CreateX((aa, bb) => new X(aa, bb), new Lazy<A>(() => a), b);
}
I found a related question on Stack Overflow, without a definitive answer. It does show the same benchmark results, though: a compiled lambda is faster than a direct delegate.
I'm pretty sure this is a JIT "bug". You could call it an interesting optimization choice, too ;-).
Because if you try this code, which is a little closer to what Expression.Compile does, then CompileFast's output is faster:
class Bla
{
    public static readonly Bla Instance = new Bla();
    public readonly int SomeValue = 3;
}
var aExp = Expression.Parameter(typeof(int), "a");
var bExp = Expression.Parameter(typeof(int), "b");
return Expression.Lambda<Func<int, int, int>>(
    Expression.Add(
        aExp,
        Expression.Field(Expression.Constant(Bla.Instance), typeof(Bla).GetField("SomeValue"))),
    aExp, bExp);
Basically, Expression.Compile always adds a closure object, and that extra baggage makes the JIT... faster? It's crazy, but hey...
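One way to observe the closure object from the outside is to inspect the `Target` of the resulting delegate. The sketch below uses only the BCL; the `ClosureProbe` class and its method names are made up for illustration. It prints the target type of a delegate produced by `Expression.Compile` next to that of a delegate bound directly to a static method, which has no target at all. It says nothing about why the JIT treats the two differently.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper, not part of FastExpressionCompiler.
public static class ClosureProbe
{
    public static int Add(int a, int b) => a + b;

    // Builds and compiles (a, b) => a + b with the BCL compiler.
    public static Func<int, int, int> CompileSum()
    {
        var a = Expression.Parameter(typeof(int), "a");
        var b = Expression.Parameter(typeof(int), "b");
        return Expression.Lambda<Func<int, int, int>>(Expression.Add(a, b), a, b).Compile();
    }

    // Describes the object a delegate is bound to, or "null" for none.
    public static string DescribeTarget(Delegate d)
    {
        object target = d.Target;
        return target == null ? "null" : target.GetType().FullName;
    }

    public static void Main()
    {
        Func<int, int, int> compiled = CompileSum();
        Func<int, int, int> direct = Add; // delegate over a static method

        Console.WriteLine("Expression.Compile target: " + DescribeTarget(compiled));
        Console.WriteLine("static method target:      " + DescribeTarget(direct));
    }
}
```

The same `DescribeTarget` call can be applied to a delegate returned by `CompileFast` to compare what each compiler binds the delegate to.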
Interesting. But until I see the actual IL, or better, the jitted ASM, it is not clear.
I am currently building, stealing, or looking for an ILVisualizer for compiled delegates in #54
Is this still an issue I may look at?
Yes, you can look, but it is likely some glitch.
Linking #225
New results after fixing #225
Prove me wrong ;-)