
Execution slow-down, or incorrect use?

Open KristianWedberg opened this issue 7 years ago • 11 comments

Hi,

In the compiled delegates below (each adds two ints), the delegates produced by FastExpressionCompiler consistently execute 16% and 65% slower than the one produced by the standard BCL Expression.Compile. Any idea why, or am I using it incorrectly?

Thanks, Kristian

ETSumBCL: 164 ms
ETSumCompileFastInfo: 191 ms (+16%)
ETSumCompileFast: 271 ms (+65%)

// using Fast = FastExpressionCompiler;
// using FastPEI = FastExpressionCompiler.ParameterExpressionInfo;

            Time("Expression", ETSum());
            Time("Expression Compile Fast Info", ETSumCompileFastInfo());
            Time("Expression Compile Fast", ETSumCompileFast());

            Func<int, int, int> ETSum()
            {
                var aExp = Expression.Parameter(typeof(int), "a");
                var bExp = Expression.Parameter(typeof(int), "b");

                return Expression.Lambda<Func<int, int, int>>(
                    Expression.Add(aExp, bExp), aExp, bExp).Compile();
            }

            Func<int, int, int> ETSumCompileFastInfo()
            {
                var aExp = FastPEI.Parameter(typeof(int), "a");
                var bExp = FastPEI.Parameter(typeof(int), "b");

                return Fast.ExpressionCompiler.CompileFast<Func<int, int, int>>(
                    FastPEI.Lambda(Expression.Add(aExp, bExp), aExp, bExp)
                    );
            }

            Func<int, int, int> ETSumCompileFast()
            {
                var aExp = Expression.Parameter(typeof(int), "a");
                var bExp = Expression.Parameter(typeof(int), "b");

                return Fast.ExpressionCompiler.CompileFast<Func<int, int, int>>(
                    Expression.Lambda(Expression.Add(aExp, bExp), aExp, bExp)
                    );
            }

        private static void Time(String name, Func<Int32, Int32, Int32> fn)
        {
            int result = 0;
            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (Int32 index = 0; index <= 100000000; index++)
            {
                result += fn(index, 1);
            }
            sw.Stop();
            Console.WriteLine($"{name}: {sw.ElapsedMilliseconds} ms{result.ToString().Substring(0, 0)}");
        }

KristianWedberg avatar Feb 09 '18 15:02 KristianWedberg

Can you run the same benchmark with BenchmarkDotNet?

It provides a much more robust evaluation, with statistics across multiple platforms and JITs.

I am using it here: https://github.com/dadhi/FastExpressionCompiler/tree/master/test/FastExpressionCompiler.Benchmarks

dadhi avatar Feb 09 '18 20:02 dadhi

I have checked with BDN and did not find any issue. I've also added a case with ExpressionInfo, which is the fastest and serves as the baseline (see the Scaled column). ExpressionInfo.Add was not available, so I've added it too; it will be available in the next version. The results are below:

image

The code for benchmark:

using System;
using System.Linq.Expressions;
using BenchmarkDotNet.Attributes;

namespace FastExpressionCompiler.Benchmarks
{
    [MemoryDiagnoser]
    public class SimpleExpr_ParamPlusParam
    {
        private static Expression<Func<int, int, int>> CreateSumExpr()
        {
            var aExp = Expression.Parameter(typeof(int), "a");
            var bExp = Expression.Parameter(typeof(int), "b");
            return Expression.Lambda<Func<int, int, int>>(Expression.Add(aExp, bExp), aExp, bExp);
        }

        private static ExpressionInfo<Func<int, int, int>> CreateSumExprInfo()
        {
            var aExp = ExpressionInfo.Parameter(typeof(int), "a");
            var bExp = ExpressionInfo.Parameter(typeof(int), "b");
            return ExpressionInfo.Lambda<Func<int, int, int>>(
                ExpressionInfo.Add(aExp, bExp), aExp, bExp);
        }

        private static Expression<Func<int, int, int>> SumExpr = CreateSumExpr();
        private static ExpressionInfo<Func<int, int, int>> SumExprInfo = CreateSumExprInfo();

        [Benchmark]
        public object Expression_Compile() => SumExpr.Compile();

        [Benchmark]
        public object Expression_CompileFast() => SumExpr.CompileFast();

        [Benchmark(Baseline = true)]
        public object ExpressionInfo_CompileFast() => SumExprInfo.CompileFast();
    }
}

dadhi avatar Feb 15 '18 14:02 dadhi

Oops, sorry. For some reason I misunderstood this as measuring compile time and not execution time. Will re-check tomorrow.
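For reference, a benchmark that measures execution time rather than compile time could look roughly like the sketch below. The class and method names are hypothetical; the only assumption about the library is the `CompileFast()` extension already used in the benchmark above. Compilation happens once in the setup method, so only the delegate invocation is timed.

```csharp
using System;
using System.Linq.Expressions;
using BenchmarkDotNet.Attributes;
using FastExpressionCompiler;

// Hypothetical benchmark class: times the invocation of already-compiled delegates.
public class SumExecutionBenchmark
{
    private Func<int, int, int> _compiled;
    private Func<int, int, int> _compiledFast;

    // Runs once before the measurements, so compile time is excluded.
    [GlobalSetup]
    public void Setup()
    {
        var a = Expression.Parameter(typeof(int), "a");
        var b = Expression.Parameter(typeof(int), "b");
        var sum = Expression.Lambda<Func<int, int, int>>(Expression.Add(a, b), a, b);

        _compiled = sum.Compile();
        _compiledFast = sum.CompileFast();
    }

    [Benchmark(Baseline = true)]
    public int Invoke_Compiled() => _compiled(1, 2);

    [Benchmark]
    public int Invoke_CompiledFast() => _compiledFast(1, 2);
}
```

It would be run with `BenchmarkDotNet.Running.BenchmarkRunner.Run<SumExecutionBenchmark>()` from a Release build.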

dadhi avatar Feb 15 '18 20:02 dadhi

(NOTE: the benchmarks were run on .NET Core 2, because BDN keeps failing for me on the full CLR)

I am observing the slowdown on this case (a+b):

image

From debugging, I see that the same IL is generated. So if anyone can help, I will be grateful.

Maybe this case is somehow optimized, because I don't see a slowdown in other cases:

image

Tested expression:

    private static Expression<Func<bool>> Get_and_with_or_Expr()
    {
        var x = 1;
        var s = "Test";
        return () => x == 1 && (s.Contains("S") || s.Contains("s"));
    }

And there is still a big difference for nested lambdas:

image

Tested expression:

    private static Expression<Func<X>> Get_expr_with_2_nested_lambdas()
    {
        var a = new A();
        var b = new B();
        return () => CreateX((aa, bb) => new X(aa, bb), new Lazy<A>(() => a), b);
    }

dadhi avatar Feb 16 '18 11:02 dadhi

Found a related question on StackOverflow without a definite answer. It shows the same kind of benchmark results, with the compiled lambda running faster than a direct delegate.

dadhi avatar Feb 16 '18 13:02 dadhi

I'm pretty sure this is a JIT "bug". You could call it an interesting optimization choice, too ;-).

Because if you try this code, which is a little closer to what Expression.Compile does, then CompileFast's output is faster:

    class Bla
    {
        public static readonly Bla Instance = new Bla();
        public readonly int SomeValue = 3;
    }

    var aExp = Expression.Parameter(typeof(int), "a");
    var bExp = Expression.Parameter(typeof(int), "b");
    return Expression.Lambda<Func<int, int, int>>(
        Expression.Add(
            aExp,
            Expression.Field(Expression.Constant(Bla.Instance), typeof(Bla).GetField("SomeValue"))),
        aExp, bExp);

Basically, Expression.Compile always adds a closure object, and that extra baggage makes the JIT... faster? It's crazy, but hey...
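One way to probe the closure difference is to compare the `Target` of a delegate from `Expression.Compile` with that of a delegate created from a hand-emitted static `DynamicMethod`. This is only a sketch: the `ClosureProbe` class and its method names are made up for illustration, and the emitted method is an assumption about what a closure-free compilation of `a + b` looks like, not necessarily what CompileFast produces.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection.Emit;

// Hypothetical helper for comparing the two kinds of delegates.
public static class ClosureProbe
{
    // Delegate produced by the standard BCL expression compiler.
    public static Func<int, int, int> BuildCompiled()
    {
        var a = Expression.Parameter(typeof(int), "a");
        var b = Expression.Parameter(typeof(int), "b");
        return Expression.Lambda<Func<int, int, int>>(Expression.Add(a, b), a, b).Compile();
    }

    // Hand-emitted static method with no closure argument: ldarg.0, ldarg.1, add, ret.
    public static Func<int, int, int> BuildEmitted()
    {
        var method = new DynamicMethod("Sum", typeof(int), new[] { typeof(int), typeof(int) });
        var il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);
        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    // Describes the object the delegate is bound to, if any.
    public static string DescribeTarget(Delegate d) =>
        d.Target?.GetType().FullName ?? "(no target)";

    public static void Main()
    {
        Console.WriteLine(DescribeTarget(BuildCompiled()));
        Console.WriteLine(DescribeTarget(BuildEmitted()));
    }
}
```

If the first line prints a closure type and the second prints no target, the two delegates differ in how they are bound even though the arithmetic IL is the same, which is the kind of difference the JIT could treat differently.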

EamonNerbonne avatar Feb 27 '18 10:02 EamonNerbonne

Interesting. But until I see the actual IL, or better the jitted ASM, it is not clear.

I am currently building, stealing, or looking for an ILVisualizer for compiled delegates in #54.

dadhi avatar Feb 27 '18 14:02 dadhi

Is it still an issue I may look at?

dzmitry-lahoda avatar Nov 07 '18 13:11 dzmitry-lahoda

Yes, you can look, but it is likely some glitch.

dadhi avatar Nov 07 '18 13:11 dadhi

Linking #225

dadhi avatar Sep 09 '19 20:09 dadhi

New results after fixing #225:

image

Prove me wrong ;-)

dadhi avatar Sep 15 '19 18:09 dadhi