graphql-platform icon indicating copy to clipboard operation
graphql-platform copied to clipboard

BatchDataLoader performance issues

Open erwan-joly opened this issue 3 years ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Describe the bug

Performance of batch data loader seems to go worst the more we request them. Not sure if that count as an issue as we are talking about milliseconds here but a simple Field call with the exact same parameter (in reality it's more likely to have different parameters I just use the same to simplify)

1 times: ~10ms 10 times: ~20ms 234 times: ~150ms

With the tracing it almost seems like the resolution is sequential

{
	"data": {
		"a1": "test",
		"a2": "test",
		"a3": "test",
		"a4": "test",
		"a5": "test"
	},
	"extensions": {
		"tracing": {
			"version": 1,
			"startTime": "2022-09-20T01:52:18.6630292Z",
			"endTime": "2022-09-20T01:52:18.6938835Z",
			"duration": 30854300,
			"parsing": {
				"startOffset": 0,
				"duration": 0
			},
			"validation": {
				"startOffset": 8200,
				"duration": 69300
			},
			"execution": {
				"resolvers": [
					{
						"path": [
							"a1"
						],
						"parentType": "Query",
						"fieldName": "myField",
						"returnType": "String",
						"startOffset": 159400,
						"duration": 3302000
					},
					{
						"path": [
							"a2"
						],
						"parentType": "Query",
						"fieldName": "myField",
						"returnType": "String",
						"startOffset": 669300,
						"duration": 2946500
					},
					{
						"path": [
							"a4"
						],
						"parentType": "Query",
						"fieldName": "myField",
						"returnType": "String",
						"startOffset": 1538800,
						"duration": 8998000
					},
					{
						"path": [
							"a3"
						],
						"parentType": "Query",
						"fieldName": "myField",
						"returnType": "String",
						"startOffset": 1120400,
						"duration": 13317900
					},
					{
						"path": [
							"a5"
						],
						"parentType": "Query",
						"fieldName": "myField",
						"returnType": "String",
						"startOffset": 1939100,
						"duration": 28580000
					}
				]
			}
		}
	}
}

I may be wrong but I believe this is an issue based on the fact BatchDataLoader are supposed to be a performance tool to avoid things like n+1 and will often be used for relatively big list (234 is not that big for a list so a 15x slowdown seems fairly big here)

Steps to reproduce

class Program
{
    public static void Main(string[] args)
    {
        CreateWebHostBuilder(args).Build().Run();
    }

    public static IWebHostBuilder CreateWebHostBuilder(string[] args) =>
        WebHost.CreateDefaultBuilder(args)
            .UseUrls("https://localhost:5001")
            .UseStartup<Startup>();
}

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddGraphQLServer()
            .AddQueryType<Query>()
            .InitializeOnStartup();

    }

    public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
    {
        app.UseRouting();
        app.UseEndpoints(endpoints =>
        {
            endpoints.MapGraphQL().WithOptions(
                new GraphQLServerOptions
                {
                    Tool = { Enable = false },
                    EnableSchemaRequests = env.IsDevelopment()
                });
        });
    }
}

public class Query
{
    public async Task<string> MyField(IResolverContext resolverContext, string myparam, CancellationToken token) =>
        await resolverContext.BatchDataLoader<string, string>(
            (keys, ct) =>
                Task.FromResult<IReadOnlyDictionary<string, string>>(
                keys.ToDictionary(k => k, k => k))
            ).LoadAsync(myparam, token);
}
Sample query
query MyField { a1:myField(myparam: "test") a2:myField(myparam: "test") a3:myField(myparam: "test") a4:myField(myparam: "test") a5:myField(myparam: "test") a6:myField(myparam: "test") a7:myField(myparam: "test") a8:myField(myparam: "test") a9:myField(myparam: "test") b1:myField(myparam: "test") b2:myField(myparam: "test") b3:myField(myparam: "test") b4:myField(myparam: "test") b5:myField(myparam: "test") b6:myField(myparam: "test") b7:myField(myparam: "test") b8:myField(myparam: "test") b9:myField(myparam: "test") c1:myField(myparam: "test") c2:myField(myparam: "test") c3:myField(myparam: "test") c4:myField(myparam: "test") c5:myField(myparam: "test") c6:myField(myparam: "test") c7:myField(myparam: "test") c8:myField(myparam: "test") c9:myField(myparam: "test") d1:myField(myparam: "test") d2:myField(myparam: "test") d3:myField(myparam: "test") d4:myField(myparam: "test") d5:myField(myparam: "test") d6:myField(myparam: "test") d7:myField(myparam: "test") d8:myField(myparam: "test") d9:myField(myparam: "test") e1:myField(myparam: "test") e2:myField(myparam: "test") e3:myField(myparam: "test") e4:myField(myparam: "test") e5:myField(myparam: "test") e6:myField(myparam: "test") e7:myField(myparam: "test") e8:myField(myparam: "test") e9:myField(myparam: "test") f1:myField(myparam: "test") f2:myField(myparam: "test") f3:myField(myparam: "test") f4:myField(myparam: "test") f5:myField(myparam: "test") f6:myField(myparam: "test") f7:myField(myparam: "test") f8:myField(myparam: "test") f9:myField(myparam: "test") g1:myField(myparam: "test") g2:myField(myparam: "test") g3:myField(myparam: "test") g4:myField(myparam: "test") g5:myField(myparam: "test") g6:myField(myparam: "test") g7:myField(myparam: "test") g8:myField(myparam: "test") g9:myField(myparam: "test") h1:myField(myparam: "test") h2:myField(myparam: "test") h3:myField(myparam: "test") h4:myField(myparam: "test") h5:myField(myparam: "test") h6:myField(myparam: "test") h7:myField(myparam: "test") h8:myField(myparam: "test") h9:myField(myparam: "test") i1:myField(myparam: "test") i2:myField(myparam: "test") i3:myField(myparam: "test") i4:myField(myparam: "test") i5:myField(myparam: "test") i6:myField(myparam: "test") i7:myField(myparam: "test") i8:myField(myparam: "test") i9:myField(myparam: "test") j1:myField(myparam: "test") j2:myField(myparam: "test") j3:myField(myparam: "test") j4:myField(myparam: "test") j5:myField(myparam: "test") j6:myField(myparam: "test") j7:myField(myparam: "test") j8:myField(myparam: "test") j9:myField(myparam: "test") k1:myField(myparam: "test") k2:myField(myparam: "test") k3:myField(myparam: "test") k4:myField(myparam: "test") k5:myField(myparam: "test") k6:myField(myparam: "test") k7:myField(myparam: "test") k8:myField(myparam: "test") k9:myField(myparam: "test") l1:myField(myparam: "test") l2:myField(myparam: "test") l3:myField(myparam: "test") l4:myField(myparam: "test") l5:myField(myparam: "test") l6:myField(myparam: "test") l7:myField(myparam: "test") l8:myField(myparam: "test") l9:myField(myparam: "test") m1:myField(myparam: "test") m2:myField(myparam: "test") m3:myField(myparam: "test") m4:myField(myparam: "test") m5:myField(myparam: "test") m6:myField(myparam: "test") m7:myField(myparam: "test") m8:myField(myparam: "test") m9:myField(myparam: "test") n1:myField(myparam: "test") n2:myField(myparam: "test") n3:myField(myparam: "test") n4:myField(myparam: "test") n5:myField(myparam: "test") n6:myField(myparam: "test") n7:myField(myparam: "test") n8:myField(myparam: "test") n9:myField(myparam: "test") o1:myField(myparam: "test") o2:myField(myparam: "test") o3:myField(myparam: "test") o4:myField(myparam: "test") o5:myField(myparam: "test") o6:myField(myparam: "test") o7:myField(myparam: "test") o8:myField(myparam: "test") o9:myField(myparam: "test") p1:myField(myparam: "test") p2:myField(myparam: "test") p3:myField(myparam: "test") p4:myField(myparam: "test") p5:myField(myparam: "test") p6:myField(myparam: "test") p7:myField(myparam: "test") p8:myField(myparam: "test") p9:myField(myparam: "test") q1:myField(myparam: "test") q2:myField(myparam: "test") q3:myField(myparam: "test") q4:myField(myparam: "test") q5:myField(myparam: "test") q6:myField(myparam: "test") q7:myField(myparam: "test") q8:myField(myparam: "test") q9:myField(myparam: "test") r1:myField(myparam: "test") r2:myField(myparam: "test") r3:myField(myparam: "test") r4:myField(myparam: "test") r5:myField(myparam: "test") r6:myField(myparam: "test") r7:myField(myparam: "test") r8:myField(myparam: "test") r9:myField(myparam: "test") s1:myField(myparam: "test") s2:myField(myparam: "test") s3:myField(myparam: "test") s4:myField(myparam: "test") s5:myField(myparam: "test") s6:myField(myparam: "test") s7:myField(myparam: "test") s8:myField(myparam: "test") s9:myField(myparam: "test") t1:myField(myparam: "test") t2:myField(myparam: "test") t3:myField(myparam: "test") t4:myField(myparam: "test") t5:myField(myparam: "test") t6:myField(myparam: "test") t7:myField(myparam: "test") t8:myField(myparam: "test") t9:myField(myparam: "test") u1:myField(myparam: "test") u2:myField(myparam: "test") u3:myField(myparam: "test") u4:myField(myparam: "test") u5:myField(myparam: "test") u6:myField(myparam: "test") u7:myField(myparam: "test") u8:myField(myparam: "test") u9:myField(myparam: "test") v1:myField(myparam: "test") v2:myField(myparam: "test") v3:myField(myparam: "test") v4:myField(myparam: "test") v5:myField(myparam: "test") v6:myField(myparam: "test") v7:myField(myparam: "test") v8:myField(myparam: "test") v9:myField(myparam: "test") w1:myField(myparam: "test") w2:myField(myparam: "test") w3:myField(myparam: "test") w4:myField(myparam: "test") w5:myField(myparam: "test") w6:myField(myparam: "test") w7:myField(myparam: "test") w8:myField(myparam: "test") w9:myField(myparam: "test") x1:myField(myparam: "test") x2:myField(myparam: "test") x3:myField(myparam: "test") x4:myField(myparam: "test") x5:myField(myparam: "test") x6:myField(myparam: "test") x7:myField(myparam: "test") x8:myField(myparam: "test") x9:myField(myparam: "test") y1:myField(myparam: "test") y2:myField(myparam: "test") y3:myField(myparam: "test") y4:myField(myparam: "test") y5:myField(myparam: "test") y6:myField(myparam: "test") y7:myField(myparam: "test") y8:myField(myparam: "test") y9:myField(myparam: "test") z1:myField(myparam: "test") z2:myField(myparam: "test") z3:myField(myparam: "test") z4:myField(myparam: "test") z5:myField(myparam: "test") z6:myField(myparam: "test") z7:myField(myparam: "test") z8:myField(myparam: "test") z9:myField(myparam: "test") }

Relevant log output

No response

Additional Context?

No response

Product

Hot Chocolate

Version

12.13.2

erwan-joly avatar Sep 20 '22 02:09 erwan-joly

Did you test in release mode or debug mode?

michaelstaib avatar Sep 20 '22 21:09 michaelstaib

Did you test in release mode or debug mode?

Debug mode! That may be the issue 🙂 will retry in release mode when I have some time. Anything different in debug mode that could explain this ?

EDIT: I just tried in release mode and I'm getting the same issue

erwan-joly avatar Sep 20 '22 22:09 erwan-joly

Just regarding the timings....

1 times: ~10ms 10 times: ~20ms 234 times: ~150ms

when we have 234 items it takes 150ms per item to resolve? or for all the 234 items?

michaelstaib avatar Sep 22 '22 07:09 michaelstaib

Just regarding the timings....

1 times: ~10ms 10 times: ~20ms 234 times: ~150ms

when we have 234 items it takes 150ms per item to resolve? or for all the 234 items?

For all items to load

erwan-joly avatar Sep 22 '22 11:09 erwan-joly

I have a similar issue when extending GraphQL types and created a PoC project to test extending 1000 items.

It is found that the extension method is called sequentially and has ~0.5ms delay between 2 consecutive calls, ran in release configuration. Each call ends in ~30 - 500ms, mostly waiting for the other nodes to finish. The whole execution finished in 983ms. I tried to remove async/await in the methods but result in similar conclusion. Detailed trace screenshot 1 and screenshot 2.

In erwan-joly's sample, I guess method MyField is called N times. 234 items require 117ms, assumed each call took ~0.5ms.

Minhael avatar Dec 23 '22 07:12 Minhael

We cannot reproduce this one and we are heavily using DL in the BCP cloud with high pressure and have 3 ms for the whole request on reads. I am closing this issue.

michaelstaib avatar Jun 28 '24 16:06 michaelstaib