Empty error message from OCH GraphQL client

Open pkosiec opened this issue 4 years ago • 1 comments

Description

Sometimes on our long-running cluster, we get an empty error message from the GraphQL client we use:


#1: graphql:

We need to investigate whether there is a bug in our code (OCH client) or in our dependency (retry-go, machinebox/graphql client, etc.).

This may not be limited to the Local OCH client. Investigate all GraphQL clients (Public OCH, Local OCH, Engine) for potential causes.

Expected behavior

The error message states the root cause of the GraphQL request failure.

Actual behavior

Example run: https://github.com/capactio/capact/runs/2762064610?check_suite_focus=true


#1: graphql:

Jun 7 08:37:29.539: while deleting TypeInstance with ID a0ecd4c3-c55a-497a-8c68-05b5e6242cb8: while executing mutation to delete TypeInstance: All attempts fail:
#1: graphql:

• Failure [305.075 seconds]
Action
/capact.io/capact/test/e2e/action_test.go:37
Action execution
/capact.io/capact/test/e2e/action_test.go:57
should pick proper Implementation and inject TypeInstance based on cluster policy [It]
/capact.io/capact/test/e2e/action_test.go:58
Timed out after 300.000s.
Expected : FAILED
to be an element of <[]interface *** | len:1, cap:1>: [["SUCCEEDED"]]
/capact.io/capact/test/e2e/action_test.go:343

Jun 08 '21 07:06 pkosiec

We need to bump the priority of this task, because the empty message makes debugging difficult. We hit the same problem in our Engine:

2021-06-29T08:40:59.930Z	DEBUG	controller-runtime.manager.events	[email protected]/zapr.go:69	Warning	{"object": {"kind":"Action","namespace":"capact-system","name":"capact-upgrade-fdhwl","uid":"089ef74c-cb7f-46b6-b1a3-b028c87041f7","apiVersion":"core.capact.io/v1alpha1","resourceVersion":"19263936"}, "reason": "Delete runner action", "message": "while unlocking TypeInstances: while executing mutation to unlock TypeInstances: All attempts fail:\n#1: graphql: "}

The strange thing is that I was able to unlock the given TypeInstance using the GraphQL console exposed on the Gateway.

EDIT: The cause was that the response had no body, but the status code was 422.

Jun 29 '21 08:06 mszostok