terraform-provider-flux

Provider crashes often

Open johnae opened this issue 3 years ago • 6 comments

I can't say with certainty how often the provider crashes, but somewhere around 25-30% of the time feels about right. We have several Kubernetes clusters deployed via our in-house Terraform Kubernetes module, and we use Flux to seed those clusters with base components. The provider seems to crash more often the more clusters we have within the same Terraform state, which kind of makes sense, I suppose.

We recently upgraded to Terraform 0.15, but we had these issues on the 0.14.x versions as well. It seems to be a concurrency issue somewhere within the provider, or in its dependencies, I suppose.

This is the stack trace:


╷
│ Error: Plugin did not respond
│
│ The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ReadDataSource call. The plugin logs may contain
│ more details.
╵
╷
│ Error: Plugin did not respond
│
│ The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ReadDataSource call. The plugin logs may contain
│ more details.
╵
╷
│ Error: Plugin did not respond
│
│ The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ReadDataSource call. The plugin logs may contain
│ more details.
╵

Stack trace from the terraform-provider-flux_v0.1.3 plugin:

fatal error: concurrent map read and map write

goroutine 61 [running]:
runtime.throw(0x1a7854d, 0x21)
	runtime/panic.go:1117 +0x72 fp=0xc00051b3d0 sp=0xc00051b3a0 pc=0x436472
runtime.mapaccess2(0x18844c0, 0xc0062fbdd0, 0xc007187830, 0xc007187830, 0x22)
	runtime/map.go:469 +0x255 fp=0xc00051b410 sp=0xc00051b3d0 pc=0x40f6d5
reflect.mapaccess(0x18844c0, 0xc0062fbdd0, 0xc007187830, 0x1a680b5)
	runtime/map.go:1318 +0x3f fp=0xc00051b448 sp=0xc00051b410 pc=0x46651f
reflect.Value.MapIndex(0x18844c0, 0xc0062fbdd0, 0x15, 0x180a0a0, 0xc007187830, 0x98, 0x0, 0xc000680000, 0x0)
	reflect/value.go:1189 +0x16e fp=0xc00051b4c0 sp=0xc00051b448 pc=0x4c5f8e
github.com/go-openapi/jsonpointer.getSingleImpl(0x18844c0, 0xc0062fbdd0, 0xc00081b23e, 0x22, 0xc00011f460, 0x18844c0, 0xc0062fbdd0, 0x19, 0x0, 0x0)
	github.com/go-openapi/[email protected]/pointer.go:136 +0x7ba fp=0xc00051b590 sp=0xc00051b4c0 pc=0xb9bf9a
github.com/go-openapi/jsonpointer.(*Pointer).get(0xc004f3e4a0, 0x18844c0, 0xc0062fbdd0, 0xc00011f460, 0xc000643680, 0xc004abd1a0, 0x30, 0xc00051b690, 0xbed7a5)
	github.com/go-openapi/[email protected]/pointer.go:230 +0xda fp=0xc00051b618 sp=0xc00051b590 pc=0xb9cdfa
github.com/go-openapi/jsonpointer.(*Pointer).Get(...)
	github.com/go-openapi/[email protected]/pointer.go:95
sigs.k8s.io/kustomize/kyaml/openapi.resolve(0x1a21d00, 0x288ea60, 0xc004f3e498, 0xc00082fb00, 0x1, 0x1)
	sigs.k8s.io/kustomize/[email protected]/openapi/openapi.go:551 +0x5e fp=0xc00051b6a0 sp=0xc00051b618 pc=0xbee0de
sigs.k8s.io/kustomize/kyaml/openapi.Resolve(...)
	sigs.k8s.io/kustomize/[email protected]/openapi/openapi.go:235
sigs.k8s.io/kustomize/kyaml/openapi.(*ResourceSchema).Field(0xc0006c0198, 0xc00713e090, 0x8, 0x1)
	sigs.k8s.io/kustomize/[email protected]/openapi/openapi.go:348 +0x199 fp=0xc00051b918 sp=0xc00051b6a0 pc=0xbecd99
sigs.k8s.io/kustomize/kyaml/yaml/walk.Walker.walkMap(0x1cad950, 0xc004dbd500, 0xc0006c0198, 0xc0071876d0, 0x2, 0x2, 0xc0071876e0, 0x1, 0x1, 0x0, ...)
	sigs.k8s.io/kustomize/[email protected]/yaml/walk/map.go:59 +0xf05 fp=0xc00051bba8 sp=0xc00051b918 pc=0xc16725
sigs.k8s.io/kustomize/kyaml/yaml/walk.Walker.Walk(0x1cad950, 0xc004dbd500, 0xc0006c0198, 0xc0071876d0, 0x2, 0x2, 0xc0071876e0, 0x1, 0x1, 0x0, ...)
	sigs.k8s.io/kustomize/[email protected]/yaml/walk/walk.go:65 +0x625 fp=0xc00051be48 sp=0xc00051bba8 pc=0xc178a5
sigs.k8s.io/kustomize/kyaml/yaml/walk.Walker.walkMap(0x1cad920, 0x28bd7a8, 0xc0082d3f68, 0xc007187420, 0x2, 0x2, 0x0, 0x0, 0x0, 0x0, ...)
	sigs.k8s.io/kustomize/[email protected]/yaml/walk/map.go:72 +0x7b9 fp=0xc00051c0d8 sp=0xc00051be48 pc=0xc15fd9
sigs.k8s.io/kustomize/kyaml/yaml/walk.Walker.Walk(0x1cad920, 0x28bd7a8, 0x0, 0xc007187420, 0x2, 0x2, 0x0, 0x0, 0x0, 0x0, ...)
	sigs.k8s.io/kustomize/[email protected]/yaml/walk/walk.go:65 +0x625 fp=0xc00051c378 sp=0xc00051c0d8 pc=0xc178a5
sigs.k8s.io/kustomize/kyaml/yaml/merge2.Merge(0xc004dfee00, 0xc001fdc340, 0x1, 0xc0082d3f60, 0x2, 0xc000680000)
	sigs.k8s.io/kustomize/[email protected]/yaml/merge2/merge2.go:20 +0x105 fp=0xc00051c450 sp=0xc00051c378 pc=0xc18e25
sigs.k8s.io/kustomize/api/filters/patchstrategicmerge.Filter.Filter(0xc004dfee00, 0xc0082d3f60, 0x1, 0x1, 0x0, 0xc00705ff28, 0xc00051c540, 0xc25489, 0xc0083a5d48)
	sigs.k8s.io/kustomize/[email protected]/filters/patchstrategicmerge/patchstrategicmerge.go:23 +0x8a fp=0xc00051c4c8 sp=0xc00051c450 pc=0xc1b48a
sigs.k8s.io/kustomize/api/resource.(*Resource).ApplyFilter(0xc001fe6000, 0x1c7c840, 0xc004dfee00, 0x0, 0x0)
	sigs.k8s.io/kustomize/[email protected]/resource/resource.go:488 +0xb2 fp=0xc00051c550 sp=0xc00051c4c8 pc=0xc26232
sigs.k8s.io/kustomize/api/resource.(*Resource).ApplySmPatch(0xc001fe6000, 0xc004deb590, 0x4, 0xc00705ff2d)
	sigs.k8s.io/kustomize/[email protected]/resource/resource.go:473 +0xcb fp=0xc00051c5c0 sp=0xc00051c550 pc=0xc2604b
sigs.k8s.io/kustomize/api/resmap.(*resWrangler).ApplySmPatch(0xc0007ee798, 0xc0082d2990, 0xc003ec22d0, 0xc0082d2990, 0x0)
	sigs.k8s.io/kustomize/[email protected]/resmap/reswrangler.go:593 +0x2ec fp=0xc00051c7a8 sp=0xc00051c5c0 pc=0xc2df8c
sigs.k8s.io/kustomize/api/builtins.(*PatchTransformerPlugin).transformStrategicMerge(0xc003ec2230, 0x1cd2a78, 0xc0007ee798, 0xc003ec22d0, 0xc0088acfc0, 0xe0)
	sigs.k8s.io/kustomize/[email protected]/builtins/PatchTransformer.go:93 +0x136 fp=0xc00051c918 sp=0xc00051c7a8 pc=0xc47316
sigs.k8s.io/kustomize/api/builtins.(*PatchTransformerPlugin).Transform(0xc003ec2230, 0x1cd2a78, 0xc0007ee798, 0x7fd558b225b8, 0xe0)
	sigs.k8s.io/kustomize/[email protected]/builtins/PatchTransformer.go:74 +0x50 fp=0xc00051c968 sp=0xc00051c918 pc=0xc47150
sigs.k8s.io/kustomize/api/internal/target.(*multiTransformer).transform(0xc00051ca78, 0x1cd2a78, 0xc0007ee798, 0xc003dd5501, 0xe)
	sigs.k8s.io/kustomize/[email protected]/internal/target/multitransformer.go:40 +0x79 fp=0xc00051c9b8 sp=0xc00051c968 pc=0xce1b59
sigs.k8s.io/kustomize/api/internal/target.(*multiTransformer).Transform(0xc00051ca78, 0x1cd2a78, 0xc0007ee798, 0xc003dd5520, 0xe)
	sigs.k8s.io/kustomize/[email protected]/internal/target/multitransformer.go:35 +0x85 fp=0xc00051c9f0 sp=0xc00051c9b8 pc=0xce1aa5
sigs.k8s.io/kustomize/api/internal/accumulator.(*ResAccumulator).Transform(...)
	sigs.k8s.io/kustomize/[email protected]/internal/accumulator/resaccumulator.go:141
sigs.k8s.io/kustomize/api/internal/target.(*KustTarget).runTransformers(0xc0005eaa00, 0xc000b0a1e0, 0x0, 0x0)
	sigs.k8s.io/kustomize/[email protected]/internal/target/kusttarget.go:270 +0x245 fp=0xc00051caa8 sp=0xc00051c9f0 pc=0xcdf145
sigs.k8s.io/kustomize/api/internal/target.(*KustTarget).accumulateTarget(0xc0005eaa00, 0xc000b0a1e0, 0xc00051cba8, 0xbed5c7, 0x0)
	sigs.k8s.io/kustomize/[email protected]/internal/target/kusttarget.go:195 +0x2b0 fp=0xc00051cb40 sp=0xc00051caa8 pc=0xcde170
sigs.k8s.io/kustomize/api/internal/target.(*KustTarget).AccumulateTarget(0xc0005eaa00, 0x0, 0xffffffffffffffff, 0x0)
	sigs.k8s.io/kustomize/[email protected]/internal/target/kusttarget.go:156 +0xce fp=0xc00051cb80 sp=0xc00051cb40 pc=0xcdde4e
sigs.k8s.io/kustomize/api/internal/target.(*KustTarget).makeCustomizedResMap(0xc0005eaa00, 0xc0006f7901, 0x0, 0x0, 0x1f)
	sigs.k8s.io/kustomize/[email protected]/internal/target/kusttarget.go:111 +0x2f fp=0xc00051cbb8 sp=0xc00051cb80 pc=0xcddaef
sigs.k8s.io/kustomize/api/internal/target.(*KustTarget).MakeCustomizedResMap(...)
	sigs.k8s.io/kustomize/[email protected]/internal/target/kusttarget.go:107
sigs.k8s.io/kustomize/api/krusty.(*Kustomizer).Run(0xc00051d1e0, 0xc0007beff0, 0x30, 0x0, 0x0, 0x0, 0x0)
	sigs.k8s.io/kustomize/[email protected]/krusty/kustomizer.go:82 +0x331 fp=0xc00051d170 sp=0xc00051cbb8 pc=0x16c43b1
github.com/fluxcd/flux2/pkg/manifestgen/install.build(0xc0007beff0, 0x30, 0xc0007be630, 0x2e, 0xc000128ff0, 0x7)
	github.com/fluxcd/[email protected]/pkg/manifestgen/install/manifests.go:143 +0x211 fp=0xc00051d230 sp=0xc00051d170 pc=0x16c8a91
github.com/fluxcd/flux2/pkg/manifestgen/install.Generate(0x1a839a1, 0x28, 0xc000128ff0, 0x7, 0x1a59b8e, 0xb, 0xc0007b6c80, 0x4, 0x4, 0x28bd7a8, ...)
	github.com/fluxcd/[email protected]/pkg/manifestgen/install/install.go:76 +0x22f fp=0xc00051d4c8 sp=0xc00051d230 pc=0x16c666f
github.com/fluxcd/terraform-provider-flux/pkg/provider.dataInstallRead(0x1cb0eb8, 0xc00093aba0, 0xc000148a80, 0x0, 0x0, 0xc000f34050, 0xc000583948, 0x40e158)
	github.com/fluxcd/terraform-provider-flux/pkg/provider/data_install.go:170 +0x765 fp=0xc00051d8e8 sp=0xc00051d4c8 pc=0x1730385
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0xc0003383c0, 0x1cb0e48, 0xc0006c7c40, 0xc000148a80, 0x0, 0x0, 0x0, 0x0, 0x0)
	github.com/hashicorp/terraform-plugin-sdk/[email protected]/helper/schema/resource.go:297 +0x1ed fp=0xc00051d958 sp=0xc00051d8e8 pc=0xaffdad
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).ReadDataApply(0xc0003383c0, 0x1cb0e48, 0xc0006c7c40, 0xc000a97460, 0x0, 0x0, 0x0, 0xc000a97460, 0x0, 0x0)
	github.com/hashicorp/terraform-plugin-sdk/[email protected]/helper/schema/resource.go:498 +0xfd fp=0xc00051d9e0 sp=0xc00051d958 pc=0xb0185d
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadDataSource(0xc0007a0828, 0x1cb0e48, 0xc0006c7c40, 0xc000a97240, 0xc0006c7c40, 0x40b9c5, 0x1942560)
	github.com/hashicorp/terraform-plugin-sdk/[email protected]/helper/schema/grpc_provider.go:1105 +0x4d6 fp=0xc00051dae0 sp=0xc00051d9e0 pc=0xaf9696
github.com/hashicorp/terraform-plugin-go/tfprotov5/server.(*server).ReadDataSource(0xc0004a70c0, 0x1cb0ef0, 0xc0006c7c40, 0xc0000e7a40, 0xc0004a70c0, 0xc00039d200, 0xc000561ba0)
	github.com/hashicorp/[email protected]/tfprotov5/server/server.go:247 +0xe5 fp=0xc00051db40 sp=0xc00051dae0 pc=0x9d0a25
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadDataSource_Handler(0x1997c40, 0xc0004a70c0, 0x1cb0ef0, 0xc00039d200, 0xc00093aa80, 0x0, 0x1cb0ef0, 0xc00039d200, 0xc000180a00, 0xf8)
	github.com/hashicorp/[email protected]/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:416 +0x214 fp=0xc00051dbb0 sp=0xc00051db40 pc=0x9c8d14
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000290c40, 0x1cc5338, 0xc000501980, 0xc000740700, 0xc000289aa0, 0x2875310, 0x0, 0x0, 0x0)
	google.golang.org/[email protected]/server.go:1194 +0x52b fp=0xc00051de50 sp=0xc00051dbb0 pc=0x910d8b
google.golang.org/grpc.(*Server).handleStream(0xc000290c40, 0x1cc5338, 0xc000501980, 0xc000740700, 0x0)
	google.golang.org/[email protected]/server.go:1517 +0xd0c fp=0xc00051df68 sp=0xc00051de50 pc=0x914f4c
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc000183150, 0xc000290c40, 0x1cc5338, 0xc000501980, 0xc000740700)
	google.golang.org/[email protected]/server.go:859 +0xab fp=0xc00051dfb8 sp=0xc00051df68 pc=0x922f0b
runtime.goexit()
	runtime/asm_amd64.s:1371 +0x1 fp=0xc00051dfc0 sp=0xc00051dfb8 pc=0x46c761
created by google.golang.org/grpc.(*Server).serveStreams.func1
	google.golang.org/[email protected]/server.go:857 +0x1fd

goroutine 1 [select]:
github.com/hashicorp/go-plugin.Serve(0xc000587e90)
	github.com/hashicorp/[email protected]/server.go:468 +0x954
github.com/hashicorp/terraform-plugin-sdk/v2/plugin.Serve(0xc0002895f0)
	github.com/hashicorp/terraform-plugin-sdk/[email protected]/plugin/serve.go:82 +0x22d
main.main()
	github.com/fluxcd/terraform-provider-flux/main.go:26 +0x45

goroutine 14 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x288e140)
	k8s.io/klog/[email protected]/klog.go:1169 +0x8b

Error: The terraform-provider-flux_v0.1.3 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

johnae • Apr 15 '21 13:04

This is due to an upstream bug: https://github.com/kubernetes-sigs/kustomize/issues/3659

stefanprodan • Apr 15 '21 13:04

@johnae I have not been seeing the same types of crashes, but that may be because of what you describe: having more clusters in the same state may cause issues. Personally, when I use the provider I only create a single cluster per state. Are you seeing any of these issues when you only create a single cluster?

Let's leave this issue open until it is fixed upstream.

phillebaba • Apr 26 '21 21:04

@phillebaba I think you're right. We've not seen any crashes, recently anyway, when creating only a single cluster.

johnae • Apr 27 '21 08:04

Hmm, OK. I will try to reproduce this error this week. If it turns out to be an upstream bug with no temporary fix, I will add a note that this is a current limitation.

phillebaba • Apr 27 '21 11:04