microservices-demo
adservice, cartservice and loadgenerator restarting in a loop on an arm cluster
Describe the bug
adservice, cartservice and loadgenerator restart in a loop on a kind cluster.
To Reproduce
clone the repo
kind create cluster
kubectl apply -f release/kubernetes-manifests.yaml
kubectl get pods
get pod
NAME READY STATUS RESTARTS AGE
adservice-5464cc8db4-w9lsm 0/1 CrashLoopBackOff 5 (37s ago) 6m41s
cartservice-6458db7c7c-wz4rd 0/1 CrashLoopBackOff 5 (63s ago) 6m41s
checkoutservice-55b497bfb8-wb9hk 1/1 Running 0 6m42s
currencyservice-6f868d85d8-7t4vj 1/1 Running 1 (115s ago) 6m41s
emailservice-5cf5fc5898-h2twr 1/1 Running 0 6m42s
frontend-bfdf66596-gdq6g 1/1 Running 0 6m42s
loadgenerator-6568b868f-vvwxm 0/1 CrashLoopBackOff 5 (105s ago) 6m41s
paymentservice-5ff68d9c7d-8w2fw 1/1 Running 0 6m42s
productcatalogservice-5b9c9f6488-dgtst 1/1 Running 0 6m42s
recommendationservice-c58857d6-9sq85 1/1 Running 0 6m42s
redis-cart-79b899577-tg5rv 1/1 Running 0 6m41s
shippingservice-6f65f85b8b-6c72r 1/1 Running 0 6m41s
Logs
k logs adservice-5464cc8db4-w9lsm
Could not create logging file: Read-only file system
COULD NOT CREATE A LOGGINGFILE 20230720-052533.1!Could not create logging file: Read-only file system
COULD NOT CREATE A LOGGINGFILE 20230720-052535.1!Could not create logging file: Read-only file system
COULD NOT CREATE A LOGGINGFILE 20230720-052535.1!E0720 05:25:35.280431 36 throttler_api.cc:92] GRPC: src/core/lib/security/credentials/alts/check_gcp_environment.cc:60 BIOS data file cannot be opened.
E0720 05:25:35.572489 36 throttler_api.cc:92] GRPC: src/core/lib/security/credentials/google_default/google_default_credentials.cc:351 Could not create google default credentials: {"created":"@1689830735.278256041","description":"Failed to create Google credentials","file":"src/core/lib/security/credentials/google_default/google_default_credentials.cc","file_line":284,"referenced_errors":[{"created":"@1689830735.279027375","description":"creds_path unset","file":"src/core/lib/security/credentials/google_default/google_default_credentials.cc","file_line":229},{"created":"@1689830735.280197583","description":"Failed to load file","file":"src/core/lib/iomgr/load_file.cc","file_line":71,"filename":"//.config/gcloud/application_default_credentials.json","referenced_errors":[{"created":"@1689830735.280058666","description":"No such file or directory","errno":2,"file":"src/core/lib/iomgr/load_file.cc","file_line":45,"os_error":"No such file or directory","syscall":"fopen"}]}]}
E0720 05:25:35.573143 36 throttler_api.cc:116] Failed to get Google default credentials
E0720 05:25:35.575800 49 native.cc:42] Could not open maps file: /proc/self/maps
E0720 05:25:35.576072 49 throttler_api.cc:297] Profiler API is not initialized, stop profiling
k logs -f cartservice-6458db7c7c-wz4rd
info: Microsoft.Hosting.Lifetime[14]
Now listening on: http://[::]:7070
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: /app
fail: Microsoft.AspNetCore.Server.Kestrel[13]
Connection id "0HMS8S3AON72S", Request id "0HMS8S3AON72S:00000001": An unhandled exception was thrown by the application.
Microsoft.AspNetCore.Routing.RouteCreationException: An error occurred while trying to create an instance of 'Grpc.AspNetCore.Server.Model.Internal.GrpcUnimplementedConstraint'.
---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
---> System.NullReferenceException: Object reference not set to an instance of an object.
at InvokeStub_GrpcUnimplementedConstraint..ctor(Object, Object, IntPtr*)
at System.Reflection.ConstructorInvoker.Invoke(Object, IntPtr*, BindingFlags)
--- End of inner exception stack trace ---
at System.Reflection.ConstructorInvoker.Invoke(Object, IntPtr*, BindingFlags)
at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags, Binder, Object[], CultureInfo)
at System.Reflection.ConstructorInfo.Invoke(Object[] parameters)
at Microsoft.AspNetCore.Routing.ParameterPolicyActivator.CreateParameterPolicy(IServiceProvider, Type, String)
at Microsoft.AspNetCore.Routing.ParameterPolicyActivator.ResolveParameterPolicy[T](IDictionary`2, IServiceProvider, String, String& )
--- End of inner exception stack trace ---
at Microsoft.AspNetCore.Routing.ParameterPolicyActivator.ResolveParameterPolicy[T](IDictionary`2, IServiceProvider, String, String& )
at Microsoft.AspNetCore.Routing.DefaultParameterPolicyFactory.Create(RoutePatternParameterPart , String)
at Microsoft.AspNetCore.Routing.ParameterPolicyFactory.Create(RoutePatternParameterPart , RoutePatternParameterPolicyReference)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcherBuilder.DfaBuilderWorker.AddParentsWithMatchingLiteralConstraints(List`1, DfaNode, RoutePatternParameterPart, IReadOnlyList`1)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcherBuilder.DfaBuilderWorker.ProcessSegment(RouteEndpoint, List`1, List`1, RoutePatternPathSegment)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcherBuilder.DfaBuilderWorker.ProcessLevel(Int32)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcherBuilder.BuildDfaTree(Boolean )
at Microsoft.AspNetCore.Routing.Matching.DfaMatcherBuilder.Build()
at Microsoft.AspNetCore.Routing.Matching.DataSourceDependentMatcher.CreateMatcher(IReadOnlyList`1)
at Microsoft.AspNetCore.Routing.DataSourceDependentCache`1.Initialize()
at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& , Boolean&, Object& , Func`1)
at System.Threading.LazyInitializer.EnsureInitialized[T](T& , Boolean&, Object& , Func`1)
at Microsoft.AspNetCore.Routing.DataSourceDependentCache`1.EnsureInitialized()
at Microsoft.AspNetCore.Routing.Matching.DataSourceDependentMatcher..ctor(EndpointDataSource, Lifetime, Func`1)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcherFactory.CreateMatcher(EndpointDataSource)
at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.InitializeCoreAsync()
--- End of stack trace from previous location ---
at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.<Invoke>g__AwaitMatcher|8_0(EndpointRoutingMiddleware, HttpContext, Task`1)
at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1)
Screenshots
Environment
Mac OS 13.4.1 (22F82)
kind v0.18.0 go1.20.2 darwin/arm64
docker version
Client:
 Cloud integration: v1.0.35
 Version: 24.0.2
 API version: 1.43
 Go version: go1.20.4
 Git commit: cb74dfc
 Built: Thu May 25 21:51:16 2023
 OS/Arch: darwin/arm64
 Context: desktop-linux
Server: Docker Desktop 4.21.1 (114176)
 Engine:
  Version: 24.0.2
  API version: 1.43 (minimum version 1.12)
  Go version: go1.20.4
  Git commit: 659604f
  Built: Thu May 25 21:50:59 2023
  OS/Arch: linux/arm64
  Experimental: false
 containerd:
  Version: 1.6.21
  GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version: 1.1.7
  GitCommit: v1.1.7-0-g860f061
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
Additional context
Exposure
I dug a bit deeper, and it seems the resource constraints are too tight for adservice and cartservice. Increasing the CPU limit to 800m helps, but it still takes a long time (~2 minutes) before the service becomes reachable. Increasing the limit to a whole vCPU makes startup faster. I don't know why startup takes so many resources on kind.
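For anyone who wants to try the same workaround, here is a minimal sketch of raising the CPU limit with `kubectl set resources`. The helper name `raise_cpu_limit` and the `KUBECTL` override variable are assumptions made for illustration; the deployment names and the 800m / 1 vCPU values come from the comment above.

```shell
# Sketch: raise the CPU limit on a deployment in the current namespace.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

raise_cpu_limit() {
  # $1 = deployment name, $2 = CPU limit (for example 800m or 1)
  "$KUBECTL" set resources "deployment/$1" --limits="cpu=$2"
}

# Usage against a live cluster:
#   raise_cpu_limit adservice 1
#   raise_cpu_limit cartservice 1
#   kubectl rollout status deployment/adservice --timeout=180s
```

A change made this way only lives on the cluster; re-applying release/kubernetes-manifests.yaml would restore the original limits.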
Another important piece of information that I may have left out earlier: I'm running all of this on an Apple M2.
ghrr 🤦♂️ I just had time to dig into this some more, and it seems both the adservice and cartservice images are built from amd64 images. Of course they'll be slow and require more resources on an arm machine.
Can we build multi-arch images?
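One way to confirm an architecture mismatch like this is to compare the architecture recorded in the image with the host's. The sketch below is only illustrative: `normalize_arch` and `check_arch` are hypothetical helper names, and `IMAGE` in the usage comment is a placeholder for whichever image reference was pulled locally.

```shell
# Map `uname -m` style names onto the names Docker uses for image architectures.
normalize_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Report whether an image would run natively or under emulation.
# $1 = image architecture, $2 = host machine name (as printed by `uname -m`)
check_arch() {
  image_arch=$(normalize_arch "$1")
  host_arch=$(normalize_arch "$2")
  if [ "$image_arch" = "$host_arch" ]; then
    echo "native: $image_arch"
  else
    echo "emulated: $image_arch image on $host_arch host"
  fi
}

# Usage with a locally pulled image (IMAGE is a placeholder):
#   check_arch "$(docker image inspect --format '{{.Architecture}}' "$IMAGE")" "$(uname -m)"
```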
@elinesterov I am facing the same issue. Did you get any fix for this?
pod/adservice-76b59c7744-bblh9 0/1 Pending 0 11m
pod/cartservice-79ffddbfc9-7mklz 0/1 Pending 0 11m
pod/checkoutservice-67b7cc98bd-lt2wp 1/1 Running 0 11m
pod/currencyservice-86f65c677b-795r4 1/1 Running 0 11m
pod/emailservice-5b9b6b4978-wg67g 1/1 Running 0 11m
pod/frontend-758559b46-thtsn 1/1 Running 0 11m
pod/loadgenerator-7f5bb4f549-rpgcm 0/1 Pending 0 11m
pod/paymentservice-84d58ff866-lllzj 1/1 Running 0 11m
pod/productcatalogservice-65bd9bb7dd-rvrzk 1/1 Running 0 11m
pod/recommendationservice-5c5f746db6-227j9 1/1 Running 0 11m
pod/redis-cart-65f8cb8d5f-jsqwb 1/1 Running 0 11m
pod/shippingservice-b7b489f4b-vzd8t 1/1 Running 0 11m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/adservice ClusterIP 172.16.0.68
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/adservice 0/1 1 0 11m
deployment.apps/cartservice 0/1 1 0 11m
deployment.apps/checkoutservice 1/1 1 1 11m
deployment.apps/currencyservice 1/1 1 1 11m
deployment.apps/emailservice 1/1 1 1 11m
deployment.apps/frontend 1/1 1 1 11m
deployment.apps/loadgenerator 0/1 1 0 11m
deployment.apps/paymentservice 1/1 1 1 11m
deployment.apps/productcatalogservice 1/1 1 1 11m
deployment.apps/recommendationservice 1/1 1 1 11m
deployment.apps/redis-cart 1/1 1 1 11m
deployment.apps/shippingservice 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/adservice-76b59c7744 1 1 0 11m
replicaset.apps/cartservice-79ffddbfc9 1 1 0 11m
replicaset.apps/checkoutservice-67b7cc98bd 1 1 1 11m
replicaset.apps/currencyservice-86f65c677b 1 1 1 11m
replicaset.apps/emailservice-5b9b6b4978 1 1 1 11m
replicaset.apps/frontend-758559b46 1 1 1 11m
replicaset.apps/loadgenerator-7f5bb4f549 1 1 0 11m
replicaset.apps/paymentservice-84d58ff866 1 1 1 11m
replicaset.apps/productcatalogservice-65bd9bb7dd 1 1 1 11m
replicaset.apps/recommendationservice-5c5f746db6 1 1 1 11m
replicaset.apps/redis-cart-65f8cb8d5f 1 1 1 11m
replicaset.apps/shippingservice-b7b489f4b 1 1 1 11m
It reports the following message: "Reason Cannot schedule pods: No preemption victims found for incoming pod."
kind: "Event" message: "0/3 nodes are available: 3 Insufficient cpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod."
Which sounds strange, as the GKE nodes have a good number of CPUs.
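An "Insufficient cpu" event refers to the CPU requests already scheduled on each node, not to actual CPU usage, so it can help to total the requests and compare them with what the nodes can allocate. The following is a rough sketch; `to_millicores` and `sum_millicores` are hypothetical helper names, and the jsonpath query in the usage comment assumes the pods are in the current namespace.

```shell
# Convert a Kubernetes CPU quantity (for example 200m, 1, 0.5) to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;
  esac
}

# Sum CPU quantities read from stdin, one per line; blank lines are skipped.
sum_millicores() {
  total=0
  while read -r qty; do
    [ -n "$qty" ] || continue
    total=$(( total + $(to_millicores "$qty") ))
  done
  echo "$total"
}

# Usage against a live cluster:
#   kubectl get pods -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.resources.requests.cpu}{"\n"}{end}{end}' | sum_millicores
#   kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu
```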
Hi, my apologies for the delayed response. We currently do not consider kind clusters a possible deployment destination and do not validate the demo application on them. The resource constraints defined in the workload manifests aim to ensure correct operation of the demo and can conflict with the resources available on a kind cluster. Note that you can exclude the load generator from the deployment if you are using the kustomize configurations. If you run your kind cluster on a Mac, you might want to build the images for the Mac's architecture to reduce the overhead of running cross-architecture containers. Unfortunately, the project does not support this out of the box (#1448).
Fixed by changing the node machine type, thanks.
@minherz
"My apologies for the delayed response. We currently do not consider kind clusters a possible deployment destination and do not validate the demo application on them."
As you can see in my later comments, the problem is not the kind cluster but the architecture of the images. Raising the limits helps a bit, but there is definitely an issue with running an amd64 image of a C# application on an arm64 machine.
"If you run your kind cluster on a Mac, you might want to build the images for the Mac's architecture to reduce the overhead of running cross-architecture containers."
You cannot do this, because your Dockerfiles pin to amd64-only images.
For instance, FROM eclipse-temurin:19.0.1_10-jre-alpine@sha256:a75ea64f676041562cd7d3a54a9764bbfb357b2bf1bebf46e2af73e62d32e36c is clearly an amd64-only image.
You can use eclipse-temurin:19 or eclipse-temurin:19.0.1_10-jre (non-alpine) to build multi-arch images.
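Once the base image is multi-arch, a build along these lines might produce images for both platforms. This is a sketch rather than the project's build process: `build_multiarch`, the `DOCKER` override and the image reference in the usage comment are hypothetical, and the `src/adservice` build context is assumed from the repository layout.

```shell
# Sketch: build and push one image for both amd64 and arm64 with buildx.
# DOCKER can be overridden (for example DOCKER=echo) to preview the command.
DOCKER="${DOCKER:-docker}"
PLATFORMS="linux/amd64,linux/arm64"

build_multiarch() {
  # $1 = image reference to push, $2 = build context directory
  "$DOCKER" buildx build --platform "$PLATFORMS" --tag "$1" --push "$2"
}

# Usage (placeholder registry and tag):
#   docker buildx create --use
#   build_multiarch registry.example/adservice:dev src/adservice
```

Multi-platform builds generally need a builder that supports them, which is why the usage comment creates one first; the build will still fail for any platform the pinned base image does not provide.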
Closing as a duplicate of https://github.com/GoogleCloudPlatform/microservices-demo/issues/622