
CPU and memory increasing linearly

Open rpradeepam opened this issue 4 years ago • 10 comments

We are running a cluster on EKS, Kubernetes version 1.18. We tried rkubelog since we were already using Papertrail before we moved to K8s, and it was a breeze to set up. We are using rkubelog:r17.

We recently noticed that rkubelog was using a large amount of CPU, after the T3 instance it was running on had been at 0 CPU credits for a long time. After restarting the deployment, we saw the usage pattern below:

[Image: rkubelog-metrics]

The previous pod was using close to 1000 mCPUs before it was restarted.

Is this pattern expected, or have I misconfigured something?

Could be related to boz/kail#18, though it's an old issue.

rpradeepam avatar Feb 09 '21 04:02 rpradeepam

hello @rpradeepam,

Thx for reporting this issue.

Can you please share the generated YAML you used to deploy rkubelog? (Please remember to hide any secrets in the generated config.)

girishranganathan avatar Feb 09 '21 16:02 girishranganathan

I have stopped using rkubelog for now and moved to a fluentd-based solution, which was much more tedious to set up.

Here is the YAML. It would be nice if rkubelog could run stably. It's such a convenient solution.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "8"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"rkubelog"},"name":"rkubelog","namespace":"kube-system"},"spec":{"replicas":1,"selector":{"matchLabels":{"app":"rkubelog"}},"template":{"metadata":{"labels":{"app":"rkubelog","kail.ignore":"true"}},"spec":{"containers":[{"args":["--ns","qa","--ns","staging","--ns","prod"],"command":["/app/rkubelog"],"env":[{"name":"PAPERTRAIL_PROTOCOL","valueFrom":{"secretKeyRef":{"key":"PAPERTRAIL_PROTOCOL","name":"logging-secret"}}},{"name":"PAPERTRAIL_HOST","valueFrom":{"secretKeyRef":{"key":"PAPERTRAIL_HOST","name":"logging-secret"}}},{"name":"PAPERTRAIL_PORT","valueFrom":{"secretKeyRef":{"key":"PAPERTRAIL_PORT","name":"logging-secret"}}},{"name":"LOGGLY_TOKEN","valueFrom":{"secretKeyRef":{"key":"LOGGLY_TOKEN","name":"logging-secret"}}}],"image":"quay.io/solarwinds/rkubelog:r17","imagePullPolicy":"Always","name":"rkubelog"}],"serviceAccountName":"rkubelog-sa"}}}}
  creationTimestamp: "2020-12-16T07:06:03Z"
  generation: 11
  labels:
    app: rkubelog
  name: rkubelog
  namespace: kube-system
  resourceVersion: "66448575"
  selfLink: /apis/apps/v1/namespaces/kube-system/deployments/rkubelog
  uid: cb2de357-36f0-40bb-b463-16ee446caea6
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: rkubelog
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rkubelog
        kail.ignore: "true"
    spec:
      containers:
      - args:
        - --ns
        - qa
        - --ns
        - staging
        - --ns
        - prod
        command:
        - /app/rkubelog
        env:
        - name: PAPERTRAIL_PROTOCOL
          valueFrom:
            secretKeyRef:
              key: PAPERTRAIL_PROTOCOL
              name: logging-secret
        - name: PAPERTRAIL_HOST
          valueFrom:
            secretKeyRef:
              key: PAPERTRAIL_HOST
              name: logging-secret
        - name: PAPERTRAIL_PORT
          valueFrom:
            secretKeyRef:
              key: PAPERTRAIL_PORT
              name: logging-secret
        - name: LOGGLY_TOKEN
          valueFrom:
            secretKeyRef:
              key: LOGGLY_TOKEN
              name: logging-secret
        image: quay.io/solarwinds/rkubelog:r17
        imagePullPolicy: Always
        name: rkubelog
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      nodeSelector:
        profile: operations
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: rkubelog-sa
      serviceAccountName: rkubelog-sa
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2020-12-16T07:06:03Z"
    lastUpdateTime: "2021-02-08T13:35:55Z"
    message: ReplicaSet "rkubelog-f6f6957b5" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-02-09T15:19:57Z"
    lastUpdateTime: "2021-02-09T15:19:57Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 11
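
Note that the container above is deployed with `resources: {}`, so nothing bounds how much CPU or memory it may consume on the node. As a stopgap while the growth itself is investigated, explicit requests and limits could be added to the container. The fragment below is only a sketch: the numbers are illustrative assumptions, not values recommended anywhere in this thread, and should be tuned to the usage actually observed.

```yaml
# Hypothetical replacement for the `resources: {}` line of the rkubelog container.
# All values are assumptions for illustration; adjust to observed steady-state usage.
resources:
  requests:
    cpu: 50m        # scheduling guarantee
    memory: 64Mi
  limits:
    cpu: 200m       # usage above this is throttled
    memory: 256Mi   # exceeding this gets the container OOM-killed and restarted
```

A limit does not fix a leak. It only keeps the pod from starving its neighbours, at the cost of throttling or periodic restarts.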

rpradeepam avatar Feb 10 '21 17:02 rpradeepam

@rpradeepam, thx for sharing your config. Can you please tell me if you are using both Papertrail and Loggly for shipping your logs?

girishranganathan avatar Feb 10 '21 17:02 girishranganathan

@girishranganathan we are using only Papertrail. LOGGLY_TOKEN is set to ''.

rpradeepam avatar Feb 11 '21 03:02 rpradeepam

In that case, can you please try the latest version with the r18 image tag? You just have to switch the image tag to :r18; the rest of your existing YAML should work as is. Also, if you would, please remove this section altogether:

        - name: LOGGLY_TOKEN
          valueFrom:
            secretKeyRef:
              key: LOGGLY_TOKEN
              name: logging-secret
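
Put together, the two changes would leave the container section looking roughly like the sketch below. It assumes every other field stays exactly as in the deployment shared earlier in this thread; only the image tag and the env list differ.

```yaml
# Container section of the rkubelog Deployment after the suggested changes:
# image tag switched from r17 to r18, LOGGLY_TOKEN entry removed from env.
containers:
- args:
  - --ns
  - qa
  - --ns
  - staging
  - --ns
  - prod
  command:
  - /app/rkubelog
  env:
  - name: PAPERTRAIL_PROTOCOL
    valueFrom:
      secretKeyRef:
        key: PAPERTRAIL_PROTOCOL
        name: logging-secret
  - name: PAPERTRAIL_HOST
    valueFrom:
      secretKeyRef:
        key: PAPERTRAIL_HOST
        name: logging-secret
  - name: PAPERTRAIL_PORT
    valueFrom:
      secretKeyRef:
        key: PAPERTRAIL_PORT
        name: logging-secret
  image: quay.io/solarwinds/rkubelog:r18
  imagePullPolicy: Always
  name: rkubelog
```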

girishranganathan avatar Feb 11 '21 17:02 girishranganathan

Tried the above-mentioned changes. It seems to be following the same pattern.

[Image: Screen Shot 2021-02-13 at 11 50 52 AM]

Also, it would be nice if the host and program were reported properly in Papertrail. r17 was reporting the host as <namespace>/<name>; that has now moved into the log message itself.

rpradeepam avatar Feb 13 '21 06:02 rpradeepam

@rpradeepam, can you please take this image for a spin: quay.io/solarwinds/rkubelog:r19-b2? It should, hopefully, consume less memory than the image you are using.

girishranganathan avatar Mar 02 '21 00:03 girishranganathan

Hi, thank you for the fix. Seems to be working fine now. CPU/memory values look stable.

rpradeepam avatar Mar 02 '21 15:03 rpradeepam

@girishranganathan, is the fix you mentioned in https://github.com/solarwinds/rkubelog/issues/21#issuecomment-788457196 included in the current master? Which commit contains the fix?

jtomaszewski avatar Jul 15 '21 09:07 jtomaszewski

I'm having the same issue even after using quay.io/solarwinds/rkubelog:r19-b2, but in my case, when I turn on the rkubelog deployment, my entire cluster's CPU goes through the roof.

BEFORE running the deployment: [Image: before]

A few minutes AFTER: [Image: after]

Leaving this on would eventually consume the CPU resources of the entire cluster.

It seems as if the API server starts consuming a lot of CPU.

Kubernetes version 1.18.19

RicardoPineda avatar Sep 10 '21 16:09 RicardoPineda