
[IMPROVEMENT] Dump NFS ganesha logs to pod stdout


Is your feature request related to a problem? Please describe.
Currently, ganesha dumps its logs to the /tmp/ganesha.log file inside the share-manager container. It can be difficult to retrieve the log file if the pod goes into a crash loop.

Describe the solution you'd like
Be able to see the ganesha log when using kubectl logs.

Describe alternatives you've considered
None.

Additional context
https://github.com/longhorn/longhorn/issues/2284
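
For illustration, the difference in how the log is retrieved (the pod name is a placeholder; share-manager pods run in the longhorn-system namespace):

# Today: the log only exists as a file inside the container, so it is
# hard to reach once the container crash-loops.
kubectl -n longhorn-system exec share-manager-<volume-name> -- cat /tmp/ganesha.log

# Requested: the log goes to stdout, so kubectl can fetch it, even from
# the previous (crashed) container instance.
kubectl -n longhorn-system logs share-manager-<volume-name>
kubectl -n longhorn-system logs --previous share-manager-<volume-name>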

c3y1huang avatar Mar 19 '21 05:03 c3y1huang

This will help us enrich the support bundle with more useful logs.

innobead avatar Mar 19 '21 07:03 innobead

To implement graceful shutdown, would it be right to output the log to stdout in the manager's shutdown function?

https://github.com/longhorn/longhorn-share-manager/pkg/server/share_manager.go

weizhe0422 avatar Apr 11 '22 11:04 weizhe0422

@weizhe0422 Remember to update the Zenhub pipeline when working on an issue. Right now, it's time to move this to ready-for-testing and provide the testing steps in the ready-for-testing checklist comment, which is created when the issue is moved to the review pipeline. Check all items in the checklist, but skip any that do not apply to this task.


innobead avatar Aug 23 '22 10:08 innobead

Pre Ready-For-Testing Checklist

  • [x] Where are the reproduce steps/test steps documented? The reproduce steps/test steps are at:

    1. Create an RWX volume and attach it to one of the nodes.
    2. Create a PV/PVC for the volume from step 1.
    3. Deploy a pod and assign it the PVC name from step 2.
    4. Run kubectl get pods -n <your_namespace>; a share-manager pod named like share-manager-<Longhorn_volume_name> will be created.
    5. Run kubectl logs share-manager-<Longhorn_volume_name> -n <your_namespace>.
    6. The nfs-ganesha log will show up in the console, and it also appears in the support bundle.
  • [x] Does the PR include the explanation for the fix or the feature? https://github.com/longhorn/longhorn-share-manager/pull/38#issue-1338698544

    1. Add a LOG block to defaultConfig.
    2. Remove the defaultLogFile path parameter.
    3. Output the nfs-ganesha log to stdout when launched by longhorn-share-manager; otherwise it is recorded in /tmp/ganesha.log.
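
For reference, a minimal sketch of what such a LOG block could look like, written in NFS-Ganesha's documented log-config syntax (illustrative only; the exact block added by the PR may differ):

LOG {
    Default_Log_Level = INFO;

    # A FILE facility simply opens the given path, so pointing its
    # destination at /dev/stdout sends the log to the container's stdout.
    Facility {
        name = FILE;
        destination = "/dev/stdout";
        enable = active;
    }
}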

longhorn-io-github-bot avatar Aug 23 '22 10:08 longhorn-io-github-bot

Verified on master-head 20220913

The test steps

  1. Install Longhorn master. Ref. (install-with-kubectl); see the command sketch after these steps.
  2. Create an RWX volume and attach it to a pod. Deploy the pod with kubectl apply -f pod_with_pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc
  3. Check pod volume-test and share-manager-pvc-9174f488-1425-421f-9017-c45afc178000 status
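
Step 1 above refers to the standard kubectl-based install; presumably the master-branch manifest, something like:

# Install Longhorn from the master-branch manifest (install-with-kubectl)
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml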

Result: Failed
longhorn-support-bundle_06960eb3-f5c9-490f-9f38-0c35e4e9ddba_2022-09-13T12-04-23Z.zip

  1. The share-manager-pvc and volume-test pods failed to reach Running. Screenshot_20220913_200447

cc @weizhe0422

roger-ryao avatar Sep 13 '22 12:09 roger-ryao

I tried deploying with longhorn.yaml from the master branch, and the share-manager starts without restarting.


weizhe0422 avatar Sep 13 '22 14:09 weizhe0422

Verified on master-head 20220914

The test steps

  1. Install Longhorn master. Ref. (install-with-kubectl)
  2. Create an RWX volume and attach it to a pod. Deploy the pod with kubectl apply -f pod_with_pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc
  3. Check pod volume-test and share-manager-pvc-9c670e3f-1c35-4734-b788-84f320029ea0 status

Result: Failed
longhorn-support-bundle_a85ccb2d-8613-45b7-8154-0e216d166f02_2022-09-14T04-39-36Z.zip

  1. We could not find the nfs-ganesha log in the console or in the support bundle. The NFS ganesha log can still be found in the pod's /tmp/ganesha.log. Screenshot_20220914_143203 Screenshot_20220914_143632

cc @weizhe0422

roger-ryao avatar Sep 14 '22 06:09 roger-ryao

Upon closer inspection, I found that my first fix was incomplete, so I sent a second PR to fix it; I believe that fix was not included in v1_20220825.


I deployed and verified the master-head version, and it works correctly.


weizhe0422 avatar Sep 14 '22 07:09 weizhe0422

@weizhe0422 For any change to an individual component like the share manager, instance manager, or backing image manager, please remember to ask for a new image build, then update the related manifests in the longhorn and longhorn-manager repos with the new image.

I just triggered https://github.com/longhorn/longhorn-share-manager/releases/tag/v1_20220914, so when the image is ready (you can check drone-push), please create PRs to update the manifests.

innobead avatar Sep 14 '22 08:09 innobead

Self-test: longhornio/longhorn-share-manager:v1_20220914 works in my environment.

weizhe0422 avatar Sep 14 '22 12:09 weizhe0422

Verified on master-head 20220915

The test steps

  1. Install Longhorn master. Ref. (install-with-kubectl)
  2. Create five RWX volumes: kubectl apply -f create_5vol_rwx.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol-0
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol-1
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol-2
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi      
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol-3
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol-4
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 0.5Gi      
  3. Attach the volumes to a pod. Deploy pod "mypod1" with kubectl apply -f pod_mount_5vol.yaml:
kind: Pod
apiVersion: v1
metadata:
  name: mypod1
  namespace: default
spec:
  containers:
    - name: testfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/data0/"
        name: vol-0
      - mountPath: "/data1/"
        name: vol-1
      - mountPath: "/data2/"
        name: vol-2
      - mountPath: "/data3/"
        name: vol-3
      - mountPath: "/data4/"
        name: vol-4                                
  volumes:
    - name: vol-0
      persistentVolumeClaim:
        claimName: vol-0
    - name: vol-1
      persistentVolumeClaim:
        claimName: vol-1        
    - name: vol-2
      persistentVolumeClaim:
        claimName: vol-2
    - name: vol-3
      persistentVolumeClaim:
        claimName: vol-3
    - name: vol-4
      persistentVolumeClaim:
        claimName: vol-4      
  4. Check pod mypod1 and share-manager-pvc-XXXXX status

Result: Passed

  1. After executing kubectl -n longhorn-system logs share-manager-<Longhorn_volume_name> or kubetail share-manager | grep nfs-ganesha, we can find the nfs-ganesha log in the console and in the support bundle.

Screenshot_20220915_104745 Screenshot_20220915_104550

longhorn-support-bundle_ba58f018-f125-4e45-82aa-2ff2eb64b58b_2022-09-15T02-43-32Z.zip

roger-ryao avatar Sep 15 '22 03:09 roger-ryao