ksniff
The 'kubectl sniff' command exits with code 139 during execution. I need a root cause analysis for the failed packet capture so that a workaround can be identified.
In the same environment and the same Kubernetes cluster, kubectl sniff works for one pod and fails for another; the evidence is below. I cannot work out the root cause of exit code 139. Can anyone help with this and suggest a possible workaround?
Failure Scenario: For first pod
root@node1:/# kubectl krew version
OPTION VALUE
GitTag v0.4.4
GitCommit 343e657
IndexURI https://github.com/kubernetes-sigs/krew-index.git
BasePath /root/.krew
IndexPath /root/.krew/index/default
InstallPath /root/.krew/store
BinPath /root/.krew/bin
DetectedPlatform linux/amd64
root@node1:/# kubectl get pods -n Test-upf1
NAME READY STATUS RESTARTS AGE
upf-5896cf6b4c-9shws 3/3 Running 0 15d
root@node1:/# kubectl get pods/upf-5896cf6b4c-9shws -o jsonpath='{.spec.containers[*].name}' -n Test-upf1
upfsp upffp upfrsyslog
root@node1:/# kubectl sniff upf-5896cf6b4c-9shws -n Test-upf1 -o /tmp/upf.pcap
INFO[0000] using tcpdump path at: '/root/.krew/store/sniff/v1.6.2/static-tcpdump'
INFO[0000] no container specified, taking first container we found in pod.
INFO[0000] selected container: 'upfsp'
INFO[0000] sniffing method: upload static tcpdump
INFO[0000] sniffing on pod: 'upf-5896cf6b4c-9shws' [namespace: 'Test-upf1', container: 'upfsp', filter: '', interface: 'any']
INFO[0000] uploading static tcpdump binary from: '/root/.krew/store/sniff/v1.6.2/static-tcpdump' to: '/tmp/static-tcpdump'
INFO[0000] uploading file: '/root/.krew/store/sniff/v1.6.2/static-tcpdump' to '/tmp/static-tcpdump' on container: 'upfsp'
INFO[0000] executing command: '[/bin/sh -c test -f /tmp/static-tcpdump]' on container: 'upfsp', pod: 'upf-5896cf6b4c-9shws', namespace: 'Test-upf1'
INFO[0000] command: '[/bin/sh -c test -f /tmp/static-tcpdump]' executing successfully exitCode: '0', stdErr :''
INFO[0000] file found: ''
INFO[0000] file was already found on remote pod
INFO[0000] tcpdump uploaded successfully
INFO[0000] output file option specified, storing output in: '/tmp/upf.pcap'
INFO[0000] start sniffing on remote container
INFO[0000] executing command: '[/tmp/static-tcpdump -i any -U -w - ]' on container: 'upfsp', pod: 'upf-5896cf6b4c-9shws', namespace: 'Test-upf1'
INFO[0000] command: '[/tmp/static-tcpdump -i any -U -w - ]' executing successfully exitCode: '139', stdErr :''
INFO[0000] starting sniffer cleanup
INFO[0000] sniffer cleanup completed successfully
Error: executing sniffer failed, exit code: '139'
=========================================
Success Scenario: For other pods
root@node1:/# kubectl get pods -n Test-udm1
NAME READY STATUS RESTARTS AGE
udm-ee-79c897c869-9pt9r 2/2 Running 0 15d
udm-sdm-5d75ff8775-54lsf 2/2 Running 0 15d
udm-ueau-67944949f5-rwd82 2/2 Running 0 15d
udm-uecm-76fcf7c57-c8cbs 2/2 Running 0 15d
root@node1:/# kubectl get pods/udm-ueau-67944949f5-rwd82 -o jsonpath='{.spec.containers[*].name}' -n Test-udm1
udm-ueau istio-proxy
root@node1:/# kubectl sniff udm-ueau-67944949f5-rwd82 -n Test-udm1 -o /tmp/udm.pcap
INFO[0000] using tcpdump path at: '/root/.krew/store/sniff/v1.6.2/static-tcpdump'
INFO[0000] no container specified, taking first container we found in pod.
INFO[0000] selected container: 'udm-ueau'
INFO[0000] sniffing method: upload static tcpdump
INFO[0000] sniffing on pod: 'udm-ueau-67944949f5-rwd82' [namespace: 'Test-udm1', container: 'udm-ueau', filter: '', interface: 'any']
INFO[0000] uploading static tcpdump binary from: '/root/.krew/store/sniff/v1.6.2/static-tcpdump' to: '/tmp/static-tcpdump'
INFO[0000] uploading file: '/root/.krew/store/sniff/v1.6.2/static-tcpdump' to '/tmp/static-tcpdump' on container: 'udm-ueau'
INFO[0000] executing command: '[/bin/sh -c test -f /tmp/static-tcpdump]' on container: 'udm-ueau', pod: 'udm-ueau-67944949f5-rwd82', namespace: 'Test-udm1'
INFO[0000] command: '[/bin/sh -c test -f /tmp/static-tcpdump]' executing successfully exitCode: '0', stdErr :''
INFO[0000] file found: ''
INFO[0000] file was already found on remote pod
INFO[0000] tcpdump uploaded successfully
INFO[0000] output file option specified, storing output in: '/tmp/udm.pcap'
INFO[0000] start sniffing on remote container
INFO[0000] executing command: '[/tmp/static-tcpdump -i any -U -w - ]' on container: 'udm-ueau', pod: 'udm-ueau-67944949f5-rwd82', namespace: 'Test-udm1'
^C
root@node1:/#
I am getting the same issue with the nginx image. To debug it, I connected to the container and ran the same command that kubectl sniff uses:
/tmp/static-tcpdump -i any -U -w -
Segmentation fault (core dumped)
After that, I installed tcpdump in the container with apt update && apt install -y tcpdump, and that binary works:
tcpdump -i any -U -w -
tcpdump: data link type LINUX_SLL2
?ò?tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
I wonder whether the way static-tcpdump is compiled is causing this issue.
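For what it's worth, exit code 139 is consistent with that segmentation fault: shells conventionally report a process killed by a signal as 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on Linux. A minimal sketch of decoding such a status is below; decode_exit is a hypothetical helper name used only for illustration and is not part of ksniff.

```shell
# Hypothetical helper: interpret a shell exit status.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
decode_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # 'kill -l <number>' prints the signal name without the SIG prefix
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

decode_exit 139
```

So the 139 in the ksniff log and the "Segmentation fault (core dumped)" message seen inside the container are most likely the same crash reported in two ways.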
OK, so the issue happens when a pod has multiple containers. Thanks for the information; I will try again after installing tcpdump in the container.
As far as I remember, my pods have only one container each:
k get pods
NAME READY STATUS RESTARTS AGE
alpine-654bf79686-jbrjt 1/1 Running 0 24h
ksniff-j446x 1/1 Running 0 25h
ksniff-j6fv6 1/1 Running 0 25h
ksniff-vjmd4 1/1 Running 0 25h
nginx-77b4fdf86c-5qcc2 1/1 Running 0 25h
nginx-77b4fdf86c-5tcl8 1/1 Running 0 29h
nginx-77b4fdf86c-jhzdc 1/1 Running 0 25h
Then, when I run ksniff against any of the nginx pods, it fails:
k sniff nginx-77b4fdf86c-5qcc2 -n default
INFO[0000] using tcpdump path at: '/Users/scuevas/.krew/store/sniff/v1.6.2/static-tcpdump'
INFO[0000] no container specified, taking first container we found in pod.
INFO[0000] selected container: 'nginx'
INFO[0000] sniffing method: upload static tcpdump
INFO[0000] sniffing on pod: 'nginx-77b4fdf86c-5qcc2' [namespace: 'default', container: 'nginx', filter: '', interface: 'any']
INFO[0000] uploading static tcpdump binary from: '/Users/scuevas/.krew/store/sniff/v1.6.2/static-tcpdump' to: '/tmp/static-tcpdump'
INFO[0000] uploading file: '/Users/scuevas/.krew/store/sniff/v1.6.2/static-tcpdump' to '/tmp/static-tcpdump' on container: 'nginx'
INFO[0000] executing command: '[/bin/sh -c test -f /tmp/static-tcpdump]' on container: 'nginx', pod: 'nginx-77b4fdf86c-5qcc2', namespace: 'default'
INFO[0000] command: '[/bin/sh -c test -f /tmp/static-tcpdump]' executing successfully exitCode: '0', stdErr :''
INFO[0000] file found: ''
INFO[0000] file was already found on remote pod
INFO[0000] tcpdump uploaded successfully
INFO[0000] spawning wireshark!
INFO[0000] start sniffing on remote container
INFO[0000] executing command: '[/tmp/static-tcpdump -i any -U -w - ]' on container: 'nginx', pod: 'nginx-77b4fdf86c-5qcc2', namespace: 'default'
INFO[0001] command: '[/tmp/static-tcpdump -i any -U -w - ]' executing successfully exitCode: '139', stdErr :''
ERRO[0001] failed to start remote sniffing, stopping wireshark error="executing sniffer failed, exit code: '139'"
INFO[0001] starting sniffer cleanup
INFO[0001] sniffer cleanup completed successfully
Error: signal: killed
This does not happen with the alpine pod; Wireshark opened without any issue.
k sniff alpine-654bf79686-jbrjt -n default
INFO[0000] using tcpdump path at: '/Users/scuevas/.krew/store/sniff/v1.6.2/static-tcpdump'
INFO[0000] no container specified, taking first container we found in pod.
INFO[0000] selected container: 'alpine'
INFO[0000] sniffing method: upload static tcpdump
INFO[0000] sniffing on pod: 'alpine-654bf79686-jbrjt' [namespace: 'default', container: 'alpine', filter: '', interface: 'any']
INFO[0000] uploading static tcpdump binary from: '/Users/scuevas/.krew/store/sniff/v1.6.2/static-tcpdump' to: '/tmp/static-tcpdump'
INFO[0000] uploading file: '/Users/scuevas/.krew/store/sniff/v1.6.2/static-tcpdump' to '/tmp/static-tcpdump' on container: 'alpine'
INFO[0000] executing command: '[/bin/sh -c test -f /tmp/static-tcpdump]' on container: 'alpine', pod: 'alpine-654bf79686-jbrjt', namespace: 'default'
INFO[0000] command: '[/bin/sh -c test -f /tmp/static-tcpdump]' executing successfully exitCode: '0', stdErr :''
INFO[0000] file found: ''
INFO[0000] file was already found on remote pod
INFO[0000] tcpdump uploaded successfully
INFO[0000] spawning wireshark!
INFO[0000] start sniffing on remote container
INFO[0000] executing command: '[/tmp/static-tcpdump -i any -U -w - ]' on container: 'alpine', pod: 'alpine-654bf79686-jbrjt', namespace: 'default'
In case you want to reproduce this, here are the manifests for my test deployments:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  labels:
    app: alpine
  name: alpine
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: alpine
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: alpine
    spec:
      containers:
      - command:
        - sleep
        - infinity
        image: alpine
        imagePullPolicy: Always
        name: alpine
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
---
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  labels:
    app: nginx
  name: nginx
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
For anyone looking for a workaround, what worked for me was to rebuild (the latest version of) static tcpdump and use that instead of what's shipped with the plugin:
- Start an alpine container (or use docker instead of podman):
podman run -v $PWD:/out --rm -it alpine
- Install dependencies:
apk add --update alpine-sdk git libpcap libpcap-dev
- Clone ksniff:
cd /tmp; git clone https://github.com/eldadru/ksniff; cd ksniff
- Update the tcpdump version in the Makefile (e.g. set TCPDUMP_VERSION=4.99.4).
- Build it:
make static-tcpdump
- Copy the binary to the host system and exit the container:
cp static-tcpdump /out
- Overwrite static-tcpdump from the kubectl plugin:
cp static-tcpdump ~/.krew/store/sniff/<version>/
If the old version of static-tcpdump is present at /tmp/static-tcpdump in the pod container then you may need to remove it manually.
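The ksniff logs earlier in this thread print "file was already found on remote pod", which suggests the plugin skips the upload when /tmp/static-tcpdump already exists, so a stale binary would keep being used. A minimal sketch of removing it is below. The pod, namespace and container names are placeholders, remove_stale_tcpdump is a hypothetical helper name, and the KUBECTL variable exists only to allow a dry run.

```shell
# Hypothetical helper: delete a previously uploaded static-tcpdump from a
# container so that the next 'kubectl sniff' run uploads the rebuilt binary.
# Set KUBECTL=echo to print the kubectl arguments instead of running them.
remove_stale_tcpdump() {
    pod=$1
    namespace=$2
    container=$3
    ${KUBECTL:-kubectl} exec "$pod" -n "$namespace" -c "$container" -- \
        rm -f /tmp/static-tcpdump
}

# Example call (names taken from the nginx test deployment in this thread):
# remove_stale_tcpdump nginx-77b4fdf86c-5qcc2 default nginx
```

Using rm -f keeps the call harmless when the file is not there, so it can be run against every container you plan to sniff.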
It works, thanks
Workaround for Debian-based containers:
kubectl exec -i -t your-pod -- bash -c "apt update && apt install tcpdump -y && \
rm /tmp/static-tcpdump && \
ln /bin/tcpdump /tmp/static-tcpdump"