csi-driver-smb
Mangled HTTP output for static files on mountpoint using SMB CSI
What happened:
On a k3s cluster, I serve static files with the httpd image from a volume mounted through the SMB CSI driver.
All static file requests return mangled responses: the HTTP response is truncated. The autogenerated directory indices work fine.
This seems to be specific to using httpd and the SMB CSI driver together.
Here is the output of trying to fetch /README.md (not working):
curl of `/README.md`
curl -iv --raw http://localhost:8080/README.md
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /README.md HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.84.0
> Accept: */*
>
* Received HTTP/0.9 when not allowed
* Closing connection 0
curl: (1) Received HTTP/0.9 when not allowed
Adding the --http0.9 flag reveals the rest of the malformed body.
curl of `/README.md` with --http0.9
curl --http0.9 -iv --raw http://localhost:8080/README.md --output -
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /README.md HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.84.0
> Accept: */*
>
2:56:26 GMT
ETag: "29-5e387146f7c14"
Accept-Ranges: bytes
Content-Length: 41
# Start of my readme
This is a README.md
DHnQQFid4��,A�u��6�
* Closing connection 0
��
What you expected to happen:
Expected result from curl (Seen with nginx image):
curl of `/README.md` using nginx image
curl -iv --raw http://localhost:8080/README.md --output -
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /README.md HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.84.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: nginx/1.23.0
Server: nginx/1.23.0
< Date: Mon, 11 Jul 2022 13:10:53 GMT
Date: Mon, 11 Jul 2022 13:10:53 GMT
< Content-Type: application/octet-stream
Content-Type: application/octet-stream
< Content-Length: 41
Content-Length: 41
< Last-Modified: Mon, 11 Jul 2022 12:56:26 GMT
Last-Modified: Mon, 11 Jul 2022 12:56:26 GMT
< Connection: keep-alive
Connection: keep-alive
< ETag: "62cc1dfa-29"
ETag: "62cc1dfa-29"
< Accept-Ranges: bytes
Accept-Ranges: bytes
<
# Start of my readme
This is a README.md
* Connection #0 to host localhost left intact
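The difference between the failing and the working capture can be stated mechanically: the working response begins with an HTTP/1.x status line, while the broken one begins partway through a header value, which is why curl treats it as HTTP/0.9. The following is a minimal sketch of that check, not part of curl or httpd; the function name `has_status_line` is hypothetical and the two byte strings are abbreviated from the captures above.

```python
import re

# An HTTP/1.x response must start with a status line such as "HTTP/1.1 200 OK".
STATUS_LINE = re.compile(rb"^HTTP/\d\.\d \d{3}(?: .*)?\r?\n")


def has_status_line(raw: bytes) -> bool:
    """Return True if the raw response bytes begin with an HTTP/1.x status line."""
    return STATUS_LINE.match(raw) is not None


# Abbreviated from the nginx capture: starts with a proper status line.
good = b"HTTP/1.1 200 OK\r\nServer: nginx/1.23.0\r\nContent-Length: 41\r\n\r\n"

# Abbreviated from the httpd capture: the stream starts mid-header.
bad = b'2:56:26 GMT\r\nETag: "29-5e387146f7c14"\r\nContent-Length: 41\r\n\r\n'

if __name__ == "__main__":
    print(has_status_line(good))
    print(has_status_line(bad))
```

A response that fails this check has lost its leading bytes before reaching the client, which matches the truncated output shown for httpd.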
How to reproduce it:
- Use K3s and the SMB CSI driver
- Copy some files into the SMB mountpoint
- Set up a simple httpd deployment that uses the SMB volume
- Port-forward the deployment and curl it or visit it in a browser
Deployment and files:
Anything else we need to know?:
- Here is the output of trying to fetch `/` (works OK)
curl of `/`
curl --http0.9 -iv --raw http://localhost:8080/ --output -
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.84.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Date: Mon, 11 Jul 2022 13:37:43 GMT
Date: Mon, 11 Jul 2022 13:37:43 GMT
< Server: Apache/2.4.54 (Unix)
Server: Apache/2.4.54 (Unix)
< Content-Length: 291
Content-Length: 291
< Content-Type: text/html;charset=ISO-8859-1
Content-Type: text/html;charset=ISO-8859-1
<
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<title>Index of /</title>
</head>
<body>
<h1>Index of /</h1>
<ul><li><a href="Formats/"> Formats/</a></li>
<li><a href="README.md"> README.md</a></li>
<li><a href="le_books/"> le_books/</a></li>
</ul>
</body></html>
* Connection #0 to host localhost left intact
- `nginx` works, but `httpd` doesn't.
- Using Docker, the same issue doesn't occur (whether I use the native filesystem or mount an SMB share and use that as a volume).
- Related issues
- https://serverfault.com/questions/1090235/unable-to-serve-images-from-wordpress-site-response-is-malformed - Seemed to be a very similar case but using AKS.
- docker-library/httpd#220
Environment:
- CSI Driver version: v1.8.0
- Kubernetes version (use `kubectl version`):
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"archive", BuildDate:"2022-06-18T07:33:51Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.8+k3s1", GitCommit:"53f2d4e7d80c09a7db1858e3f4e7ddfa13256c45", GitTreeState:"clean", BuildDate:"2022-06-27T21:49:50Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/arm64"}
- OS (e.g. from /etc/os-release): Oracle Linux 7.9
- Kernel (e.g. `uname -a`):
  - Master node - Linux instance-20220711-0859 5.4.17-2136.308.9.el7uek.aarch64 #2 SMP Mon Jun 13 20:58:59 PDT 2022 aarch64 aarch64 aarch64 GNU/Linux
  - Worker node - Linux instance-20220711-0902 5.4.17-2136.308.9.el7uek.x86_64 #2 SMP Mon Jun 13 20:40:51 PDT 2022 x86_64 x86_64 x86_64 GNU/Linux
- Install tools: k3s
- Others:
- k3s version v1.23.8+k3s1 (53f2d4e7), go version go1.17.5
- Cluster: 1 master node on arm64 and 1 worker on amd64.
`nginx` and `httpd` pods are scheduled on the worker node.
It seems to be an issue with mmap.
A workaround is to add the following directive to /etc/apache2/apache2.conf:
EnableMMAP off
Source: https://stackoverflow.com/questions/65092742/how-do-i-debug-broken-response-headers-in-apache
Is it expected behaviour that mmap causes such issues? It does not occur when using SMB-mapped volumes in Docker.
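For anyone who wants to narrow this down, here is a rough diagnostic sketch (not something verified in this thread) that reads the same file once with ordinary `read()` and once through a memory mapping, then compares the bytes. The function names are hypothetical, and this only approximates what httpd does internally when it uses mmap.

```python
import mmap
import os
import tempfile


def read_with_read(path: str) -> bytes:
    """Copy the file contents using ordinary read() calls."""
    with open(path, "rb") as f:
        return f.read()


def read_with_mmap(path: str) -> bytes:
    """Map the file into memory and copy the bytes out of the mapping."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return b""  # an empty file cannot be memory-mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


def same_bytes(path: str) -> bool:
    """Return True if both access methods yield identical content."""
    return read_with_read(path) == read_with_mmap(path)


if __name__ == "__main__":
    # Demo on a local temporary file; point it at a file on the SMB volume instead.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"# Start of my readme\nThis is a README.md\n")
    print(same_bytes(tmp.name))
    os.unlink(tmp.name)
```

Running `same_bytes` inside the pod against a file under the SMB mount might show whether mapped reads differ from plain reads there. A match would not rule out mmap as the cause, since httpd's own code path may differ from this sketch.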
So is this an OS configuration issue on the agent node?
It seems I'm having the same issue. I can reproduce this with the Bitnami WordPress Helm chart. If I use persistence for the WordPress image with this SMB CSI driver, WordPress is unable to serve any images. Everything else seems to work. It's not permissions: it can read and write fine. I can upload images, and they are intact on the PVC. It's just that the Bitnami Apache container will not serve images or downloads if they reside on a PVC created by this CSI driver. It returns an Error 400. I can deploy the same chart using any other CSI driver, and it works fine. I haven't been able to try adjusting the conf file to disable MMAP yet.
Note: if you're testing this, do not try to use SMB for persistence of the mariadb image (or any other database, for that matter). That does not work with the SMB or NFS CSI drivers.
I added a configmap with my custom httpd.conf with MMAP disabled, and all images/files are now being served correctly. As it states in the httpd.conf comments:
# EnableMMAP and EnableSendfile: On systems that support it,
# memory-mapping or the sendfile syscall may be used to deliver
# files. This usually improves server performance, but must
# be turned off when serving from networked-mounted
# filesystems or if support for these functions is otherwise
# broken on your system.
# Defaults: EnableMMAP On, EnableSendfile Off
#
EnableMMAP off
#EnableSendfile on
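If you maintain the custom httpd.conf by script before putting it into a ConfigMap, the edit can be automated. The following is a minimal sketch under my own assumptions: the helper name `disable_mmap` is hypothetical, it only handles plain `EnableMMAP on|off` lines (active or commented out), and it assumes LF line endings.

```python
import re

# Matches an active or commented-out "EnableMMAP on|off" line, but not prose
# comments such as "# EnableMMAP and EnableSendfile: ...".
DIRECTIVE = re.compile(
    r"^[ \t]*#?[ \t]*EnableMMAP[ \t]+(?:on|off)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def disable_mmap(conf_text: str) -> str:
    """Return conf_text with every EnableMMAP on/off line set to 'EnableMMAP off'.

    If no such line exists, the directive is appended at the end.
    """
    replaced, count = DIRECTIVE.subn("EnableMMAP off", conf_text)
    if count:
        return replaced
    # No directive found: append one, adding a newline separator if needed.
    sep = "\n" if conf_text and not conf_text.endswith("\n") else ""
    return conf_text + sep + "EnableMMAP off\n"


if __name__ == "__main__":
    sample = "# Defaults: EnableMMAP On, EnableSendfile Off\n#\nEnableMMAP on\n#EnableSendfile on\n"
    print(disable_mmap(sample))
```

The result can then be mounted over the image's httpd.conf, as described in the comment above.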
> So is this an OS configuration issue on the agent node?
I don't think there is any issue with the host OS here, on any of the nodes.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".