Rootless podman 5.2 with pasta now publishes processes which only listen on 127.0.0.1 in the container
Issue Description
With previous rootless podman setups, having a process listen on 127.0.0.1 in the container and publishing that port to the host did not expose that process to the host. Or rather, while a connection could be made, it was killed right away (`Connection reset by peer` when tested with curl). This was very similar to the rootful podman behaviour (`Couldn't connect to server`).
With podman-5.2.2-1.fc40.x86_64 and passt-0^20240906.g6b38f07-1.fc40.x86_64, I see a change of behaviour -- the process in the container is reachable on the published port on the host even if the process in the container is supposed to listen only on 127.0.0.1.
Steps to reproduce the issue
- Have a Dockerfile to get us a server where we can easily control where it listens:
FROM registry.fedoraproject.org/fedora
RUN dnf install -y python3-django
RUN django-admin startproject mysite
WORKDIR /mysite
ENTRYPOINT [ "python3", "manage.py", "runserver" ]
- `podman build -t localhost/django .`
- `podman rm -f django ; podman run --name django -d -p 8000:8000 localhost/django 127.0.0.1:8000`
- `curl -s http://127.0.0.1:8000/ | head`
- In case the above `curl` does not show anything, `curl http://127.0.0.1:8000/`
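To double-check that the server in the container really is bound only to loopback, something like the following can help. This is a sketch; it assumes the image contains `ss` from the iproute package, which the Dockerfile above does not install by default:

podman exec django ss -tln
# expect a line like: LISTEN 0 10 127.0.0.1:8000 0.0.0.0:*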
Describe the results you received
With rootless podman-5.2.2-1.fc40.x86_64 with passt-0^20240906.g6b38f07-1.fc40.x86_64 I see
<!doctype html>
<html lang="en-us" dir="ltr">
<head>
<meta charset="utf-8">
<title>The install worked successfully! Congratulations!</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
html {
Describe the results you expected
With rootless podman-4.9.4-1.fc39.x86_64 with rootlessport I see
curl: (56) Recv failure: Connection reset by peer
With rootful setup, both podman-4.9.4-1.fc39.x86_64 and podman-5.2.2-1.fc40.x86_64, I get
curl: (7) Failed to connect to 127.0.0.1 port 8000 after 0 ms: Couldn't connect to server
podman info output
host:
  arch: amd64
  buildahVersion: 1.37.2
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-2.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 97.84
    systemPercent: 0.53
    userPercent: 1.63
  cpus: 2
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    version: "40"
  eventLogger: journald
  freeLocks: 2047
  hostname: redacted.domain.com
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.10.10-200.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1263239168
  memTotal: 3036377088
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-2.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240906.g6b38f07-1.fc40.x86_64
    version: |
      pasta 0^20240906.g6b38f07-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
      <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 3035623424
  swapTotal: 3035623424
  uptime: 0h 36m 13.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/test/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/test/.local/share/containers/storage
  graphRootAllocated: 16039018496
  graphRootUsed: 2594983936
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 6
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/test/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1724198400
  BuiltTime: Wed Aug 21 02:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.6
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.2
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
No
Additional environment details
Tested with stock Fedora packages.
Additional information
Deterministic on a fresh Fedora server installation.
@sbrivio-rh @dgibson PTAL
With podman-5.2.2-1.fc40.x86_64 and passt-0^20240906.g6b38f07-1.fc40.x86_64, I see a change of behaviour -- the process in the container is reachable on the published port on the host even if the process in the container is supposed to listen only on 127.0.0.1.
If the process in the container listens on 127.0.0.1, it will only be accessible via 127.0.0.1 on the loopback interface, and not via the external interface:
$ (sleep 1; : | nc 127.0.0.1 1337) & ./pasta --config-net -- sh -c '( socat TCP-LISTEN:1337,bind=127.0.0.1 STDOUT & tshark -Pi lo)'
[2] 2036258
Running as user "root" and group "root". This could be dangerous.
Capturing on 'Loopback: lo'
** (tshark:2) 13:38:02.726935 [Main MESSAGE] -- Capture started.
** (tshark:2) 13:38:02.726976 [Main MESSAGE] -- File: "/tmp/wireshark_loDXLVU2.pcapng"
2024/09/24 13:38:03 socat[3] W address is opened in read-write mode but only supports write-only
1 0.000000000 127.0.0.1 → 127.0.0.1 TCP 74 57870 → 1337 [SYN] Seq=0 Win=65535 Len=0 MSS=65495 SACK_PERM TSval=358694077 TSecr=0 WS=4096
2 0.000006130 127.0.0.1 → 127.0.0.1 TCP 74 1337 → 57870 [SYN, ACK] Seq=0 Ack=1 Win=65483 Len=0 MSS=65495 SACK_PERM TSval=358694077 TSecr=358694077 WS=4096
3 0.000013970 127.0.0.1 → 127.0.0.1 TCP 66 57870 → 1337 [ACK] Seq=1 Ack=1 Win=65536 Len=0 TSval=358694077 TSecr=358694077
and this is the expected behaviour for pasta, because it handles both the loopback path in the container (same as the rootlesskit port forwarder) and the non-loopback path (same as the slirp4netns port forwarder).
So, if no address is given explicitly for --publish / -p, pasta binds the port to any address (which looks rather convenient and actually correct to me), and then picks the appropriate container interface depending on where the packet comes from.
But I see now that rootlessport and slirp4netns don't actually map host loopback traffic, so this is surely inconsistent.
I thought that since rootlessport uses 127.0.0.1 and ::1 as source addresses, it would also map the connections that should have those as source addresses, but no, it binds only all the other ones.
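For reference, the host-side bind address can already be restricted explicitly with the standard --publish syntax. A sketch against the reproducer image above:

# publish only on the host's loopback address instead of on all addresses
podman run --name django -d -p 127.0.0.1:8000:8000 localhost/django 127.0.0.1:8000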
I guess we have four options:
- document this inconsistency in both Podman and pasta documentation. Easy, but it's still inconsistent (especially compared to containers started as root)
- document this inconsistency and change the behaviour for containers started as root, so that the inconsistency will become less relevant over time. I'm not sure how the bridge thing works and if it's possible to change this though
- also accept IP address exclusions in pasta's `--tcp-ports / -t` and `--udp-ports / -u` options (not just port exclusion), and, from Podman, exclude loopback addresses by default, so that users can override this behaviour by explicitly listening to 0.0.0.0. This is slightly complicated by the fact that IPv4 has a whole /8 subnet of loopback addresses, so it's not feasible to implement this correctly, strictly speaking: pasta would bind to all available local addresses, without binding to loopback addresses at all if just one is excluded. If you have 192.0.2.1 and 198.51.100.1 configured on the host, and you exclude 127.0.0.2, only 192.0.2.1 and 198.51.100.1 will be mapped, but not the rest of 127.0.0.0/8
- implement a completely different option for loopback mappings from the host, say, `--loopback-tcp-ports`, and `-t` wouldn't imply it. Relatively easy, but it's very impractical for users who already got used to the fact that `-t` magically works for both local host and remote hosts
What do you all think?
I thought that since rootlessport uses 127.0.0.1 and ::1 as source addresses, it would also map the connections that should have those as source addresses, but no, it binds only all the other ones.
There was CVE-2021-20199 about that, as some applications somehow trust localhost (even though this is not secure at all, since all users can access localhost), so yeah, since then we always make sure the source IP is not 127.0.0.1. Of course in that case it was bad because even remote connections appeared as 127.0.0.1. For pasta, if only 127.0.0.1 on the host maps to 127.0.0.1 in the container, then this is likely not a big deal.
- document this inconsistency and change the behaviour for containers started as root, so that the inconsistency will become less relevant over time. I'm not sure how the bridge thing works and if it's possible to change this though
It is not possible to change this: if the application binds to 127.0.0.1, then there is simply no way to get packets there from another namespace AFAICT (well, not without a user-space proxy). As we forward via the firewall, the packets always go to the eth0 address in the container.
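As a rough illustration of that point with plain network namespaces rather than podman (a sketch; it needs root, and the veth addresses are made up for the example):

ip netns add demo
ip link add veth0 type veth peer name veth1 netns demo
ip addr add 10.99.0.1/24 dev veth0 ; ip link set veth0 up
ip netns exec demo sh -c 'ip addr add 10.99.0.2/24 dev veth1 ; ip link set veth1 up ; ip link set lo up'
ip netns exec demo socat TCP-LISTEN:8000,bind=127.0.0.1 STDOUT &
curl --max-time 2 http://10.99.0.2:8000/   # fails: the listener sits on the namespace's own loopback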
Overall I think it is a fair assumption that binding to 127.0.0.1 means no external connections should be made to that address, and pasta breaks this assumption by allowing connections from the host namespace's 127.0.0.1. I guess it should not matter because a user still has to forward the port via podman/pasta in the first place, and if the application listens on 127.0.0.1 this is likely a misconfiguration on the user's part. The only real point here could be that one app binds to 127.0.0.1:X and another binds to the interface address (i.e. 192.168.1.2:X), but this seems rather unlikely for container scenarios.
also accept IP address exclusions in pasta's --tcp-ports / -t and --udp-ports / -u options (not just port exclusion), and, from Podman, exclude loopback addresses by default, so that users can override this behaviour by explicitly listening to 0.0.0.0. This is slightly complicated by the fact that IPv4 has a whole /8 subnet of loopback addresses, so it's not feasible to implement this correctly, strictly speaking: pasta would bind to all available local addresses, without binding to loopback addresses at all if just one is excluded. If you have 192.0.2.1 and 198.51.100.1 configured on the host, and you exclude 127.0.0.2, only 192.0.2.1 and 198.51.100.1 will be mapped, but not the rest of 127.0.0.0/8
That doesn't seem reasonable to me.
The original report mentions that this used to work so can you clarify if pasta changed this behaviour or if pasta always worked that way?
The original report mentions that this used to work so can you clarify if pasta changed this behaviour or if pasta always worked that way?
It worked that way since the very beginning of pasta. The term of comparison is, quoting, "podman-4.9.4-1.fc39.x86_64 with rootlessport".
Overall I think it is a fair assumption that binding to 127.0.0.1 means no external connections should be made to that address, and pasta breaks this assumption by allowing connections from the host namespace's 127.0.0.1.
...for some definitions of "external", yes.
I guess it should not matter because a user still has to forward the port via podman/pasta in the first place, and if the application listens on 127.0.0.1 this is likely a misconfiguration on the user's part. The only real point here could be that one app binds to 127.0.0.1:X and another binds to the interface address (i.e. 192.168.1.2:X), but this seems rather unlikely for container scenarios.
Right. In that case, by the way, a user can still bind ports to specific interfaces (using pasta-only options at the moment).
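An illustrative sketch of that: the address-bound forwarding spec for -t is described in passt(1), and the exact pasta options passed through --network below are an assumption to check against your podman version:

# bind the forwarded TCP port only on one specific host address via pasta's own -t spec
podman run --name django -d --network pasta:-t,192.0.2.1/8000 localhost/django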
also accept IP address exclusions in pasta's --tcp-ports / -t and --udp-ports / -u options (not just port exclusion), and, from Podman, exclude loopback addresses by default, so that users can override this behaviour by explicitly listening to 0.0.0.0. This is slightly complicated by the fact that IPv4 has a whole /8 subnet of loopback addresses, so it's not feasible to implement this correctly, strictly speaking: pasta would bind to all available local addresses, without binding to loopback addresses at all if just one is excluded. If you have 192.0.2.1 and 198.51.100.1 configured on the host, and you exclude 127.0.0.2, only 192.0.2.1 and 198.51.100.1 will be mapped, but not the rest of 127.0.0.0/8
That doesn't seem reasonable to me.
Yes another option, perhaps more reasonable, would be to implement an option disabling "spliced" inbound connections altogether (something like an explicit, reversed, -T none). That doesn't break Podman and pasta users relying on the current behaviour, but gives the possibility to keep ports private without having to specify interfaces or addresses for each one.
Overall I think it is a fair assumption that binding to 127.0.0.1 means no external connections should be made to that address, and pasta breaks this assumption by allowing connections from the host namespace's 127.0.0.1.
...for some definitions of "external", yes.
Right, it is arbitrary what "external" means here: a different host or a different namespace. As long as the different-host case is covered, I don't see any security issues, so I don't mind how it behaves.
I guess it should not matter because a user still has to forward the port via podman/pasta in the first place, and if the application listens on 127.0.0.1 this is likely a misconfiguration on the user's part. The only real point here could be that one app binds to 127.0.0.1:X and another binds to the interface address (i.e. 192.168.1.2:X), but this seems rather unlikely for container scenarios.
Right. In that case, by the way, a user can still bind ports to specific interfaces (using pasta-only options at the moment).
I am talking about the interface inside the container netns; that would totally depend on the application inside, not on any podman/pasta options.
also accept IP address exclusions in pasta's --tcp-ports / -t and --udp-ports / -u options (not just port exclusion), and, from Podman, exclude loopback addresses by default, so that users can override this behaviour by explicitly listening to 0.0.0.0. This is slightly complicated by the fact that IPv4 has a whole /8 subnet of loopback addresses, so it's not feasible to implement this correctly, strictly speaking: pasta would bind to all available local addresses, without binding to loopback addresses at all if just one is excluded. If you have 192.0.2.1 and 198.51.100.1 configured on the host, and you exclude 127.0.0.2, only 192.0.2.1 and 198.51.100.1 will be mapped, but not the rest of 127.0.0.0/8
That doesn't seem reasonable to me.
Yes another option, perhaps more reasonable, would be to implement an option disabling "spliced" inbound connections altogether (something like an explicit, reversed, `-T none`). That doesn't break Podman and pasta users relying on the current behaviour, but gives the possibility to keep ports private without having to specify interfaces or addresses for each one.
I guess there is good reason for the splice path, speed mostly? I would think most users prefer that.
@adelton I'd like to understand your actual use case here better. Why are you forwarding the port but then binding to 127.0.0.1 inside and not wanting the connection to work?
I was happily using the setup with 127.0.0.1 in the container on my Fedoras because I installed pasta a couple of releases back.
And then I spent three hours investigating why the thing which does work on my Fedoras (connect to that container from the host) does not work on GitHub Actions Ubuntu runners. Searching around and in man pages did not suggest this should be happening. It was only when I got a fresh Fedora 39 VM and tried the setup from scratch that I got the difference in behaviour demonstrated.
So it's not so much what I want; I frankly don't mind the rootless pasta behaviour. It's mainly the inconsistency with both the rootful and the rootlessport behaviour that has bitten me. And given this could lead to some endpoints now being exposed where they previously were not (a potential security aspect), I thought I'd report it as an issue. I guess some note in the documentation would work if functional parity with rootful setups is not desired or not practical.
Yes another option, perhaps more reasonable, would be to implement an option disabling "spliced" inbound connections altogether (something like an explicit, reversed, `-T none`). That doesn't break Podman and pasta users relying on the current behaviour, but gives the possibility to keep ports private without having to specify interfaces or addresses for each one.

I guess there is good reason for the splice path, speed mostly? I would think most users prefer that.
Yes, that, as we get pretty much host-native throughput on that path.
Maybe we'll achieve something similar with VDUSE which might make that more or less obsolete (the tap interface is quite a hurdle for performance), but it will take time.
I was happily using the setup with 127.0.0.1 in the container on my Fedoras because I installed pasta a couple of releases back.
And then I spent three hours investigating why the thing which does work on my Fedoras (connect to that container from the host) does not work on GitHub Actions Ubuntu runners. Searching around and in man pages did not suggest this should be happening. It was only when I got a fresh Fedora 39 VM and tried the setup from scratch that I got the difference in behaviour demonstrated.
Oh, so things that are working now weren't working before. The inconsistency stands and needs to be solved somehow, but this is another bit of information showing us that we need to be careful to avoid breaking things.
[snip]
So, if no address is given explicitly for `--publish / -p`, pasta binds the port to any address (which looks rather convenient and actually correct to me), and then picks the appropriate container interface depending on where the packet comes from. But I see now that rootlessport and slirp4netns don't actually map host loopback traffic, so this is surely inconsistent.
That might be true, but I think it's missing the point. The question is not about host loopback, but about container loopback. The point is that things bound to container loopback are accessible from outside the container, which is indeed surprising. It's mitigated because they're only accessible from host loopback, but it's still odd, and arguably is a security problem because it allows unrelated users on the host to access ports that the container thinks are private to itself.
However, I don't think it's as hard to fix as you outline. This is AFAICT, entirely about "spliced" connections - that's the only way we even can reach loopback bound ports within the container. So, I think all we need to do to fix it is:
- Make (inbound) spliced connections `connect()` to `addr_seen` instead of to loopback.
Because `addr_seen` is a local address of the container, the traffic will still go over the container's `lo` interface, but services bound to loopback addresses will no longer respond to it.
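A quick way to see the kernel behaviour being relied on here (a sketch; 192.0.2.1 stands in for any address configured on a local interface):

# connections between two local addresses are routed over lo, even when
# neither endpoint uses a 127.0.0.0/8 address
socat TCP-LISTEN:9000,bind=192.0.2.1 STDOUT &
tcpdump -ni lo tcp port 9000 &
echo hello | socat - TCP:192.0.2.1:9000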
There are some real questions about access to the host loopback address via outbound spliced connections, but that's not what this issue is about.
[snip]
So, if no address is given explicitly for `--publish / -p`, pasta binds the port to any address (which looks rather convenient and actually correct to me), and then picks the appropriate container interface depending on where the packet comes from. But I see now that rootlessport and slirp4netns don't actually map host loopback traffic, so this is surely inconsistent.

That might be true, but I think it's missing the point. The question is not about host loopback, but about container loopback.
Well, it's about both in the sense I meant (and thought was desirable... and maybe it even is): you connect to the host's loopback, and if it's mapped, it maps to the container's loopback as well. The other way, with `-T`, you have a symmetric behaviour.
The point is that things bound to container loopback are accessible from outside the container, which is indeed surprising.
Not to me! We splice using the loopback interface in the container. I think it's also implied by the "Handling of local traffic in pasta" section of the man page, even though surely not explicit.
It's mitigated because they're only accessible from host loopback, but it's still odd, and arguably is a security problem because it allows unrelated users on the host to access ports that the container thinks are private to itself.
...not so clearly in my opinion: the ports are exposed with --publish.
However, I don't think it's as hard to fix as you outline. This is AFAICT, entirely about "spliced" connections - that's the only way we even can reach loopback bound ports within the container. So, I think all we need to do to fix it is:
* Make (inbound) spliced connections `connect()` to `addr_seen` instead of to loopback.
That's a nice idea, and I guess it has relatively low chances of breaking things, but they would still break for users who assumed that binding to 127.0.0.1 in the container and exposing that port would make it visible from the host (see https://github.com/containers/podman/issues/24045#issuecomment-2371470404).
Because `addr_seen` is a local address of the container, the traffic will still go over the container's `lo` interface, but services bound to loopback addresses will no longer respond to it.
Right.
There are some real questions about access to the host loopback address via outbound spliced connections, but that's not what this issue is about.
Sure, that's another matter. But accessing ports bound to a loopback address in the container should be at least optional. I'm almost convinced we can make it an opt-in and it's unlikely that we'll break any usage, but we need to have a way to fix that quickly, in case.
Patch series and related discussion at https://archives.passt.top/passt-dev/[email protected]/ by the way.
[snip]
So, if no address is given explicitly for `--publish / -p`, pasta binds the port to any address (which looks rather convenient and actually correct to me), and then picks the appropriate container interface depending on where the packet comes from. But I see now that rootlessport and slirp4netns don't actually map host loopback traffic, so this is surely inconsistent.

That might be true, but I think it's missing the point. The question is not about host loopback, but about container loopback.

Well, it's about both in the sense I meant (and thought was desirable... and maybe it even is): you connect to the host's loopback, and if it's mapped, it maps to the container's loopback as well. The other way, with `-T`, you have a symmetric behaviour.
Well, obviously there could be use cases, but I really don't think this would be the expected behaviour. It's so completely unlike any other networking model (physical, rootful, and it seems slirp too). If you really want to share a `lo` with the host, that seems like a case where you don't want a network namespace.
The point is that things bound to container loopback are accessible from outside the container, which is indeed surprising.
Not to me! We splice using the loopback interface in the container. I think it's also implied by the "Handling of local traffic in pasta" section of the man page, even though surely not explicit.
I don't really think it's implied by that. As my draft patch demonstrates, it certainly need not be the case, even with traffic over `lo`, and again the pseudo-shared `lo` model is completely unlike the setups that are likely to form people's mental models.
It's mitigated because they're only accessible from host loopback, but it's still odd, and arguably is a security problem because it allows unrelated users on the host to access ports that the container thinks are private to itself.
...not so clearly in my opinion: the ports are exposed with `--publish`.
Yeah, that also mitigates it. The container could still have different servers running on the same port on loopback and non-loopback addresses. Or it could have a server on 0.0.0.0 that changes behaviour depending on whether `getpeername()` reports loopback. In those cases `-t auto` would expose the loopback version to the host, which seems really surprising (different from passt as well as all other networking configs).
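A sketch of that first scenario, as it could look inside the container namespace (10.0.2.2 is a placeholder for the container's eth0 address):

# two different services on the same port, split by bind address
socat TCP-LISTEN:8000,bind=127.0.0.1,fork SYSTEM:'echo private' &
socat TCP-LISTEN:8000,bind=10.0.2.2,fork SYSTEM:'echo public' &
# old pasta behaviour: a host connection to the published port reaches "private";
# rootful podman (and passt) would reach "public"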
However, I don't think it's as hard to fix as you outline. This is AFAICT, entirely about "spliced" connections - that's the only way we even can reach loopback bound ports within the container. So, I think all we need to do to fix it is:
* Make (inbound) spliced connections `connect()` to `addr_seen` instead of to loopback.

That's a nice idea, and I guess it has relatively low chances of breaking things, but they would still break for users who assumed that binding to 127.0.0.1 in the container and exposing that port would make it visible from the host (see #24045 (comment)).
Well, sure, but I'd argue that was a flawed assumption that just happened to work because of a pasta bug. Witness its total non-portability.
Because `addr_seen` is a local address of the container, the traffic will still go over the container's `lo` interface, but services bound to loopback addresses will no longer respond to it.

Right.
There are some real questions about access to the host loopback address via outbound spliced connections, but that's not what this issue is about.
Sure, that's another matter. But accessing ports bound to a loopback address in the container should be at least optional. I'm almost convinced we can make it an opt-in and it's unlikely that we'll break any usage, but we need to have a way to fix that quickly, in case.
Sure, it's pretty easy to make it an option.
[snip]
So, if no address is given explicitly for `--publish / -p`, pasta binds the port to any address (which looks rather convenient and actually correct to me), and then picks the appropriate container interface depending on where the packet comes from. But I see now that rootlessport and slirp4netns don't actually map host loopback traffic, so this is surely inconsistent.

That might be true, but I think it's missing the point. The question is not about host loopback, but about container loopback.

Well, it's about both in the sense I meant (and thought was desirable... and maybe it even is): you connect to the host's loopback, and if it's mapped, it maps to the container's loopback as well. The other way, with `-T`, you have a symmetric behaviour.

Well, obviously there could be use cases, but I really don't think this would be the expected behaviour. It's so completely unlike any other networking model (physical, rootful, and it seems slirp too). If you really want to share a `lo` with the host, that seems like a case where you don't want a network namespace.
It's not shared in general, it's just one port being forwarded, for a specific Layer-4 protocol.
The point is that things bound to container loopback are accessible from outside the container, which is indeed surprising.
Not to me! We splice using the loopback interface in the container. I think it's also implied by the "Handling of local traffic in pasta" section of the man page, even though surely not explicit.
I don't really think it's implied by that. As my draft patch demonstrates, it certainly need not be the case, even with traffic over `lo`, and again the pseudo-shared `lo` model is completely unlike the setups that are likely to form people's mental models.
...unless you see the "spliced" path as a loopback bypass, which is, at least, what I had in mind when I implemented it, and how I use it sometimes. This plus https://github.com/containers/podman/issues/24045#issuecomment-2371470404 already makes two users...
It's mitigated because they're only accessible from host loopback, but it's still odd, and arguably is a security problem because it allows unrelated users on the host to access ports that the container thinks are private to itself.
...not so clearly in my opinion: the ports are exposed with `--publish`.

Yeah, that also mitigates it. The container could still have different servers running on the same port on loopback and non-loopback addresses. Or it could have a server on 0.0.0.0 that changes behaviour depending on whether `getpeername()` reports loopback. In those cases `-t auto` would expose the loopback version to the host, which seems really surprising (different from passt as well as all other networking configs).
True, in this case it's definitely surprising.
However, I don't think it's as hard to fix as you outline. This is AFAICT, entirely about "spliced" connections - that's the only way we even can reach loopback bound ports within the container. So, I think all we need to do to fix it is:
* Make (inbound) spliced connections `connect()` to `addr_seen` instead of to loopback.

That's a nice idea, and I guess it has relatively low chances of breaking things, but they would still break for users who assumed that binding to 127.0.0.1 in the container and exposing that port would make it visible from the host (see #24045 (comment)).
Well, sure, but I'd argue that was a flawed assumption that just happened to work because of a pasta bug. Witness its total non-portability.
It's a bug I added tests for... I'd call it a feature, really. Originally, I was thinking of adding something symmetric to -T, which would only work for the loopback bypass, separated from -t, but then I thought that three options to forward ports would be too many.
Because `addr_seen` is a local address of the container, the traffic will still go over the container's `lo` interface, but services bound to loopback addresses will no longer respond to it.

Right.
There are some real questions about access to the host loopback address via outbound spliced connections, but that's not what this issue is about.
Sure, that's another matter. But accessing ports bound to a loopback address in the container should be at least optional. I'm almost convinced we can make it an opt-in and it's unlikely that we'll break any usage, but we need to have a way to fix that quickly, in case.
Sure, it's pretty easy to make it an option.
Okay, yes, I would be fine with it, and I'm convinced it's an improvement over the current situation, especially given the scenario where one might bind the same port to loopback and non-loopback addresses in the container, which is not supported at the moment.
That's a nice idea, and I guess it has relatively low chances of breaking things, but they would still break for users who assumed that binding to 127.0.0.1 in the container and exposing that port would make it visible from the host (see #24045 (comment)).
Well, sure, but I'd argue that was a flawed assumption that just happened to work because of a pasta bug. Witness its total non-portability.
I confirm that in my case, rather than explicitly assuming something about the exposure of 127.0.0.1 in the container, I did not really think about it when it happened to work on my Fedora setup without modifications. 127.0.0.1 is the address which Kind uses to expose its API server by default, and in my work on https://github.com/adelton/kind-in-pod I just went with minimal changes to the defaults.
Patches to change this behaviour are now merged into pasta upstream and should be in the next release.
This should be fixed now in 2024_10_30.ee7d0b6 and its corresponding Fedora 40 update.
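A quick way to verify once the update lands, rerunning the reproducer from the top of this issue (a sketch; the exact fixed package version string is an assumption, and the container is recreated so it picks up the updated pasta):

rpm -q passt   # expect a build based on 2024_10_30.ee7d0b6 or newer
podman rm -f django ; podman run --name django -d -p 8000:8000 localhost/django 127.0.0.1:8000
curl http://127.0.0.1:8000/   # should no longer reach the loopback-bound server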
+1