opencbdc-tx icon indicating copy to clipboard operation
opencbdc-tx copied to clipboard

Any misbehaving client can trivially crash the Coordinator

Open HalosGhost opened this issue 3 years ago • 1 comments

Affected Branch

trunk

Basic Diagnostics

  • [X] I've pulled the latest changes on the affected branch and the issue is still present.

  • [X] The issue is reproducible in docker

Description

This issue was initially found and reported by @toddfratello, and has been confirmed using the procedure below.

To reproduce in docker, we apply the following patch to enable use of nc (netcat) inside the containers (this is only for ease-of-testing, the bug is present without this patch):

diff --git a/Dockerfile b/Dockerfile
index 6fe2b54..41c06ff 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -12,6 +12,7 @@ RUN apt update && \
       libgtest-dev \
       libgmock-dev \
       net-tools \
+      netcat \
       git

 # Args

The above will require rebuilding the docker image:

# docker build . -t opencbdc-tx

After rebuilding the docker image, the following can reliably reproduce the issue

  1. Run the 2PC architecture:
    # docker-compose --file docker-compose-2pc.yml up
    
  2. Run a separate container, connected to the 2PC network:
    # docker run --network 2pc-network -ti opencbdc-tx /bin/bash
    
  3. In the separate container, send the coordinator a stream of random bytes:
    # cat /dev/urandom | nc 172.25.0.3 7777
    
    NB: the IP address used above may be dependent on your docker configuration; but you can find the specific IP address of your coordinator in your docker setup using docker network inspect 2pc-network.
  4. Note that the coordinator errors out and restarts.

In particular, the issue appears to be caused by this line: https://github.com/mit-dci/opencbdc-tx/blob/ef541b5ecf2c851863ec44fdcfa7f0fc0da31ba2/src/util/network/tcp_socket.cpp#L108

In the PoC above, pkt_sz is very large, and attempting to allocate so much space results in the Coordinator dying with the following error:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

HalosGhost avatar Feb 07 '22 21:02 HalosGhost

It's worth pointing out that in a real-world deployment, end-users would only be able to send packets to the sentinels and watchtowers. Therefore, I think it makes sense to focus efforts hardening the external interface to those components. This particular issue applies to all components though because the socket layer does not protect against malicious packets.

metalicjames avatar Feb 07 '22 21:02 metalicjames