Execute ansible in a sandbox container

Open trihoangvo opened this issue 4 years ago • 1 comments

Pull Request description

Description of the change

In the current implementation, yorc executes an ansible-playbook on a target host with the command-line parameter connection=ssh from the host machine. However, because any playbook keyword will override any command-line option and any configuration setting [1], attackers can set the playbook keyword connection: local in the ansible plays explicitly. As a result, they can execute an ansible play on the host machine, where yorc is running (instead of the target host).

In this PR we start a sandbox container first and execute ansible from inside it. The changes keep the current implementation, so we can choose whether to execute ansible in a sandbox or, for backward compatibility, on the host machine. If the sandbox image is configured (in config.yorc.yaml), ansible is executed in the sandbox. Otherwise, it is executed on the host machine.
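The selection rule described above can be sketched in Go as follows. This is only an illustration: the type SandboxConfig and the function useSandbox are hypothetical names and are not taken from the PR's code.

```go
package main

import "fmt"

// SandboxConfig is a hypothetical stand-in for the sandbox section of
// config.yorc.yaml; only the image field matters for this sketch.
type SandboxConfig struct {
	Image string
}

// useSandbox reports whether ansible should run inside the sandbox.
// Assumption: a missing section or an empty image means "run on the host".
func useSandbox(cfg *SandboxConfig) bool {
	return cfg != nil && cfg.Image != ""
}

func main() {
	fmt.Println(useSandbox(nil))                                        // host machine
	fmt.Println(useSandbox(&SandboxConfig{Image: "otc-ansible:2.9.9"})) // sandbox
}
```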

The TOSCA's hosted operation is executed in the sandbox container the same way as the TOSCA's operation but with the command-line parameter set to connection=local.

What I did

The ansible execution has 3 steps:

Step 1: generate all ansible configuration files in deployments/<deployment-id>/ansible.
Step 2: start the container and bind mount the ansible configuration files and the overlay.
Step 3: execute ansible-playbook from inside the container.

How I did it

Step 1:

In this step, all ansible configuration files for the execution (i.e., ansible.cfg, hosts, run.ansible.yml, wrapper, .vault_pass) are generated in deployments/<deployment-id>/ansible on the host machine as usual. The generation of hosts, wrapper, and .vault_pass remains unchanged.

For the generation of ansible.cfg, we added the following changes:

By default, ansible writes its local temp files and the SSH ControlPath sockets (if OpenSSH is enabled) to ~/.ansible/tmp and ~/.ansible/cp on the host machine, respectively. This may cause a permission denied error when ansible runs inside the container and has no write permission to the given paths. Therefore, we added two configs in the function generateAnsibleConfigurationFile(), which specify the location where ansible can write these files in the default workdir of the sandbox (i.e., /work/ansible/tmp and /work/ansible/cp).
Also, in the current implementation, the config fact_caching_connection tells ansible to write the fact caches of all yorc tasks to a common path, deployments/<deployment-id>/facts_caches. During our tests, we noticed that each yorc task should have its own facts cache (i.e., different tasks should not share a common path for gathered facts). Otherwise, one task may unexpectedly overwrite the facts gathered by another. For example, if users set become: true in one task, it will overwrite the USER fact from ubuntu to root in another task. Therefore, we do not set up the containers to share a common path on the host machine for the facts cache. Instead, we added a config to specify the location where ansible can write fact caches in the workdir of the container (by default at /work/ansible/facts_cache).
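As a rough illustration of the resulting ansible.cfg, the Go sketch below renders the three path settings for a given workdir. The function renderAnsibleCfg is a hypothetical helper, not the PR's generateAnsibleConfigurationFile(). The keys local_tmp, control_path_dir, and fact_caching_connection are standard ansible settings; the jsonfile cache plugin is an assumption made here so that fact_caching_connection is a directory path.

```go
package main

import (
	"fmt"
	"path"
)

// renderAnsibleCfg is a hypothetical helper that returns the part of
// ansible.cfg which redirects temp files, SSH ControlPath sockets and the
// facts cache into the sandbox workdir.
func renderAnsibleCfg(workDir string) string {
	return "[defaults]\n" +
		"local_tmp = " + path.Join(workDir, "tmp") + "\n" +
		// Assumption: the jsonfile cache plugin, which expects a directory.
		"fact_caching = jsonfile\n" +
		"fact_caching_connection = " + path.Join(workDir, "facts_cache") + "\n" +
		"\n" +
		"[ssh_connection]\n" +
		"control_path_dir = " + path.Join(workDir, "cp") + "\n"
}

func main() {
	// /work/ansible is the default sandbox workdir named in the description.
	fmt.Print(renderAnsibleCfg("/work/ansible"))
}
```

Because the cache path lives inside each container's workdir, two tasks running in separate sandboxes no longer write to the same facts cache.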

For the generation of run.ansible.yml, execution_ansible.go and execution_scripts.go are updated so that the paths to the overlay and the DestFolder are locations inside the container.

Step 2:

In this step, we start the sandbox container and bind mount the deployments/<deployment-id>/ansible to the workdir of the container (/work/ansible). Also, we bind mount the deployments/<deployment-id>/overlay, which contains deployment artifacts, in the sandbox (by default at /work/overlay).

In order for the ansible execution to access the target host, we also bind mount the ssh agent socket, which holds the private keys to access the target hosts, from the host machine to the sandbox (by default at /work/ssh-agent), and pass SSH_AUTH_SOCK to the sandbox as an environment variable of the starting container.
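The mounts and the environment variable described above correspond roughly to the following docker run arguments. This is a sketch in terms of the docker CLI only; the PR may use the Docker API instead, and sandboxMountArgs and the example paths are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// sandboxMountArgs is a hypothetical helper that builds the "docker run"
// arguments for the three bind mounts and the SSH agent variable.
// deploymentDir should be an absolute host path, since docker treats a
// relative -v source as a named volume rather than a bind mount.
func sandboxMountArgs(deploymentDir, hostAgentSock string) []string {
	return []string{
		"-v", path.Join(deploymentDir, "ansible") + ":/work/ansible",
		"-v", path.Join(deploymentDir, "overlay") + ":/work/overlay",
		"-v", hostAgentSock + ":/work/ssh-agent",
		// Point ssh inside the sandbox at the mounted agent socket.
		"-e", "SSH_AUTH_SOCK=/work/ssh-agent",
	}
}

func main() {
	// Both paths are made-up examples.
	args := sandboxMountArgs("/var/yorc/deployments/demo", "/tmp/ssh-agent.sock")
	fmt.Println(args)
}
```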

In the current implementation, the sandbox container still lacks security hardening. Therefore, we added some configs to harden the container. For instance, we:

specify a non-root user to run the container.
do not allow privilege escalation (no new privileges)
remove ALL capabilities
add configs to limit CPUs and memory to avoid DoS. By default, we set the CPU and memory limits to 0.5 CPUs and 256m, respectively. These limits are configurable via the yorc config. If you think the default limits are too low, set them higher (e.g., 1 CPU, 512m).
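Expressed as docker run flags, the hardening options listed above might look like the sketch below. The flags themselves (--user, --security-opt no-new-privileges, --cap-drop, --cpus, --memory) are standard docker CLI options; hardeningArgs is a hypothetical helper, and the PR may set the equivalent fields through the Docker API rather than the CLI.

```go
package main

import "fmt"

// hardeningArgs is a hypothetical helper that maps the hardening options
// to "docker run" flags. Empty limits fall back to the defaults stated in
// the description (0.5 CPUs, 256m).
func hardeningArgs(user, cpus, memory string) []string {
	if cpus == "" {
		cpus = "0.5"
	}
	if memory == "" {
		memory = "256m"
	}
	return []string{
		"--user", user, // run as a non-root user
		"--security-opt", "no-new-privileges", // block privilege escalation
		"--cap-drop", "ALL", // remove all capabilities
		"--cpus", cpus, // CPU limit
		"--memory", memory, // memory limit
	}
}

func main() {
	fmt.Println(hardeningArgs("1000:1000", "", ""))
	fmt.Println(hardeningArgs("1000:1000", "1", "512m"))
}
```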

Step 3:

After the sandbox is started, we reuse the current implementation of CMD to execute ansible inside the sandbox container (i.e., via docker exec). Ansible then writes output logs back to the file system (by default at /work/ansible/*-out.csv) and the output handler of the CMD can read the logs. The implementation of the output handler for ansible and scripts remains unchanged.
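A minimal sketch of this step, assuming the docker CLI is used: the command is built with os/exec and only constructed here, not run. ansibleExecCmd, the container name, and the exact ansible-playbook arguments are illustrative assumptions, not the PR's actual command line.

```go
package main

import (
	"fmt"
	"os/exec"
)

// ansibleExecCmd is a hypothetical helper that builds a "docker exec"
// command running ansible-playbook in the sandbox workdir. The inventory
// and playbook names follow the files generated in Step 1.
func ansibleExecCmd(containerID, playbook string) *exec.Cmd {
	return exec.Command("docker", "exec",
		"-w", "/work/ansible", // run from the mounted ansible directory
		containerID,
		"ansible-playbook", "-i", "hosts", playbook)
}

func main() {
	cmd := ansibleExecCmd("sandbox-1", "run.ansible.yml")
	// Print the argument vector instead of executing it.
	fmt.Println(cmd.Args)
}
```

Since the command is an ordinary exec.Cmd, its stdout and stderr can be wired to an output handler the same way as for a host-side execution.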

How to verify it

Step 1: Build the sandbox image

Go to pkg/ansible and build:

docker build -t otc-ansible:2.9.9 .

Step 2: Update config.yorc.yaml

ansible:
  hosted_operations:
    default_sandbox:
      image: "otc-ansible:2.9.9"
      cpus: "1"
      memory: "512m"
      entrypoint: ["/bin/sh", "-c"]
      command: ["sleep 300"]
      user: "1000:1000"

Step 3: Start yorc.

Step 4: Tests

We have already tested a topology with the Python software component sample you provide in [2] and the mongodb playbook in [3].

Description for the changelog

We will update the changelog and the yorc documentation after you have reviewed the code changes and agreed to them.

Applicable Issues

In the current implementation (i.e., before this PR), we noticed that when we enable OpenSSH in ansible, we get the following errors from SSH:

mux_client_read_packet: read header failed: Broken pipe
Received exit status from master 0

This means that SSH multiplexing does not work. For every command, ansible tries to reuse an open SSH connection, but the master has already closed it. We also tried increasing ControlPersist from 60s to 10m but still got the same error. Even though the ansible execution succeeds, opening a new SSH connection for every command reduces performance significantly. Do you observe the same behavior?

References

[1] https://docs.ansible.com/ansible/latest/reference_appendices/general_precedence.html
[2] https://github.com/ystia/tosca-samples/tree/develop/org/ystia/yorc/samples/python
[3] https://github.com/ystia/forge/tree/develop/org/ystia/mongodb/linux/ansible

Jun 02 '20 09:06 trihoangvo

Code Climate has analyzed commit f0aa1f83 and detected 11 issues on this pull request.

Here's the issue category breakdown:

Category	Count
Complexity	5
Duplication	6


Jun 16 '20 07:06 codeclimate[bot]
