salt icon indicating copy to clipboard operation
salt copied to clipboard

[BUG] file.managed should support binary files

Open dehnert opened this issue 2 years ago • 6 comments

Description Currently when a binary contents are passed to file.managed, the contents become corrupted. I'm not entirely sure of what's going on, but it looks like the bytes get converted to utf-8 and then back to bytes, which might be a lossy process?

Setup

I'm using a snippet like:

/path/daemon.keytab:
  file.managed:
    - mode: 600
    - contents: !!binary {{keytab_key.replace('\n', '')}}

Steps to Reproduce the behavior (Include debug logs if possible and relevant)

Expected behavior daemon.keytab should contain the base64-decoded contents of the keytab_key variable

Versions Report

salt --versions-report
Salt Version:
          Salt: 3004.1

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.0.3
       libgit2: Not Installed
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
        Python: 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]
  python-gnupg: Not Installed
        PyYAML: 5.4.1
         PyZMQ: 22.3.0
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: ubuntu 22.04 Jammy Jellyfish
        locale: iso8859-1
       machine: x86_64
       release: 5.15.0-71-generic
        system: Linux
       version: Ubuntu 22.04 Jammy Jellyfish

The master is running 3005.1.

Additional context I think this issue was previously fixed by https://github.com/saltstack/salt/pull/30268, and has regressed.

Based on cursory testing, something like this might work:
commit bb6d0a8262b78a92df6e06fb2d93e53d31198d32 (binary)
Author: Alex Dehnert <[email protected]>
Date:   Fri Jul 14 01:05:37 2023 -0400

    let file.managed handle binary files

diff --git a/salt/modules/file.py b/salt/modules/file.py
index e8db4a0336..c143983027 100644
--- a/salt/modules/file.py
+++ b/salt/modules/file.py
@@ -6040,7 +6040,10 @@ def manage_file(
                 prefix=salt.utils.files.TEMPFILE_PREFIX, text=True
             )
             with salt.utils.files.fopen(tmp, "wb") as tmp_:
-                if encoding:
+                log.error("contents type=%s", type(contents))
+                if isinstance(contents, bytes):
+                    tmp_.write(contents)
+                elif encoding:
                     if salt.utils.platform.is_windows():
                         contents = os.linesep.join(
                             _splitlines_preserving_trailing_newline(contents)
diff --git a/salt/states/file.py b/salt/states/file.py
index 75b606ddc2..a998011e2a 100644
--- a/salt/states/file.py
+++ b/salt/states/file.py
@@ -2840,8 +2841,11 @@ def managed(

     else:
         use_contents = None
+    log.error("use_contents=%r (%s)", use_contents, type(use_contents))

-    if use_contents is not None:
+    if isinstance(use_contents, bytes) and use_contents:
+        contents = use_contents
+    elif use_contents is not None:
         if not allow_empty and not use_contents:
             if contents_pillar:
                 contents_id = "contents_pillar {}".format(contents_pillar)

But I haven't done significant testing and obviously there's style issues there.

file.decode would be great, except it doesn't support setting permissions, limiting the value for secrets.

dehnert avatar Jul 14 '23 05:07 dehnert

YAML doesn't support binary content. It's not file.managed that has broken it, it's your template rendering. If you have to pass data from a variable to the contents, and not via pillar or a source file, then you need to use a different renderer, e.g. py.

Alternatively, use file.decode to manage the contents and file.managed to manage the permissions.

OrangeDog avatar Jul 14 '23 13:07 OrangeDog

YAML doesn't support binary content.

I don't think this is true? The latest spec uses the !!binary tag in an example, there's a spec for !!binary, and perhaps most importantly from a pragmatic perspective, the pyyaml parser (and whatever yaml parser Salt uses, if that's different) supports it.

Here's a very simple example of the Python YAML behavior:

$ cat > /tmp/foo.yaml
foo: "!!binary foo"
bar: !!binary quux
$ python3 -c 'import yaml; print(yaml.load(open("/tmp/foo.yaml")))'
<string>:1: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
{'foo': '!!binary foo', 'bar': b'\xaa\xeb\xb1'}

(Note that bar ends up with a value that's typed as bytes, rather than string.)

It's not file.managed that has broken it, it's your template rendering. If you have to pass data from a variable to the contents, and not via pillar or a source file, then you need to use a different renderer, e.g. py.

I've confirmed that the file.managed implementation gets passed a bytes object (rather than a string), and just passing that all the way through to the write call works fine. I'm pretty confident it's file.managed, not the SLS file rendering.

Alternatively, use file.decode to manage the contents and file.managed to manage the permissions.

This seems introduce a race -- file.decode will drop the contents with default (broad) permissions, and then file.managed will fix them up, which means that any process with access to the location will be able to read the secret, which is obviously undesirable.

dehnert avatar Jul 14 '23 16:07 dehnert

I can confirm that file.managed did not support byte contents even when called from Python directly when I wrote the x509_v2 modules end of last year. I do not remember exactly what went wrong, but I think it was decoded as utf-8 at some point, which obviously threw a UnicodeDecodeError.

lkubb avatar Jul 14 '23 17:07 lkubb

The latest spec uses the !!binary tag

Apologies, that's new to me. I assume the version of the yaml parser you have installed supports it?

This seems introduce a race

Do it the other way around. Create an empty file with the correct permissions (and replace: false), then assert the content.

OrangeDog avatar Jul 14 '23 18:07 OrangeDog

@dehnert Thanks for the report. If Salt doesn't currently support binary data in YAML but the newest YAML Python library does, this is not something that was done on purpose. That being said, it seems like there is a better approach to distributing binary data than including it directly in the state file. For example, including the binary data in a file on the Salt file server and including a reference to that location would be one approach. What is the need to include it directly in the state file? Thanks!

garethgreenaway avatar Jul 17 '23 17:07 garethgreenaway

The latest spec uses the !!binary tag

Apologies, that's new to me. I assume the version of the yaml parser you have installed supports it?

Yup. And, to be clear, I just picked recent versions because they seemed handy -- the original 2004 spec has a similar example, and the oldest pyyaml version I could get from pip (3.10 for 2011) supports it (and 3.01 from 2006 has basically the same snippet of relevant-seeming code).

Do it the other way around. Create an empty file with the correct permissions (and replace: false), then assert the content.

Ah, good to know, thanks. I'll need to try that.

That being said, it seems like there is a better approach to distributing binary data than including it directly in the state file. For example, including the binary data in a file on the Salt file server and including a reference to that location would be one approach. What is the need to include it directly in the state file? Thanks!

I'm not putting the binary data directly in the state file on disk -- I'm sticking it in Vault, and using vault.read_secret to insert it into the rendered state file. I'd rather not put it in a file because Vault is designed for secret management and disk on my salt master is a bit less so. Also even if I wanted it in a file on the Salt master disk, it's only a couple hundred bytes, so I'd maybe prefer to put it inline rather than have another file to keep track of.

(Also, FWIW, another obvious location to put it would be a pillar. AFAICT from code inspection, the issue here is that _validate_str_list converts to Unicode and then it needs to get converted back which does not round trip cleanly, and contents_pillar, contents_grains, and 'contents all seem to go through that function and presumably get corrupted. I assume source isn't corrupted though.)

dehnert avatar Jul 21 '23 07:07 dehnert