community.general icon indicating copy to clipboard operation
community.general copied to clipboard

puppet_utils: Do not force lang for cmd

Open Knalltuete5000 opened this issue 1 year ago • 14 comments

SUMMARY

The shell environment used forces by default the lang to be "C". This can result that puppet fails with the error message invalid byte sequence in US-ASCII.

Fixes #8000

ISSUE TYPE

Bugfix Pull Request

COMPONENT NAME

puppet

Knalltuete5000 avatar Feb 21 '24 14:02 Knalltuete5000

cc @russoz click here for bot help

ansibullbot avatar Feb 21 '24 15:02 ansibullbot

Also please note that you have to adjust the tests. They currently fail.

felixfontein avatar Feb 22 '24 19:02 felixfontein

Please note that this change is potentially problematic since C.UTF-8 is not always available, which can cause the module to stop working for some users. (https://unix.stackexchange.com/questions/597962/how-widespread-is-the-c-utf-8-locale)

Maybe it's better to make this configurable, with C.UTF-8 as a new default, so that at least users can explicitly provide another locale to fix when this breaks for them?

This would be one solution I had in mind. Another one would be to pass in None as the lang. From my understanding this would then default to the system language, am I right? What is the preferred approach for this fix

Knalltuete5000 avatar Feb 23 '24 11:02 Knalltuete5000

This would be one solution I had in mind. Another one would be to pass in None as the lang. From my understanding this would then default to the system language, am I right?

Yes. But that's also a potentially breaking change, since it could be that the output changes depending on the system language.

I think providing the locale as a parameter is probably the best way to fix this, or alternatively writing code which detects whether C.UTF-8 is available.

felixfontein avatar Feb 23 '24 19:02 felixfontein

(Assuming of course there isn't a way to tell puppet to not emit non-ASCII...)

felixfontein avatar Feb 23 '24 19:02 felixfontein

cc @emonty click here for bot help

ansibullbot avatar Feb 26 '24 09:02 ansibullbot

I have added a option via environment_lang to set the lang to force to. It will default to C which is the original value puppet/the cmdrunner was forced to

Knalltuete5000 avatar Feb 26 '24 09:02 Knalltuete5000

The test ansible-test sanity --test compile --python 2.7 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: SyntaxError: def puppet_runner(module:AnsibleModule):

The test ansible-test sanity --test import --python 2.7 [explain] failed with 2 errors:

plugins/module_utils/puppet.py:56:25: traceback: SyntaxError: invalid syntax
plugins/modules/puppet.py:175:0: traceback: SyntaxError: invalid syntax (at plugins/module_utils/puppet.py:56:25)

The test ansible-test sanity --test pep8 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: E231: missing whitespace after ':'

The test ansible-test sanity --test compile --python 2.7 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: SyntaxError: def puppet_runner(module:AnsibleModule):

The test ansible-test sanity --test import --python 2.7 [explain] failed with 2 errors:

plugins/module_utils/puppet.py:56:25: traceback: SyntaxError: invalid syntax
plugins/modules/puppet.py:175:0: traceback: SyntaxError: invalid syntax (at plugins/module_utils/puppet.py:56:25)

The test ansible-test sanity --test pep8 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: E231: missing whitespace after ':'

The test ansible-test sanity --test pep8 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: E231: missing whitespace after ':'

The test ansible-test sanity --test compile --python 2.7 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: SyntaxError: def puppet_runner(module:AnsibleModule):

The test ansible-test sanity --test import --python 2.7 [explain] failed with 2 errors:

plugins/module_utils/puppet.py:56:25: traceback: SyntaxError: invalid syntax
plugins/modules/puppet.py:175:0: traceback: SyntaxError: invalid syntax (at plugins/module_utils/puppet.py:56:25)

The test ansible-test sanity --test pep8 [explain] failed with 1 error:

plugins/module_utils/puppet.py:56:25: E231: missing whitespace after ':'

click here for bot help

ansibullbot avatar Feb 26 '24 09:02 ansibullbot

Hi @Knalltuete5000

Thanks for your contribution, sorry for not commenting sooner.

We had other cases of modules for which the default encoding was a moving target. Please take a look at other modules using CmdRunner for ideas.

russoz avatar Feb 26 '24 18:02 russoz

Thanks for the hit. I will look into it

Knalltuete5000 avatar Feb 27 '24 08:02 Knalltuete5000

@Knalltuete5000 this PR contains the following merge commits:

  • https://github.com/ansible-collections/community.general/commit/65ab12e9b480ffb1a036396d2a59c62ddf8f97cc

Please rebase your branch to remove these commits.

click here for bot help

ansibullbot avatar Feb 27 '24 08:02 ansibullbot

We had other cases of modules for which the default encoding was a moving target. Please take a look at other modules using CmdRunner for ideas.

I have looked through the code, where the CmdRunner is used. I have found some examples but they will not solve the problem. Either the parameter environ_update is set, in near all cases, to {'LANG': 'C', 'LC_ALL': 'C', 'LC_MESSAGES': 'C'} or in the ansible_galaxy_install.py the force_lang is used. Here the LANG is set to C.UTF-8, if the command fails a retry is in place with the LANG set to en_US.UTF-8 .

So what is the best way to solve the problem. With LANG C the problem still exist, to make it adjustable or to force it to an other LANG are the best options in my opinion.

I am open for any possible solutions how to solve the problem

Knalltuete5000 avatar Feb 27 '24 10:02 Knalltuete5000

We had other cases of modules for which the default encoding was a moving target. Please take a look at other modules using CmdRunner for ideas.

@russoz I know this the beyond this very issue here, but please let me say that having an option force_lang that actually is not an option which allows to optionally force a certain lang, but is actually actively forcing the lang to be "C" (https://github.com/ansible-collections/community.general/blob/21404851485e2c19f3daac4819fbbece7ada2e94/plugins/module_utils/cmd_runner.py#L194) if you don't force it to something else is a little odd and unexpected.

Be it as it may, I believe the approach used for the Ansible Galaxy install that @Knalltuete5000 mentioned (https://github.com/ansible-collections/community.general/blob/21404851485e2c19f3daac4819fbbece7ada2e94/plugins/modules/ansible_galaxy_install.py#L254) could actually be generalized inside the CmdRunner. So why not use C.UTF-8 if available and only fallback to "C" if not. So kinda extending the "default" to be the best of ['C.UTF8', 'C'].

But only in case there is no explicit override done via force_lang to still allow forcing it.

frittentheke avatar Feb 27 '24 16:02 frittentheke

We had other cases of modules for which the default encoding was a moving target. Please take a look at other modules using CmdRunner for ideas.

I have looked through the code, where the CmdRunner is used. I have found some examples but they will not solve the problem. Either the parameter environ_update is set, in near all cases, to {'LANG': 'C', 'LC_ALL': 'C', 'LC_MESSAGES': 'C'} or in the ansible_galaxy_install.py the force_lang is used. Here the LANG is set to C.UTF-8, if the command fails a retry is in place with the LANG set to en_US.UTF-8 .

So what is the best way to solve the problem. With LANG C the problem still exist, to make it adjustable or to force it to an other LANG are the best options in my opinion.

I am open for any possible solutions how to solve the problem

IMHO, the Best Wayâ„¢ would be to find a way (a parameter or an env variable) to disable the fancy output in puppet. That's what I did in a couple of other modules. When that is achievable, the encoding problem disappears as the output's encoding becomes 100% compatible with C.

Failing that, the only way I can see for the time being is for the code to perform tests with the command, using encoding A, if it fails, then try B, then C, etc... until hitting one that works. Something similar to what @frittentheke has suggested - although...

... he suggests making that a generic feature of the CmdRunner code, which is kinda tricky, as it would require a "no-op" command and that depends entirely on which executable is being used in that particular module. Many (most?) executables won't be idempotent, so running the real command twice or more is not an option for such cases, and that complicates a generic solution.

russoz avatar Mar 08 '24 22:03 russoz

@russoz @felixfontein I'd really like for this issue / PR to move towards a fix and merge and not risk this to become stale.

That said .... Puppet agent does support different locales, see https://www.puppet.com/docs/puppet/7/config_about_settings.html#configuring_locale_settings on how that is done.

If I may quote the relevant section ...

You can use environment variables to set your locale for processes started on the command line. For most Linux distributions, set the LANG variable to your preferred locale, and the LANGUAGE variable to an empty string. On SLES, also set the LC_ALL variable to an empty string.

I am still wondering why it does not seem to respect C. Looking at https://github.com/ansible-collections/community.general/blob/21404851485e2c19f3daac4819fbbece7ada2e94/plugins/module_utils/cmd_runner.py#L255 only LANGUAGE and LC_ALL are set, so LANG remains untouched / unset? Also looking through the code via https://github.com/search?q=org%3Apuppetlabs+LC_ALL&type=code it seems those two variables usually go together.

Maybe that could be yet another way to potentially fix the issue @Knalltuete5000 and likely @skhan28 reported?

... he suggests making that a generic feature of the CmdRunner code, which is kinda tricky, as it would require a "no-op" command and that depends entirely on which executable is being used in that particular module. Many (most?) executables won't be idempotent, so running the real command twice or more is not an option for such cases, and that complicates a generic solution.

No @russoz, this is not what I meant when referencing https://github.com/ansible-collections/community.general/blob/21404851485e2c19f3daac4819fbbece7ada2e94/plugins/modules/ansible_galaxy_install.py#L254 I suggested to actually implement some sort of auto-discovery to first determine if UTF-8 is available on the system and then use this for any CmdRunner invoked commands, no attempts ... if it's available use it. And only if UTF-8 was not available, fall back to C. To say this bluntly, it's 2024 and it might even be more helpful to use UTF-8 as that is likely what most users would use interactively.

  1. But also the suggested change with the introduction of a variable seems very sensible. So if 1) was not a surprise success to really force C on the puppet agent, I'd rather have a switch to flip to make it work.

In any case, please let's fix this issue here in any way you are happy with. Not being able to randomly invoke the Puppet agent in some cases is the worse condition to keep this matter in.

frittentheke avatar Apr 02 '24 15:04 frittentheke

Thanks for coming back to the issue and the review.

I have added the recommended changes by @felixfontein and removed the unrelated changes.

Knalltuete5000 avatar Apr 15 '24 07:04 Knalltuete5000

This looks good to me. While I agree that an auto-detection would be better, this is a lot harder to get right, and we can still do it later (like do the auto-detection if environment_lang=auto, and eventually change the default from C to auto).

If nobody objects, I'll merge this in ~a week.

felixfontein avatar Apr 15 '24 12:04 felixfontein

Backport to stable-7: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-7/1b8e6bc95bd3cc749ea65b8d713b1facc1db0c7e/pr-8001

Backported as https://github.com/ansible-collections/community.general/pull/8241

🤖 @patchback I'm built with octomachinery and my source is open — https://github.com/sanitizers/patchback-github-app.

patchback[bot] avatar Apr 20 '24 07:04 patchback[bot]

Backport to stable-8: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-8/1b8e6bc95bd3cc749ea65b8d713b1facc1db0c7e/pr-8001

Backported as https://github.com/ansible-collections/community.general/pull/8242

🤖 @patchback I'm built with octomachinery and my source is open — https://github.com/sanitizers/patchback-github-app.

patchback[bot] avatar Apr 20 '24 07:04 patchback[bot]

@Knalltuete5000 thanks for your contribution! @frittentheke @russoz thanks for reviewing and discussing this!

felixfontein avatar Apr 20 '24 07:04 felixfontein