nix
nix copied to clipboard
Tabs silently break indentation stripping
Describe the bug
Tabs are not supported in the indentation stripping parser. This is the right behavior in principle, but when a user does insert a tab, the behavior does not match their indentation. The right solution is to warn and let the user fix the problem. Supporting tabs is wrong, because tab width is undefined and will never be defined consistently.
Adding a warning may make '' '' strings annoying for users who have to write strings with tabs at the start of lines, but their solution is already fragile, so they should move to explicit tabs ("\t") or put the string in a separate file.
Steps To Reproduce
- Write a file with a
''string indented with tabs or a mix of spaces and tabs. - Parsed string contains tab characters.
Expected behavior
Warn when the first tab is encountered in the indentation stripping logic. Suggest alternatives, for both cases: user wants tabs stripped, or (less likely, current behavior) user wants to keep tabs in their string.
nix-env --version output
Additional context
Removing tabs from indented strings is a hash-changing breaking change. (edit: 2024-10)
- see also https://github.com/NixOS/nix/pull/9971#issuecomment-1986903150
Priorities
Add :+1: to issues you find important.
(less likely, current behavior) user wants to keep tabs in their string.
If not at the start ?
Supporting tabs is wrong, because tab width is undefined and will never be defined consistently.
I don't understand this. Tabs having a configurable width is the point of using tabs, why should this be wrong?
I do not see how tabs themselves can cause any issues. What can cause issues is mixing tabs and spaces in indentation, which should definitely be prohibited, but otherwise the parser should be fine with tabs, since it does not actually depend on indentation width, but rather on amount of indentation levels.
I really hope we won't end up forbidding tabs.
Well, my entire system configuration uses tabs and my only actual complaint is that one needs to use a workaround to manually strip indented strings when dealing with inline Python (part of actual configuration used in production):
let
# Helper: Workaround to remove leading indentation from generated files,
# since otherwise the Python scripts do not work
fixIdentWorkaround =
prefix: string: lib.strings.concatMapStringsSep "\n" (line: lib.strings.removePrefix prefix line) (lib.strings.splitString "\n" string);
in {
# …
siteSettings = pkgs.writeText "modoboa-site-settings.py" ((fixIdentWorkaround "\t\t" ''
# Define special mode, induced by environment variable, that avoids
# any external file access during the instance derivation build
import os
MODOBOA_BUILDING_INSTANCE = os.environ.get("MODOBOA_BUILDING_INSTANCE") == "1"
# Secret key used as a salt in many operations
#
# Generated on first-run, since it needs to be managed and
# persisted by the system between Modoboa versions, so it won’t
# necessarily be around when we generate the instance derivation.
if not MODOBOA_BUILDING_INSTANCE:
with open(${escapePyString secretKeyFile}, "r") as secret_key_file:
SECRET_KEY = secret_key_file.read().rstrip("\r\n")
else:
# Static dummy secret key
SECRET_KEY = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
# …
'';
# …
};
Adding a warning to “make '' '' strings annoying for users who (have to) write strings with tabs at the start of lines” would indeed be highly unwelcome, yes. Of course some Nix devs seem to have made it their personal mission to banish every last tab character in the world, since it is obviously better if we all use 2 spaces of indentation. That way, figuring out where that large opened block in that longer configuration ends stays extra hard for everyone… (Oh the terror if we had variable tab size and could make the indentation larger or smaller depending on space availability and reading requirements…!)
Please, bring tab indentation support to Nix!
Not only tabs break indented strings, CRLF does too. Whenever you make Nix support tabs or warn about them, please make the corresponding adjustment for CRLF as well.
I did already mention that in previous discussions (e.g. #3759), but since I haven't seen it mentioned here, I thought I highlight it here again.
The reason why CRLF breaks indented strings becomes relatively obvious once you replace every carriage return with \r. Note that the first line is usually empty (and ignored), but does now contain a carriage return as the first character.
''\r
a\r
b\r
''\r
CR handling is indeed broken, although I'd consider it a separate issue. After some experimentation, I've come to the conclusion that the least bad solution to dealing with this is 1. disabling CR-only line endings as parse error, and 2. normalizing CRLF as \n in all strings at parse time.
normalizing CRLF as \n in all strings at parse time.
This could also be done with tabs then. Treat a tab as a 2 space while evaluating.
That would surely be possible, but (ignoring backwards compatibility concerns) would not allow putting raw tabs into strings. There are better solutions to this problem, as seen in other languages and implemented in https://gerrit.lix.systems/c/lix/+/2105/5
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/satisfaction-survey-from-the-new-rfc-166-formatting/49758/37