MSYS2-packages icon indicating copy to clipboard operation
MSYS2-packages copied to clipboard

bash: Fix CR handling

Open k-takata opened this issue 1 year ago • 8 comments

Fixes #1839

0005-bash-4.3-msys2-fix-lineendings.patch adds CRLF support. However, 0001-bash-4.4-cygwin.patch already added igncr option to Bash to support CRLF.

I confirmed that the Cygwin version of Bash has also the same issue with #1839 when the igncr option is set.

After debugging, I found that there is an issue in rewind_input_string() in parser.y that it doesn't take the CR into account.

This PR adds the following changes:

  • ~~Set the igncr option by default.~~
  • ~~Remove unnecessary changes in yy_input_name() in parser.y and use the igncr implementation.~~
  • Modify rewind_input_string() to take the CR into account. (It might be better to apply a similar change to the Cygwin version of Bash.)
  • Remove all the changes from y.tab.c. This file should be automatically generated from parser.y.
  • Set LC_ALL=C.UTF-8 when running make check for running on non-English locales.
  • Add two tests: run-ps1lf and run-crlf.

k-takata avatar Mar 31 '24 05:03 k-takata

Sorry for the delay. I'll try to find some time/motivation for a more in-depth look.

Do you see any potential problems with the default changing for existing users?

lazka avatar Apr 15 '24 07:04 lazka

0005-bash-4.3-msys2-fix-lineendings.patch and the igncr option of 0001-bash-4.4-cygwin.patch had duplicated functionality. MSYS2's bash has already done almost the same thing as igncr is set to 1. So, changing the default of igncr (and removing the duplicated part) doesn't have potential issues, I think. (If a user explicitly unset igncr, the behavior will be changed, though.)


In this PR, the most important part is here:

diff --git a/parse.y b/parse.y
index 8fd24a1c..232a33dc 100644
@@ -1678,7 +1684,16 @@ rewind_input_string ()
      into account, e.g., $(...\n) */
   xchars = shell_input_line_len - shell_input_line_index;
   if (bash_input.location.string[-1] == '\n')
-    xchars++;
+    {
+      xchars++;
+#ifdef __CYGWIN__
+      {
+	extern int igncr;
+	if (igncr && bash_input.location.string[-2] == '\r')
+	  xchars++;
+      }
+#endif
+    }
 
   /* XXX - how to reflect bash_input.location.string back to string passed to
      parse_and_execute or xparse_dolparen? xparse_dolparen needs to know how

If you have concerning about changing igncr, then this part can be like this:

diff --git a/parse.y b/parse.y
index 8fd24a1c..232a33dc 100644
@@ -1678,7 +1684,15 @@ rewind_input_string ()
      into account, e.g., $(...\n) */
   xchars = shell_input_line_len - shell_input_line_index;
   if (bash_input.location.string[-1] == '\n')
-    xchars++;
+    {
+      xchars++;
+#ifdef __MSYS__
+      {
+	if (bash_input.location.string[-2] == '\r')
+	  xchars++;
+      }
+#endif
+    }
 
   /* XXX - how to reflect bash_input.location.string back to string passed to
      parse_and_execute or xparse_dolparen? xparse_dolparen needs to know how

(Some other hunks need to be reverted in this case.)

k-takata avatar Apr 15 '24 08:04 k-takata

I gave it another try. One thing igncr also changes is it ignores "\r" in text in general, not just at the end:

test3() {
    local input="Line with\r Carriage Return"
    local expected="Line with Carriage Return"
    local result=$(echo -e "$input")
    [ "$result" != "$expected" ] && echo "OK" || echo "FAILED"
}

test3

that feels a bit risky, so I'd prefer the non-igncr variant, for now at least.

I'll try to write some tests covering the changes we currently have compared to upstream. Do you also have some example code that tests your changes, so we can add it to our integration tests?

After that we can think about enabling igncr again.

lazka avatar May 31 '24 13:05 lazka

OK, I've updated the PR not to use igncr.

Do you also have some example code that tests your changes, so we can add it to our integration tests?

Unfortunately, I have no ideas how to test the PS1 issue. I currently set PS1='$(date)\n\$ ' manually to see if it works.

Another test that might be better to add is to test a script file with CRLF line endings.

k-takata avatar Jun 03 '24 09:06 k-takata

(Sorry, closed by mistake. Reopened.)

k-takata avatar Jun 03 '24 09:06 k-takata

We received another report in oh-my-bash caused by this issue. May I ask what is the current blocker of this fix? Is it blocked because no one knows how to test the change? What is the current progress for the tests?

akinomyoga avatar Jun 14 '24 11:06 akinomyoga

Added a test named run-ps1lf. If this test is run on the bash without the patch, the following differences are shown:

run-ps1lf
1,3c1
< bash: command substitution: line 1: syntax error near unexpected token `)'
< bash: command substitution: line 1: `echo foo)'
<
---
> foo
5,7c3
< bash: command substitution: line 1: syntax error near unexpected token `)'
< bash: command substitution: line 1: `echo foo)'
<
---
> foo

On the patched version, only the test name is shown:

run-ps1lf

k-takata avatar Oct 02 '24 08:10 k-takata

@lazka Rebased on top of the latest master branch. I think this is ready for merging.

k-takata avatar Feb 10 '25 07:02 k-takata