bats-support
bats-support copied to clipboard
How Should `batslib_count_lines` Handle Trailing Newlines?
Currently, batslib_count_lines
is written to try and ignore the last newline to be equivalent to it not being there.
This leads ''
and '\n'
as both being "0 line". Similarly, 'a'
and 'a\n'
will both be considered as "1 line". This is explicitly checked in tests.
Issue
However, when using run --keep-empty-lines
, assertions will check trailing newlines. However, this leads batslib_is_single_line
to still return 0, and the error messages will be printed by batslib_print_kv_single
instead of batslib_print_kv_multi
(which shows the line numbers, a helpful indicator for debugging).
Example (using bats-assert):
@test "trailing empty line" {
run --keep-empty-lines printf 'a\n'
assert_output "$(printf 'a')"
}
prints:
-- output differs --
expected : a
actual : a
--
The new line is there, but it's hard to really notice. More egregiously, a multi-line output will be very misleading. Example:
@test "trailing empty line on multiline" {
run --keep-empty-lines printf 'a\nb\n'
assert_output "$(printf 'a\nb')"
}
prints:
-- output differs --
expected (2 lines):
a
b
actual (2 lines):
a
b
--
This both hides the extra new line and states that the line counts are the same.
Discussion
I'm writing out this GH Issue (and not just sending some PR(s)) because there's a few ways that this could possibly be addressed. I wanted to chat this over and see which way you'd think would be best. I'm happy to put together a PR for wherever we land.
Here's some options that I've considered, feel free to suggest more:
1. Update the impl for batslib_count_lines
.
This could be done by changing
batslib_count_lines() {
local -i n_lines=0
local line
while IFS='' read -r line || [[ -n $line ]]; do
(( ++n_lines ))
done < <(printf '%s' "$1")
echo "$n_lines"
}
to
batslib_count_lines() {
local -r input="$1"
local -i n_lines=0
local line
if [ -n "$input" ]; then
while IFS='' read -r line; do
(( ++n_lines ))
done < <(printf '%s' "$input")
# Increment for the EOF (not captured by the loop above).
(( ++n_lines ))
fi
echo "$n_lines"
}
There are some related extra changes that are necessary with this approach. batslib_prefix
needs to be updated in a similar way (else batslib_print_kv_multi
in batslib_print_kv_single_or_multi
will print the incorrect number of lines).
Potential impl change from
batslib_prefix() {
local -r prefix="${1:- }"
local line
while IFS='' read -r line || [[ -n $line ]]; do
printf '%s%s\n' "$prefix" "$line"
done
}
to
batslib_prefix() {
local -r prefix="${1:- }"
local line=''
local -i lines_read=0
while IFS='' read -r line; do
printf '%s%s\n' "$prefix" "$line"
(( ++lines_read ))
done
# Print for the line with EOF (not captured by the loop above).
# Do not print if stdin was completely empty.
if (( lines_read > 0 )) || [[ -n "$line" ]]; then
printf '%s%s\n' "$prefix" "$line"
fi
}
batslib_mark
would likely also need to be updated in a similar way (these 3 functions all use while IFS='' read -r line || [[ -n $line ]]; do
).
While I think this is pretty straightforward (and my personal preference), I understand that this may have backwards compatibility issues and may have implications for other functionality built upon this function (which may really care about ignoring trailing newlines).
I can see that bats-assert makes use of these 3 functions in refute_line
. The suggested changes to batslib_prefix
and batslib_mark
wouldn't really change anything. Though the indirect change to batslib_is_single_line
(via the change to batslib_count_lines
) might lead to some more calls being made to batslib_print_kv_multi
instead of batslib_print_kv_single
. I'd suspect that wouldn't affect most cases, as the trailing newline is removed unless explicitly forced to be kept (which is likely desirable).
2. Add --keep-empty-lines
(or similar) to batslib_count_lines
.
This option keeps the old functionality and allows the caller to change into tracking all newlines. This could either be done as opt-in with --keep-empty-lines
or --include-trailing-newline
; or it could be opt-out, where the default is to include it and --ignore-trailing-newline
is added.
This option (specifically the opt-in version) is more backwards compatible with existing logic. However, it does require adding/piping the flag through various functions, which can be a bit verbose.
Concerns about differentiating --keep...
from intended input can be worked-around via there should always be an even number of positional arguments to these specific functions.
3. Create alternative batslib_count_lines
function.
In this option, a new function (e.g. batslib_count_all_lines
) can be added to use the new logic.
This is very backwards compatible, as it does not touch the existing batslib_count_lines
function. However, it would require doing the same for other functions like batslib_is_single_line
, batslib_print_kv_multi
, batslib_print_kv_single_or_multi
, etc.
Those are the suggestions that I have thus far. As I mentioned, I think option 1 is the best option. Let me know what you think.
Thanks!
P.S. There's a psudo-related TODO(ztombol) on batslib_count_lines
about fixing inconsistencies vs ${#lines[@]}
. This isn't really the same issue, but it's somewhat related. Thought I'd mention it.
As another alternative option:
4. Auto-detect which logic to use.
batslib_count_lines
could potentially auto-detect whether to keep empty lines or not based on the input string. If the input variable has trailing newline(s), then they must have been manually kept (via something like --keep-empty-lines
).
This could be checked by something like [[ "${input: -1}" = $'\n' ]]
.
If there is a newline, then use the logic given in Option 1. If there is no newline, use the existing logic.
I think this might fix the issue and be backwards compatible.
While this option offers a more seamless approach for batslib_count_lines
, there still remains the issue of batslib_prefix
(which affects batslib_count_lines
in batslib_print_kv_single_or_multi
). To address this issue, either Option 1 (always print trailing new line if non-empty string) or Option 2 (add a --keep-empty-lines
option) would be a fine choice to update batslib_prefix
. Option 2 is likely better for this scenario, as batslib_print_kv_single_or_multi
can optionally pass it for each value based on the same auto-detect logic ([[ "${pairs[$i]: -1}" = $'\n' ]]
).
Again, batslib_mark
should be updated in the same way as batslib_prefix
.