rust icon indicating copy to clipboard operation
rust copied to clipboard

[grep] Empty lines not matching when -v flag used

Open targrub opened this issue 2 years ago • 6 comments

The problem description says the -v flag should collect all lines that fail to match the pattern. An empty line does not match a pattern of, say, "Of". Yet the test case test_one_file_several_matches_inverted_flag does not match the final line in the file paradise_lost.txt, which is an empty one. Similar issue in test case test_one_file_several_matches_inverted_and_match_entire_lines_flags.

Preferred solution: either remove the blank line from each of the test texts in grep.rs, or add an empty string to the expected results of the tests that ought to have them. Not preferred solution: leaving tests as-is and documenting that empty lines should not be treated as matches when the inverted flag is used. Either preferred solution would be consistent with the documentation and isn't that grep's regular practice?

Note: there may be other test cases involving the -v flag that are similarly incorrect. Also, I am willing to submit a PR for this issue if one of the preferred solutions is decided on.

targrub avatar Jul 25 '22 13:07 targrub

The intention is that there should not be any empty lines in any of the contents. There are not currently any as of the current code in https://github.com/exercism/rust/blob/main/exercises/practice/grep/tests/grep.rs.

But, we must have a terminating newline at the end of a file because the standard definition of a line given in https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap03.html#tag_03_206 is that it must end in a newline. The fact that the file's last character in a newline must not be interpreted to mean that there is an additional empty line after the non-empty lines. Lines end in newlines, rather than being separated by newlines. If we removed that newline, we would be non-compliant.

Even our example solution in https://github.com/exercism/rust/blob/main/exercises/practice/grep/.meta/example.rs gets this wrong. It does a split('\n') and then works around its own mistake by doing a filter(|(_, line)| !line.is_empty()). In an ideal world, the example solution should be corrected to remove this mistake, but that isn't required for solving this issue.

If the decision is to remove the trailing newline and therefore become non-compliant, I won't stand in the way of that decision, though it wouldn't be preferred.

An empty line must not be added to the tests, since one doesn't exist in the input files.

petertseng avatar Jul 26 '22 05:07 petertseng

I do think some clarity would be good here. So perhaps to note in the tests that because lines are terminated by newlines instead of separated by newlines, the trailing newline does not mean there is an empty line.

petertseng avatar Jul 26 '22 05:07 petertseng

Completely agree with all you said, Peter. And now I learned about the existence of lines().

targrub avatar Jul 26 '22 17:07 targrub

Thanks @targrub for bringing up this issue and @petertseng for resolving it!

Is there anything unresolved here for which we need to leave this issue open, or can we close it now?

coriolinus avatar Jul 26 '22 19:07 coriolinus

mainly it would be adding an explanatory comment to the test file about the trailing newline, or convincing ourselves that such a thing is not necessary

petertseng avatar Jul 26 '22 21:07 petertseng

I think that an explanatory comment makes sense.

coriolinus avatar Jul 26 '22 21:07 coriolinus

I think explaining how lines work in grep and elsewhere is outside the scope of this exercise. If anything, such an explanation might be added upstream in problem-specifications.

senekor avatar Dec 15 '23 20:12 senekor