coreutils

ptx: special char breaks it with "thread 'main' panicked at 'assertion failed: end <= s.len()', src/uu/ptx/src/ptx.rs:291:5"

Open sylvestre opened this issue 4 years ago • 6 comments

With foo.txt containing:

it’s disabled

The ’ char is key here.

ptx -G foo.txt
thread 'main' panicked at 'assertion failed: end <= s.len()', src/uu/ptx/src/ptx.rs:291:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

sylvestre, Apr 07 '21 20:04

So the root cause seems to be that read_input produces a FileContent that treats each string as a Vec<char>, while the Regex used in create_word_set exposes match locations as byte indices.

The ’ is a multi-byte UTF-8 char. It's a single char, but it's several bytes long, so it breaks some assumptions.

I will see if I can fix that while keeping the optimizations recently introduced. I plan to work on that in the next few days.

ttrunck, Apr 08 '21 20:04
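
To illustrate the mismatch described above, here is a minimal sketch using only the standard library. It is not the code from ptx.rs: `str::find` stands in for the regex match, and `byte_to_char_index` is a hypothetical helper introduced only for this example. It shows that a byte offset can exceed the length of the `Vec<char>` built from the same line, which is the kind of condition the failing `end <= s.len()` assertion guards against.

```rust
/// Hypothetical helper (not part of ptx.rs): convert a byte offset into `s`
/// to a char index. Returns None if the offset is out of range or falls
/// inside a multi-byte char.
fn byte_to_char_index(s: &str, byte_offset: usize) -> Option<usize> {
    if byte_offset > s.len() || !s.is_char_boundary(byte_offset) {
        return None;
    }
    Some(s[..byte_offset].chars().count())
}

fn main() {
    let line = "it’s disabled";
    let chars: Vec<char> = line.chars().collect();

    // Stand-in for a regex match: byte offsets of the word "disabled".
    let start = line.find("disabled").expect("word is present");
    let end = start + "disabled".len();

    println!("bytes: {}, chars: {}", line.len(), chars.len());
    println!("byte range: {}..{}", start, end);

    // Using `end` directly against the Vec<char> would be out of bounds,
    // because `end` counts bytes while `chars.len()` counts chars.
    assert!(end > chars.len());

    // Converting the byte offsets first gives a valid char range.
    let cs = byte_to_char_index(line, start).unwrap();
    let ce = byte_to_char_index(line, end).unwrap();
    let word: String = chars[cs..ce].iter().collect();
    println!("char range: {}..{} -> {}", cs, ce, word);
}
```

Converting offsets like this costs a scan per lookup, so it is only meant to make the problem visible, not to suggest how the fix should be implemented.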

I have an idea on how to fix this: use iterators instead of Vec<char>. Trimming whitespace, for example, could be done simply with skip_while. Trimming broken words is more complicated, as we need to check for whitespace beyond the edge of the string. We need our own type that is like std::str::Chars but also keeps the character beyond the beginning/end of the iterator. For jumping backward and trimming from the right, we can leverage std::str::Chars being a DoubleEndedIterator (UTF-8 can be walked backward too, I just discovered).

@sylvestre I've been making test cases. Should I add Unicode cases too or skip it until this is fixed? (adding it would cause failures even for things unrelated to ptx - I worry that might be annoying).

wishawa, Apr 09 '21 17:04
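
The whitespace-trimming part of the iterator idea above could look roughly like the sketch below. It is an assumption-laden illustration, not the proposed patch: `trim_range` is a hypothetical name, it works on byte offsets via `char_indices` (which, like `Chars`, is a `DoubleEndedIterator`), and it does not attempt the harder "broken word" trimming that needs to look beyond the edges of the range.

```rust
/// Hypothetical helper: given a byte range `start..end` of `s` (both on char
/// boundaries), return the byte range with surrounding whitespace trimmed.
fn trim_range(s: &str, start: usize, end: usize) -> (usize, usize) {
    let slice = &s[start..end];

    // Walk forward and skip leading whitespace.
    let new_start = slice
        .char_indices()
        .skip_while(|(_, c)| c.is_whitespace())
        .map(|(i, _)| start + i)
        .next()
        .unwrap_or(end);

    // Walk backward and skip trailing whitespace. The new end is the byte
    // just past the last non-whitespace char, hence `len_utf8`.
    let new_end = slice
        .char_indices()
        .rev()
        .skip_while(|(_, c)| c.is_whitespace())
        .map(|(i, c)| start + i + c.len_utf8())
        .next()
        .unwrap_or(new_start);

    (new_start, new_end)
}

fn main() {
    let line = "  it’s disabled  ";
    let (b, e) = trim_range(line, 0, line.len());
    println!("{}..{} -> {:?}", b, e, &line[b..e]);
}
```

Because the result stays in byte offsets, it can be compared directly with regex match positions without going through a Vec<char>.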

@wishawa are you working on your idea? You have more context than me, so if you are taking care of that I will work on something else.

ttrunck, Apr 11 '21 15:04

> @wishawa are you working on your idea? You have more context than me, so if you are taking care of that I will work on something else.

No. I'm still working on the tests (and am slow at that lol). Feel free to go ahead and work on this one😃

wishawa, Apr 12 '21 01:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Apr 12 '22 05:04

It is still happening. As I don't think @wishawa is still working on it, others can work on it!

sylvestre, Apr 12 '22 06:04