Improvements to `unfill` and `refill` from the runwrap crate
In #224, @veikman mentioned his new runwrap library, which contains this comment:
/// Preserve initial indentation on unwrapping.
/// This is a workaround for textwrap’s tendency to interpret non-alphanumeric leading characters
/// as indentation (e.g. comment syntax) and destroy it. What textwrap calls “subsequent_indent” is
/// destroyed without comment.
I would love to hear more and discuss how we can improve this.
A few weeks ago, I downloaded all public crates which depend on Textwrap and none of them use unfill or refill. In other words, these functions are pretty new and unproven. I'm sure we can make them better!
I found the behaviour in question when I ran textwrap on an ordinary Markdown heading. There is a unit test case for this here. In that particular use case (Markdown), I don’t think of leading hash characters as indentation, but I suppose that could be a matter of opinion.
I imagine it would be possible to expand the Options struct with limitations on what can be considered indentation for the purpose of unfilling, but I don’t yet know enough about the problem to specify how, and I’m not sure it’s important. The workaround does what it should for now.
Thanks for the explanation! So the problem is that unfill sees
# A heading
as "A heading" with "# " as initial_indent. That does indeed seem rather silly :smile:
I guess it would work much better if we restrict the heuristic a bit:
- only look for initial and subsequent indentation in multi-line strings. That should prevent a lot of misinterpretations.
- perhaps only set `initial_indent` if it is equal to what we detect as `subsequent_indent`?
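To make the two restrictions concrete, here is a rough, standalone sketch of what such a heuristic could look like. It is not textwrap's actual implementation: the names `leading_prefix`, `common_prefix` and `detect_indents` are made up for illustration, and it assumes that "indentation" means the leading run of non-alphanumeric characters, as described in the runwrap comment.

```rust
/// Hypothetical helper (not part of textwrap): the leading run of
/// non-alphanumeric characters on a line.
fn leading_prefix(line: &str) -> &str {
    let end = line
        .char_indices()
        .find(|(_, c)| c.is_alphanumeric())
        .map(|(i, _)| i)
        .unwrap_or(line.len());
    &line[..end]
}

/// Hypothetical helper: the longest prefix of `a` that is also a prefix of `b`.
fn common_prefix<'a>(a: &'a str, b: &str) -> &'a str {
    let end = a
        .char_indices()
        .zip(b.chars())
        .find(|((_, x), y)| x != y)
        .map(|((i, _), _)| i)
        // No mismatch: the shorter string is a prefix of the longer one.
        .unwrap_or_else(|| a.len().min(b.len()));
    &a[..end]
}

/// Sketch of the restricted heuristic. Returns
/// `(initial_indent, subsequent_indent)` as guessed from `text`.
fn detect_indents(text: &str) -> (&str, &str) {
    let mut lines = text.lines();
    let first = match lines.next() {
        Some(line) => line,
        None => return ("", ""),
    };
    let rest: Vec<&str> = lines.collect();

    // Restriction 1: never guess indentation for single-line input.
    if rest.is_empty() {
        return ("", "");
    }

    // The subsequent indent is the prefix shared by all lines after the first.
    let mut subsequent = leading_prefix(rest[0]);
    for line in &rest[1..] {
        subsequent = common_prefix(subsequent, leading_prefix(line));
    }

    // Restriction 2: only accept an initial indent that matches the
    // subsequent indent.
    let initial = if leading_prefix(first) == subsequent {
        subsequent
    } else {
        ""
    };
    (initial, subsequent)
}
```

With this sketch, `detect_indents("# A heading")` returns two empty strings, so the Markdown heading is left alone, while `detect_indents("// foo\n// bar")` still reports `"// "` for both indents. Whether the second restriction is too strict (it also drops an initial indent such as a list marker that differs from the following lines) is part of what would need discussing.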