syntect Abandon `nonewlines` for v6.0.0

Practically since the beginning, syntect has had both newlines and nonewlines.

However, it seems to me as if nonewlines is more of a problem than a solution (see e.g. #97). It was called a 'hack' in https://github.com/trishume/syntect/issues/196#issue-347643743.

So, why not remove nonewlines for the v5.0.0 release (whenever that will be)? I would happily make a PR for that. I guess deprecating this for the next v4 release would be a good first step.

I deeply apologize in advance for any ignorance on my part. I'm sure I'm missing something crucial.

Nov 10 '21 06:11 Enselic

Yah seems plausible that this would be net good. I mainly wanted to have it around for use by people who already had strings without a newline on the end, and couldn't easily change. It does cause a lot of mysterious highlighting issues though.

I wonder if there's a way to just make it more annoying or clear that if you choose to use it you better be real sure you're not better off just adding newlines, and that you're opening yourself up to mysterious highlighting issues even though it'll mostly work.

Nov 13 '21 22:11 trishume

Thank you for providing additional background information.

What I still don't quite understand is why someone would want buggy highlighting at all. Making sure lines includes \n seems to me to be a very low price to pay for getting highlighting that works. Especially considering that there is stable Rust APIs that preserves \n (and that we recommend ourselves in the docs). It is of course very possible that there is a use case that I just don't consider. I would love to be enlightened.

I do however think you make a very strong point in that it would be great if it was easy to do the right thing (include \n) and hard to do the wrong thing (not include \n). That holds true even if we remove the nonewlines APIs. For example, we could introduce a zero-cost type LineWithNewline instead of using &str. With appropriate scaffolding, it would be very hard to mistakenly not include \n in lines, and very easy to do include it. There are also other venues that could be explored. Such as an opt-in cargo feature that removes a runtime check for the presence of \n.

I think removing nonewlines APIs in addition to making correct use of the API more type-safe has great long term benefits for both users of and contributors to syntect.

Users would:

be confronted with a simpler API with less tricky (for them) tradeoff choices
not have to encounter nonewlines specific bugs
Not have their binary bloated by both the nonewlines and newlines dumps if enabling default-syntaxes

Contributors would:

not have to spend time maintaining nonewlines code paths
not have to spend time triaging bugs caused by nonewlines
not have to spend time debugging nonewlines-specific problems

So, what would the next steps be here? I would like to propose the following:

We come up with and deploy/publish/commit a type safe API to help enforce correct use of the API (something along the lines of what I proposed above)
Once that is done and established, we deprecate nonewlines APIs
We eventually remove nonewlines APIs

Do you think this sounds like a good plan? If yes, I would be happy to explore step 1 above and come up with Draft PRs that we can discuss further.

I hope I don't come across as unreasonable. I am very open to be convinced of that nonewlines should stay. I very much appreciate us discussing this.

(As a bonus, we could also take care of https://github.com/trishume/syntect/issues/314 as part of this work.)

Nov 16 '21 13:11 Enselic

Seems like lazy-loading changes require us to release 5.0.0, so I am tentatively setting 6.0.0 as target for this work since this is not a small task.

Jan 03 '22 09:01 Enselic

syntect syntect copied to clipboard

Abandon `nonewlines` for v6.0.0

syntect
syntect copied to clipboard