
`show :: SourceRange -> String` is ambiguous

Open lehmacdj opened this issue 3 years ago • 2 comments

For admittedly uncommon filenames and degenerate ranges, the implementation of show :: SourceRange -> String is ambiguous: two different ranges can produce the same string. Consider the following two examples:

λ> import Commonmark.Types
λ> import Text.Parsec.Pos
λ> show $ SourceRange [(newPos "asdf" 0 0, newPos "asdf" 0 0), (newPos "asdf" 0 0, newPos "" 0 0)]
"asdf@0:0-0:0;0:0-@0:0"
λ> show $ SourceRange [(newPos "asdf" 0 0, newPos "0:0;0:0-" 0 0)]
"asdf@0:0-0:0;0:0-@0:0"

I discovered this while writing a parser for such ranges. This is probably sufficiently degenerate that it doesn't necessarily need to be fixed, but it should be possible to escape the filename without complicating the implementation or impacting performance too much.
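To make the collision easier to see without pulling in any packages, here is a minimal model of the rendering. It is a sketch inferred from the two outputs above, not the library's code: `Pos`, `render`, `twoRanges`, and `oneRange` are hypothetical names, and `Pos` is a plain tuple standing in for parsec's `SourcePos`. The assumption is that a file name is only printed, followed by `@`, when it differs from the previously printed one.

```haskell
-- Hypothetical stand-in for a source position: (file name, line, column).
type Pos = (String, Int, Int)

-- Render a list of ranges. A file name is emitted (followed by '@') only
-- when it differs from the name that was current just before it.
render :: [(Pos, Pos)] -> String
render = go ""
  where
    go _ [] = ""
    go cur (((f1, l1, c1), (f2, l2, c2)) : rest) =
      prefix cur f1 ++ lc l1 c1 ++ "-" ++ prefix f1 f2 ++ lc l2 c2
        ++ (if null rest then "" else ";" ++ go f2 rest)
    prefix prev name = if name /= prev then name ++ "@" else ""
    lc l c = show l ++ ":" ++ show c

-- The two inputs from the examples above.
twoRanges, oneRange :: [(Pos, Pos)]
twoRanges = [(("asdf", 0, 0), ("asdf", 0, 0)), (("asdf", 0, 0), ("", 0, 0))]
oneRange  = [(("asdf", 0, 0), ("0:0;0:0-", 0, 0))]
```

In this model, `render twoRanges == render oneRange` evaluates to `True`: the unescaped file name 0:0;0:0- reads exactly like the tail of one range followed by the start of another.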

I would suggest simply enforcing that filenames containing : or - are printed escaped, as show normally does for strings. That is, the second range here would instead be shown as the string "asdf@0:0-\"0:0;0:0-\"@0:0". The first case would remain unchanged.

lehmacdj, Mar 05 '22 21:03

I suppose then we'd still have an ambiguity involving filenames that start and end with ", wouldn't we? Maybe another approach would be to percent-escape : and - (and I suppose %)?

jgm, Mar 05 '22 21:03

I was imagining that filenames starting with a quote would themselves be escaped as in string literals, so "asdf@0:0-\"\\\"\"@0:0" would be the representation of SourceRange [(newPos "asdf" 0 0, newPos "\"" 0 0)].
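A minimal sketch of that quoting rule, assuming the trigger characters are just `:`, `-`, and `"` as discussed here. `quoteName` is a hypothetical helper, not something the library exports.

```haskell
-- Hypothetical helper: render a file name through `show` (Haskell string
-- literal syntax) only when it contains a character that would otherwise
-- be confused with the range syntax or with the quoting itself.
quoteName :: String -> String
quoteName name
  | any (`elem` ":-\"") name = show name
  | otherwise = name
```

With this, `putStrLn (quoteName "0:0;0:0-")` prints the name wrapped in double quotes, while `quoteName "asdf"` returns the name unchanged. A parser for the format would then need to understand Haskell string literal escapes.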

Percent escapes seem a lot simpler and more concise, and their specification doesn't depend on the specification of Haskell, so that seems better.
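As a rough illustration of the percent-escaping idea, the sketch below escapes only the three characters named in this thread (`:`, `-`, and `%`) as two uppercase hex digits. The names `percentEscape` and `percentUnescape` are hypothetical, and the exact character set and hex casing are assumptions rather than what any eventual PR does.

```haskell
import Data.Char (chr, digitToInt, intToDigit, isHexDigit, ord, toUpper)

-- Replace ':', '-' and '%' with '%' followed by two uppercase hex digits.
percentEscape :: String -> String
percentEscape = concatMap esc
  where
    esc c
      | c `elem` ":-%" = '%' : hex2 (ord c)
      | otherwise = [c]
    hex2 n = map (toUpper . intToDigit) [n `div` 16, n `mod` 16]

-- Inverse of percentEscape: decode every "%XY" where X and Y are hex digits.
percentUnescape :: String -> String
percentUnescape ('%' : a : b : rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : percentUnescape rest
percentUnescape (c : rest) = c : percentUnescape rest
percentUnescape [] = []
```

Under these assumptions the troublesome file name 0:0;0:0- becomes 0%3A0;0%3A0%2D, so it no longer contains a bare : or -, and `percentUnescape` recovers the original name. Because `%` itself is escaped, a name that already contains a percent sign round-trips as well.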

Will put together a small PR with that behavior.

lehmacdj, Mar 05 '22 21:03