purescript-language-cst-parser
let SourcePos include start and end in characters
Line numbers and column positions are great for humans, but when indexing source code for automatic editing, I find it's easier to interface with the file or string using an index.
Is this within the scope of this project?
I think it would be valuable to track index by code unit. The positions now are line/col by code point. I'm not sure anyone wants index by code point.
I have not worked much with Unicode strings in a while, so let me try to understand the difference between a code unit and a code point.
One code point can consist of multiple code units? You always know how many bits a unit is.
In that case, yes, it's code units I'm interested in, and less so code points.
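To make the unit/point distinction concrete, here is a minimal sketch in TypeScript, chosen only because JavaScript strings are indexed by UTF-16 code units. The helper name `countCodePoints` is made up for illustration and is not part of this library.

```typescript
// Count Unicode code points in a UTF-16 string by treating each
// high-surrogate/low-surrogate pair as a single code point.
// (Hypothetical helper, for illustration only.)
function countCodePoints(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
    const isLowNext = next >= 0xdc00 && next <= 0xdfff;
    if (isHigh && isLowNext) {
      i++; // skip the low surrogate: the pair is one code point
    }
    count++;
  }
  return count;
}

// "a", U+1F600 (written as a surrogate pair), "b"
const sample = "a\uD83D\uDE00b";

const codeUnits = sample.length;            // 4 code units (16 bits each)
const codePoints = countCodePoints(sample); // 3 code points
```

So a code unit always has a fixed width (16 bits here), while one code point outside the Basic Multilingual Plane takes two of them. Slicing a string by index works in code units, which is why an index by code unit is the useful one for editing.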
I think it would be fine to add an index/offset field to https://github.com/natefaubion/purescript-language-cst-parser/blob/main/src/PureScript/CST/Types.purs#L18-L21
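A rough sketch of what that could look like, written in TypeScript rather than PureScript and assuming a position record with `line` and `column` counted by code point. The names `SourcePosWithOffset` and `advance` are hypothetical, and the zero-based numbering is an assumption, not necessarily what the library does.

```typescript
// Hypothetical position record: line/column by code point,
// plus an offset counted in UTF-16 code units.
interface SourcePosWithOffset {
  line: number;
  column: number;
  offset: number;
}

// Advance a position over a chunk of consumed text.
// Assumes "\n" is the only line terminator (a simplification).
function advance(pos: SourcePosWithOffset, text: string): SourcePosWithOffset {
  let { line, column, offset } = pos;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    const isPair = isHigh && next >= 0xdc00 && next <= 0xdfff;
    const width = isPair ? 2 : 1; // code units used by this code point

    if (unit === 0x0a) {
      line++;
      column = 0;
    } else {
      column++; // one column per code point
    }
    offset += width; // offset moves by code units
    i += width - 1;  // skip the low surrogate of a pair
  }
  return { line, column, offset };
}

const origin: SourcePosWithOffset = { line: 0, column: 0, offset: 0 };
const src = "a\uD83D\uDE00\nb";

// Positions around the final token "b".
const start = advance(origin, "a\uD83D\uDE00\n");
const end = advance(start, "b");

// With offsets available, the token text is a plain slice.
const tokenText = src.slice(start.offset, end.offset);
```

With a field like this, an editing tool could splice replacements into the original string directly instead of converting line/column pairs back into an index.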