Ability to use cartesian positions directly
Hello,
Unless I cannot find the correct API, it looks like it is only possible to create diagnostics using byte indexes within the source. What I currently have access to is only cartesian positions, so I have to open the file and transform that line-offset into byte index. Then, the renderer re-opens the file to do the printing, and to transform it back to cartesian coordinates.
Would it be possible add an API to create Ranges from cartesian coordinates?
I'm happy to do a PR if you are ok with this, it will take time because I have very limited time right now.
Hey 👋
Unless I cannot find the correct API, it looks like it is only possible to create diagnostics using byte indexes within the source.
Yes, only byte-oriented positions are supported at the moment.
What I currently have access to is only cartesian positions
What parsing library are you using? Both Menhir and Sedlex will be able to give you byte-oriented positions
so I have to open the file and transform that line-offset into byte index. Then, the renderer re-opens the file to do the printing, and to transform it back to cartesian coordinates.
If its any better, I could easily make a patch to avoid re-reading the file for line offsets
by allowing the pp_diagnostic functions to take a custom line_starts_fn when
configuring the Source_reader, see here for more details.
I'm happy to do a PR if you are ok with this, it will take time because I have very limited time right now.
Contributions are definitely welcome! Though it might be easier to try recover the original byte offsets you might've had.
What parsing library are you using? Both Menhir and Sedlex will be able to give you byte-oriented positions
I did some digging -- I assume this issue is related to your contribution on https://github.com/soteria-tools/soteria/pull/69.
Looking at it, it seems you're using the Cerberus parser (which uses Menhir under the hood). The
function you'll want to convert Cerb_position.t to Grace's Byte_index.t is:
let cerb_pos_to_idx cpos = Byte_index.of_lex (Cerb_position.to_file_lexing cpos)
Oh thank you for looking into this!! I appreciate your time :)
Looking at it, it seems you're using the Cerberus parser (which uses Menhir under the hood). The function you'll want to convert Cerb_position.t to Grace's Byte_index.t is:
I did try that before writing my byte-index retrieval, but unfortunately Cerberus doesn't parse the C source file, it parses the file that the C preprocessor outputs. If one tries to us Byte_index.of_lex, it just yields an out-of-bound exception.
E.g. take the C file
#include <stdlib.h>
int main() { return *NULL }
The preprocessor will add the entirety of stdlib.h (several hundred lines) before that, and the byte_index of *NULL will be a few thousands.
Cerberus keeps track of the correct line of code before pre-processing, though it might keep track of the wrong column number if there is a preprocessing happening within a line but heh 🤷 Anyway, one cannot trust the byte-offset given by Cerberus's lexer :(
If its any better, I could easily make a patch to avoid re-reading the file for line offsets by allowing the
pp_diagnosticfunctions to take a customline_starts_fnwhen configuring theSource_reader, see here for more details.
Definitely! That being said, printing error messages is currently not a bottleneck of our tool at all, so no pressure on our end to investigate.
I was thinking about this, what if we parameterized the range type in diagnostics e.g.
module type Range = sig
type t
include Pretty_printer.S with type t := t
(** Able to use [Source_reader] in here. *)
val to_range : t -> Range.t
end
This would allow you to use a custom range representation and only need the translation when printing?
Very interesting you should say that! 2 days ago I made the Range notion in our tool a parameter type!
At first, I tried using Grace's notion of range, but I quickly realised that no notion of range is perfect, and all tools will have their own kind of range. Anyway, I fully support this :)