ion-java
Expose the current line number when reading Ion text
I have not been able to find a supported IonReader API that returns an accurate line number when reading Ion text.
There is the SpanProvider, but it must not be meant for this purpose because it doesn't provide accurate line and column numbers.
Is there another way to do this that I'm not aware of?
The work-around for the time being is a dirty hack such as this one (written in Kotlin):
private fun dirtyHackToGetTheLineNum(): Long {
    // Attempt to get the line number using the dirty hack first (this way provides accurate line numbers)
    try {
        // "reader" is an IonReader
        val scannerField = reader.javaClass.superclass.superclass.getDeclaredField("_scanner")
        scannerField.isAccessible = true
        val scanner = scannerField.get(reader)
        val lineCountField = scanner.javaClass.getDeclaredField("_line_count")
        lineCountField.isAccessible = true
        val lineNum = lineCountField.get(scanner) as Long
        return lineNum
    } catch(_: NoSuchFieldException) {
        return -1
    }
}
reader has been initialized thusly:
val inputStream = FileInputStream(...)
val reader = ionSystem.newReader(inputStream)
SpanProvider is indeed the correct API. If a resulting TextSpan is not providing accurate line numbers, please provide a failing test case, or at least an example. I am skeptical of the claim since my higher-level tool gives me line numbers every day and I've not noticed a problem.
BTW the easiest way to get a TextSpan is via Spans.currentSpan(TextSpan.class, reader)
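For readers following along in Kotlin, the call above might look roughly like the sketch below. It assumes the com.amazon.ion package names of recent ion-java releases (older releases used a different package prefix), and the sample input and the startLines helper are made up for illustration.

```kotlin
import com.amazon.ion.TextSpan
import com.amazon.ion.system.IonSystemBuilder
import com.amazon.ion.util.Spans

// Hypothetical helper: collects the reported start line of each top-level value.
fun startLines(ionText: String): List<Long?> {
    val ionSystem = IonSystemBuilder.standard().build()
    val lines = mutableListOf<Long?>()
    ionSystem.newReader(ionText).use { reader ->
        while (reader.next() != null) {
            // The reader is positioned on a value here, so a span can be requested.
            // The result is null if the reader cannot supply a TextSpan.
            val span: TextSpan? = Spans.currentSpan(TextSpan::class.java, reader)
            lines.add(span?.startLine)
        }
    }
    return lines
}

fun main() {
    // Made-up sample input: three top-level values on separate lines.
    println(startLines("1\n\"two\"\n{three: 3}"))
}
```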
There are two problems with using SpanProvider.
- If the reader is not currently "on a value" then an exception is thrown when an attempt is made to fetch a Span, however there is no reason the value of that _line_count variable that my dirty hack reads can't be returned (possibly from a different API method) except that the API doesn't expose it. I just want the reader to tell me how many lines it has counted thus far--I don't care if I've just opened the IonReader or if I've called IonReader.stepIn() but haven't called IonReader.next() or not. (Either condition will cause an IllegalStateException.)
- The accuracy issue actually seems like a bug to me. Here is my reproduction case. A couple of observations about the output of my program:
  - The line number inaccuracy seems to occur after reading the second field in the struct. Afterward, the line number is off by one for every field.
  - span.finishLine is always -1 for some reason.
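Until the API changes, the first problem can at least be contained without reflection. The following is only a sketch: startLineOrNull is a hypothetical extension function, not part of ion-java, and it assumes the com.amazon.ion package names of recent releases. It returns null instead of throwing when the reader is not on a value, so it does not answer the "how many lines so far" question.

```kotlin
import com.amazon.ion.IonReader
import com.amazon.ion.TextSpan
import com.amazon.ion.system.IonSystemBuilder
import com.amazon.ion.util.Spans

// Hypothetical helper: start line of the current value, or null if unavailable.
fun IonReader.startLineOrNull(): Long? =
    try {
        Spans.currentSpan(TextSpan::class.java, this)?.startLine
    } catch (e: IllegalStateException) {
        // Per the report above, thrown when the reader is not positioned on a value.
        null
    }

fun main() {
    val reader = IonSystemBuilder.standard().build().newReader("a b")
    println(reader.startLineOrNull()) // reader has not been advanced yet
    reader.next()
    println(reader.startLineOrNull()) // reader is now on the first value
}
```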
Line counting is tricky. The private _line_count can't be exposed with any robust semantics, because in some cases the parser/lexer needs to read further ahead than the cursor. For example, if we encounter the text sequence ('a' we have to keep reading to see if the next token is :: and that can be far ahead, making the internal _line_count completely wrong with respect to the symbol. Or at least it means something else that doesn't seem particularly useful.
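The lookahead effect can be reproduced with a toy scanner. This is a simplified illustration only, not ion-java's lexer: ToyScanner, Token, and nextSymbol are invented names, and the scanner handles nothing but unquoted symbols and whitespace.

```kotlin
// Toy scanner, NOT ion-java's implementation: it only illustrates lookahead.
class ToyScanner(private val text: String) {
    private var pos = 0

    // Lines consumed so far by the scanner, including any lookahead.
    var lineCount = 1L
        private set

    private fun peek(): Char? = text.getOrNull(pos)

    // Consumes one character; callers check peek() first.
    private fun read(): Char {
        val c = text[pos++]
        if (c == '\n') lineCount++
        return c
    }

    /** Reads one symbol; reports the line it started on and whether "::" follows. */
    fun nextSymbol(): Token {
        while (peek()?.isWhitespace() == true) read()
        val startLine = lineCount
        val name = StringBuilder()
        while (peek()?.isLetterOrDigit() == true) name.append(read())
        // Lookahead: whitespace, including newlines, may separate the symbol from "::".
        while (peek()?.isWhitespace() == true) read()
        val isAnnotation = text.startsWith("::", pos)
        return Token(name.toString(), startLine, isAnnotation)
    }
}

data class Token(val text: String, val startLine: Long, val isAnnotation: Boolean)

fun main() {
    val scanner = ToyScanner("a\n\n\n::b")
    val token = scanner.nextSymbol()
    // prints: token starts on line 1, scanner is on line 4
    println("token starts on line ${token.startLine}, scanner is on line ${scanner.lineCount}")
}
```

Here the symbol starts on line 1, but by the time the scanner knows it is an annotation its own counter has already moved to line 4, which is why a raw counter and a per-value span can disagree.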
If the cursor isn't positioned on a value, what line number would be returned? There can be any amount of space between (and inside!) values, so there's no well-defined answer.
The test case does seem to show a bug in the line number. I think that finishLine support is just not implemented, IIRC from quite a long time ago.
There are definitely gaps in the API, and lots of tokens that we can't "see" through them (field names and annotations, for example). A raw "line number" without any context seems unproductive, but I'd love to see proposals to get offsets for more parts of the stream.
I isolated the bug you found as #233 so that can be fixed separately from discussion of API additions.