circt
circt copied to clipboard
[FIRRTL][FIRLexer] add bufferID parameter to FIRLexer
getLineAndColumn expects an optional bufferID. getMainFileID always returns 1, so in cases where the bufferID is not 1, this can cause a crash. Remove the unnecessary call to getMainFileID and leave out the bufferID argument (getLineAndColumn will find the buffer). mainFileID and bufferID are not the same.
Are you seeing this issue in the wild somehow, or is there a test we can add? It looks like the curBuffer is always the main bufffer. Could you elaborate on how these things diverge? I hope that FIRLexer::translateLocation is only ever called on a SMLoc from the file its lexing.
If you need this for some custom code, I think there could be a performance impact to infer the correct bufferID. Maybe it would make more sense to have the lexer take the buffer ID as a constructor parameter along with the source manager?
Are you seeing this issue in the wild somehow, or is there a test we can add?
Not sure how to reproduce this on command with firtool, but I had copied firtool and its FIRLexer and FIRParser pretty closely for a project. I was running into a crash every so often (not every run but often enough) when I would call translateLocation on a valid SMLoc but it would crash on this assertion at line 105 of SourceMgr.cpp: assert(Ptr >= BufStart && Ptr <= Buffer->getBufferEnd());
When I added manual checks that my SMLoc was valid before calling getLineAndColumn, I could tell that my SMLoc pointer was within the buffer I used. I concluded that somehow, the buffer of the main file was not the first buffer. I never figured out how, but by making the change I'm suggesting it fixed the issue (since now the SourceMgr finds the correct buffer for me or will fail a different assert if it can't find it).
Currently, the FIRLexer assumes that any location it needs to translate is always in the first buffer (which I believe is the "main file" for SourceMgr). I think you make a good point in suggesting to store the buffer ID as a member in the lexer class.
Maybe it would make more sense to have the lexer take the buffer ID as a constructor parameter along with the source manager? I think this would make the lexer design more extensible, so this is a good idea. When adding a new file to a SourceMgr via
AddNewSourceBuffer, the return value is the buffer ID that you can then pass to the lexer when constructing it. I like that this approach makes it explicit when constructing a lexer exactly which buffer to use. That way another lexer could be constructed on a different buffer within the same source manager (by some parser or other code).
I'll change my commit to reflect that with a bufferID parameter. This should also remove unnecessary calls to getMainFileID.
I also added a getBufferID() method so that FIRParser could still easily make a new lexer for parseModuleBody.
importFIRFile still assumes that the "main file ID" is the correct buffer to use.
Do I need to wait for a codeowner to merge or ping someone? Lattner is still listed as the only codeowner here: https://github.com/llvm/circt/blob/6c33d7cec3eb1e31ff07d0387d514c1fc9a10e30/codeowners/firrtl.json#L5
I'm not familiar with the procedure here, so I apologize for my ignorance.
@seldridge requesting review
Status? 2 questions:
- Can this be reached from the code base (hence needs a test)?
- Is there any objection even if not? It doesn't look like it should impact performance.