grammars-v4

VBA incorrectly parses attributes and variables

Open Xenios91 opened this issue 1 year ago • 11 comments

When a document has attributes mixed with variables, it fails to parse correctly, because the grammar expects module-level declarations to occur only after all attribute declarations. This is not uncommon in documents I've scraped off the web, and it results in parsing errors.

Xenios91 avatar Jan 20 '24 02:01 Xenios91

Can you provide examples? Is it possible they do not parse correctly because they are, in fact, not proper syntax?

https://learn.microsoft.com/en-us/openspecs/microsoft_general_purpose_programming_languages/ms-vbal/d5418146-0bd2-45eb-9c7a-fd9502722c74

The MS-VBAL specification states that the module body must occur after the module header.

Beakerboy avatar Jan 24 '24 20:01 Beakerboy

I agree they are improper. However, I have come across documents in the wild that are written this way. I wonder whether they still run, and whether they look like this because of document generation libraries. Either way, it's weird.

Xenios91 avatar Jan 26 '24 01:01 Xenios91

Let me run some of these through a sandbox again to confirm they are indeed running properly.

Xenios91 avatar Jan 26 '24 01:01 Xenios91

Excuse my ignorance, I am new to grammar lexer/parser stuff, but why would you want improper documents to parse correctly? I thought the “proper” way was to create error handlers that intercept the parsing errors and correct them to produce valid files. If Apache POI is creating invalid files, isn’t that a problem that should be fixed on their end?

Beakerboy avatar Jan 26 '24 14:01 Beakerboy

I guess it depends on what you want. From my perspective, I'm interested in using this tool to parse malware, so a document that is improper but still runs is still of interest to me.

Xenios91 avatar Jan 27 '24 01:01 Xenios91

If this is a democracy, my vote would be for the general-use grammar here to be the best reflection of any official standard, with enough extras to make it as easy to use as possible, but I don’t know what the typical philosophy is.

Beakerboy avatar Jan 27 '24 04:01 Beakerboy

The existing ANTLR grammar is missing many of the keywords that appear in the header. It’s possible that VBA files with variables named things like VB_Base, in a position where they shouldn’t be, will parse with ANTLR when they shouldn’t according to MS-VBAL.

However, that document could have bugs. In fact, I found the document incorrectly defines the line-continuation lexical string. I notified Microsoft and they said it should be fixed in the next publication.

Beakerboy avatar Jan 29 '24 23:01 Beakerboy

Thought you’d be interested…since VBA is now at version 7.1, and this grammar is supposedly 6.0, I’ve started from scratch rewriting the grammar from the published Microsoft Spec. My plan is to add a lot more tests to get the coverage up as high as possible.

https://github.com/Beakerboy/grammars-v4/tree/patch-7/vba

Feel free to help out or provide feedback.

Beakerboy avatar Feb 02 '24 17:02 Beakerboy

Also, in testing my new grammar, I’ve realized that, since VBA has a whole precompiler feature, it’s entirely possible to have valid VBA files that cannot be parsed by a single ANTLR4 grammar:

#If Win16 Then
    Public function foo()
#Else
    Public Function foo()
#Endif
    End Function

This is valid, but without precompiling first, any reasonable grammar will fail: the raw text contains two Function headers and only one End Function.
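To illustrate why a precompile pass helps, here is a minimal Python sketch that keeps only the lines of the active conditional-compilation branch, so the parser sees a single Function header. It is not taken from grammars-v4 or from any existing precompiler; the names `resolve_conditionals` and `_condition` are hypothetical, and it assumes every condition is a single constant name (no expressions, no `#Const` handling).

```python
import re

# Hypothetical helper for illustration only.
# Recognizes #If, #ElseIf, #Else, #End If and #Endif (case-insensitive).
_DIRECTIVE = re.compile(r"^\s*#\s*(if|elseif|else|end\s*if)\b(.*)$", re.IGNORECASE)


def _condition(rest, constants):
    """Assumption: the condition is one constant name followed by Then."""
    name = re.sub(r"\bthen\b\s*$", "", rest.strip(), flags=re.IGNORECASE).strip()
    lowered = {key.lower(): value for key, value in constants.items()}
    return bool(lowered.get(name.lower(), False))


def resolve_conditionals(source, constants):
    """Return the source with only the active conditional branches kept."""
    kept = []
    # Each frame is [parent_active, some_branch_taken, currently_active].
    stack = []
    for line in source.splitlines():
        match = _DIRECTIVE.match(line)
        if not match:
            # Ordinary line: keep it only if the innermost block is active.
            if not stack or stack[-1][2]:
                kept.append(line)
            continue
        keyword = re.sub(r"\s+", "", match.group(1)).lower()
        if keyword == "if":
            parent = not stack or stack[-1][2]
            active = parent and _condition(match.group(2), constants)
            stack.append([parent, active, active])
        elif keyword == "elseif":
            frame = stack[-1]
            active = frame[0] and not frame[1] and _condition(match.group(2), constants)
            frame[1] = frame[1] or active
            frame[2] = active
        elif keyword == "else":
            frame = stack[-1]
            frame[2] = frame[0] and not frame[1]
            frame[1] = True
        else:  # "endif"
            stack.pop()
    return "\n".join(kept)


example = "\n".join([
    "#If Win16 Then",
    "    Public function foo()",
    "#Else",
    "    Public Function foo()",
    "#Endif",
    "    End Function",
])
print(resolve_conditionals(example, {"Win16": False}))
```

With `Win16` false, only the `#Else` branch and the shared `End Function` remain, which a grammar for the module body can then parse as one complete function.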

Do you have a precompiler coded up? I created a VBA precompiler grammar that could probably be leveraged with some visitors to make one. https://github.com/Beakerboy/grammars-v4/blob/coverage/vba/vba_cc/vba_cc.g4

Beakerboy avatar Feb 07 '24 20:02 Beakerboy

My precompiler is pretty much done: https://github.com/Beakerboy/VBA-Precompiler/tree/dev

Beakerboy avatar Feb 15 '24 19:02 Beakerboy

It’s possible that @Xenios91 is referring to VB_[Var]Description: https://vbaplanet.com/attributes.php

I have submitted a request to Microsoft to clarify this in the specification document.
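One way to check whether a scraped file falls into this case is to separate module-level attribute lines from attribute lines that name a member. The Python sketch below does that with a regular expression. The function name `classify_attributes`, the sample module, and the assumed line shape `Attribute <name>.VB_VarDescription = "..."` are illustrative assumptions, not something quoted from MS-VBAL or from the grammar.

```python
import re

# Assumed shapes (illustrative, not taken from the specification):
#   Attribute VB_Name = "Module1"                   -> module-level
#   Attribute counter.VB_VarDescription = "..."     -> member-level
_ATTRIBUTE = re.compile(r"^\s*Attribute\s+(?:(\w+)\.)?(VB_\w+)\s*=", re.IGNORECASE)


def classify_attributes(source):
    """Return (line_number, kind, attribute_name) for each Attribute line.

    kind is "member" when the attribute is prefixed with a member name,
    otherwise "module".
    """
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = _ATTRIBUTE.match(line)
        if match:
            kind = "member" if match.group(1) else "module"
            found.append((number, kind, match.group(2)))
    return found


# Hypothetical sample module, written for this example.
sample = "\n".join([
    'Attribute VB_Name = "Module1"',
    "Public counter As Long",
    'Attribute counter.VB_VarDescription = "Number of calls"',
])
for entry in classify_attributes(sample):
    print(entry)
```

If every attribute that appears after the first declaration is reported as "member", the file is probably using per-variable attributes rather than misplaced header attributes.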

Beakerboy avatar Feb 17 '24 12:02 Beakerboy