Split the beatmap parser

Open fmang opened this issue 7 years ago • 0 comments

The current beatmap parser is pretty intricate, because the parsing process is not as simple as it sounds.

I'd consider splitting it into two phases:

The first phase reads the beatmap as a character stream, and generates a raw AST, faithful to the original contents.
The second phase interprets the raw AST and builds a high-level beatmap object. In particular, it implements inheritance, slider path normalization, and unit conversion.

Pros:

Makes the code and the concepts easier to understand by splitting the task into 2 clear units.
Makes the parser easier to test. See #44.
Supports parsing beatmaps with misordered sections or objects.

Cons:

Performance. An all-at-once process will eat less RAM and CPU. Probably a non-issue since the parser represents a tiny fraction of oshu!'s load time.

Alternative: SAX-like parser

SAX is a way to parse XML by reacting to events like “opened an element”, “found a text node”.

For a beatmap, those events would be “got a metadata key-value”, “got a hit object”. These would be defined in an abstract class as pure virtual methods. The beatmap parser then calls these methods as it reads the beatmap.

This approach effectively splits the raw decoding logic from the interpretation logic. It gets most of the pros above, without the cons. It won't support misordered objects out of the box, though.

The two ways aren't incompatible, as the AST builder could be easily implemented using that abstract interface.
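For instance, an AST builder could simply be one more handler that records every event it receives. The sketch below declares a hypothetical handler interface first so that it compiles on its own; `AstBuilder` and `RawSection` are likewise illustrative names, not existing oshu! types.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical event interface, as described in the SAX-like alternative.
class BeatmapHandler {
public:
    virtual ~BeatmapHandler() = default;
    virtual void on_section(const std::string &name) = 0;
    virtual void on_key_value(const std::string &section,
                              const std::string &key,
                              const std::string &value) = 0;
    virtual void on_hit_object(const std::string &line) = 0;
};

// Raw AST node: one section with its entries, all kept as text.
struct RawSection {
    std::string name;
    std::vector<std::pair<std::string, std::string>> entries;
    std::vector<std::string> hit_objects;
};

// The AST builder is a handler that stores events instead of interpreting them.
class AstBuilder : public BeatmapHandler {
public:
    void on_section(const std::string &name) override {
        sections_.push_back(RawSection{name, {}, {}});
    }
    void on_key_value(const std::string & /*section*/,
                      const std::string &key,
                      const std::string &value) override {
        current().entries.emplace_back(key, value);
    }
    void on_hit_object(const std::string &line) override {
        current().hit_objects.push_back(line);
    }
    const std::vector<RawSection> &sections() const { return sections_; }

private:
    // Events received before any section header go to an unnamed section.
    RawSection &current() {
        if (sections_.empty())
            sections_.push_back(RawSection{"", {}, {}});
        return sections_.back();
    }
    std::vector<RawSection> sections_;
};
```

With this arrangement, code that needs misordered-object support can run the builder and interpret the resulting sections afterwards, while code that doesn't can implement the handler interface directly and skip the tree.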

Jan 25 '18 19:01 fmang