
Move logic for (un)closed sequences (Strong/Emph, ..) from HTMLSerializer to a generic Node type

Open Elmervc opened this issue 12 years ago • 0 comments

A general problem with PEG parsers is that parse times grow exponentially for sequences delimited by opening and closing characters (like _emph_, __strong__ and [[wiki-links]]) when a single sequence contains multiple nested opening characters.

To overcome this, I've added the notion of isClosed to the StrongEmphSuperNode. This only solves the problem for strong and emph sequences, but there are more cases that need to be covered (https://github.com/sirthias/pegdown/issues/104).

I think the way to go is a generic super-type node with a boolean denoting the state of the sequence (closed or not closed), e.g. OptionSuperNode extends SuperNode, where the option depends on whether the sequence is closed.

The implementation of this class will look similar to StrongEmphSuperNode, except that its accept(Visitor visitor) method makes the decision itself: if the sequence is closed it calls visitor.visit(this); if the sequence is unclosed it does the following instead:

// getOpenChars() should be implemented by the inheriting node class
new SpecialTextNode(this.getOpenChars()).accept(visitor);
for (Node child : this.getChildren()) {
    child.accept(visitor);
}
// getCloseChars() should be implemented by the inheriting node class
new SpecialTextNode(this.getCloseChars()).accept(visitor);

This decision is currently made in HTMLSerializer.java, which forces every other serializer implementation to duplicate the same logic. The serializer is not the right place for it.

We could reuse this generic OptionSuperNode class (or whatever name it ends up with) for other sequences.
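
To make the proposal concrete, here is a minimal, self-contained sketch of what I have in mind. It does not use the real pegdown classes: Node, Visitor, TextNode, EmphNode and HtmlVisitor are simplified stand-ins written for this example, and visitClosed() is a hypothetical hook I added so each subclass can dispatch to its own visit overload. I also assume getCloseChars() returns whatever closing characters the parser actually consumed, so it is empty for an unclosed sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for pegdown's types (assumptions, not the real API).
interface Visitor {
    void visit(TextNode node);
    void visit(EmphNode node);
}

interface Node {
    void accept(Visitor visitor);
}

class TextNode implements Node {
    final String text;
    TextNode(String text) { this.text = text; }
    public void accept(Visitor visitor) { visitor.visit(this); }
}

// Generic node for delimited sequences; it owns the closed/unclosed decision.
abstract class OptionSuperNode implements Node {
    private final List<Node> children = new ArrayList<>();
    private final boolean closed;

    OptionSuperNode(boolean closed) { this.closed = closed; }

    List<Node> getChildren() { return children; }
    boolean isClosed() { return closed; }

    abstract String getOpenChars();
    abstract String getCloseChars();
    // Hypothetical hook: lets the subclass call its own visit(...) overload.
    abstract void visitClosed(Visitor visitor);

    public final void accept(Visitor visitor) {
        if (closed) {
            visitClosed(visitor);
            return;
        }
        // Unclosed: replay the delimiters as plain text around the children.
        new TextNode(getOpenChars()).accept(visitor);
        for (Node child : children) {
            child.accept(visitor);
        }
        new TextNode(getCloseChars()).accept(visitor);
    }
}

class EmphNode extends OptionSuperNode {
    private final String openChars;
    private final String closeChars;

    EmphNode(String openChars, String closeChars, boolean closed) {
        super(closed);
        this.openChars = openChars;
        this.closeChars = closeChars;
    }

    String getOpenChars() { return openChars; }
    String getCloseChars() { return closeChars; }
    void visitClosed(Visitor visitor) { visitor.visit(this); }
}

// A serializer no longer needs to know about unclosed sequences at all.
class HtmlVisitor implements Visitor {
    final StringBuilder out = new StringBuilder();

    public void visit(TextNode node) { out.append(node.text); }

    public void visit(EmphNode node) {
        out.append("<em>");
        for (Node child : node.getChildren()) {
            child.accept(this);
        }
        out.append("</em>");
    }
}

class OptionSuperNodeSketch {
    static String render(Node node) {
        HtmlVisitor visitor = new HtmlVisitor();
        node.accept(visitor);
        return visitor.out.toString();
    }

    public static void main(String[] args) {
        EmphNode closed = new EmphNode("_", "_", true);
        closed.getChildren().add(new TextNode("foo"));
        System.out.println(render(closed)); // prints <em>foo</em>

        EmphNode unclosed = new EmphNode("_", "", false);
        unclosed.getChildren().add(new TextNode("foo"));
        System.out.println(render(unclosed)); // prints _foo
    }
}
```

With this shape, HtmlVisitor.visit(EmphNode) is only ever reached for closed sequences, so any other serializer gets the unclosed handling for free.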

Elmervc, Oct 08 '13 12:10