jBBCode [WIP] fixed generation of bbcode for uppercased tags

This PR is actually based on my other cleanup PR. This PR makes only sense after the cleanup PR has been merged.

I found two issues:

when uppercased tags were used and the text was invalid than the code was not able to re-build the original code. [URL=INVALID]foo[/URL] was generated to [url=invalid]foo[/url].
when options were enabled but the tag itself had no option and the text was invalid than the code added an empty =. So [URL foo=bar]INVALID[/URL] was generated to [url= foo=bar]INVALID[/url].

I am not quite sure if the change in ElementNode::getAsBBCode for the empty = is the best solution. Therefore this PR is still WIP. Any feedback appreciated!

Feb 13 '17 11:02 frastel

Ugh! Can you remove 2bd7dc7 and force push? This is extremely hard to assess correctly as-is. Appreciate your effort, though!

Feb 13 '17 17:02 DaSourcerer

@DaSourcerer Sure no problem. Branch is cleaned.

Feb 13 '17 20:02 frastel

when uppercased tags were used and the text was invalid than the code was not able to re-build the original code. [URL=INVALID]foo[/URL] was generated to [url=invalid]foo[/url].

Can you elaborate a little on what problems that causes?

Feb 13 '17 23:02 shmax

@shmax I think, he doesn't experience any actual problems safe for the parts of the input that could not be parsed as tags being forcibly turned all-lowercase. I am actually in support of this, but find this implementation not very convincing. There has to be a better way than to keep tagname and the original token when one is directly derived from the other. Could be this is a bit of a hint some refactoring may be in order? After all, we've got ElementNode.getTagName() and ElementNode.getCodeDefinition().getTagName(). Latter one is already guaranteed to be lowercase. No easy win, though ...

Oh, and while we're at it: Do we want to support tagnames with multibyte characters in them?

Feb 13 '17 23:02 DaSourcerer

There has to be a better way than to keep tagname and the original

That was my thought as well, particularly as he's doing some run-time string operations that weren't there before, even with the double storage.

Feb 13 '17 23:02 shmax

@shmax

Can you elaborate a little on what problems that causes?

I didn't expected any text modification when an invalid code is given. For my current project the strtolower modification is not accetable.

@DaSourcerer The changes might not be very convincing however I didn't want to introduce too many changes at once in one of my first contributions to this project and thus I tried to not break the current structure.

Feb 14 '17 08:02 frastel

@DaSourcerer:

Do we want to support tagnames with multibyte characters in them?

Just curious: is it very difficult to implement compared to the current implementation?

Feb 14 '17 16:02 Kubo2

@frastel: As I said, this may be a hint some refactoring were in order. I understand your reluctance to introduce possibly code-breaking changes, but I just feel there is a better solution. Btw: That would be a shortcoming on our side, not yours ;)

@Kubo2: It is so-so. We would introduce a hard dependency on the mb extensions (which is probably less probelmatic than I may make it sound like here). We also use the normalized (i.e. lowercased) tag names for various lookups. I am not quite sure how array keys with mb-characters in them behave. Worst scenario: We would lose all 𝓞(1) lookups and have them replaced with 𝓞(n). Worth it? I really don't know ...

Feb 15 '17 21:02 DaSourcerer