mwparserfromhell
Implement missing node types
- ~~Comment~~
- ParserFunction/MagicWord/BehaviorSwitch
- ~~Link~~
- ~~Table~~
- Redirect
Remaining node types are lower priority than Comment and Link, so changing priority to low.
Note that I still need to implement external links, which I had forgotten about.
:+1: on the external link implementation
External links are next after #9.
External links is now #39 for v0.3; this issue will be pushed back to v0.4.
Are there any plans to handle Tables anytime soon?
@rmacqueen Unfortunately I can't guarantee anything soon. Tables scare me a bit due to their complexity so I've been afraid to approach the issue, and there are a lot of other things that I think ought to be handled first.
At any rate, showing interest in the issue encourages me to work on it sooner, so thank you for that. Perhaps in a month or so?
@earwig Thanks for the fast response! Yes, I saw that MediaWiki themselves referred to parsing tables as a "quagmire", so I understand the apprehension about tackling them. Quite a number of pages do use tables, though, so it would be nice to get them in.
I've only just recently started using mwparserfromhell, but it's great so far! Thanks for all the work you've done.
Will try to do redirects soon, since that should be easy.
@earwig I need support for tables for a project I'm working on. I have time, so I'll go ahead and implement it and submit a pull request. Thanks for your work on this!
...Okay. Good luck, I imagine it's going to be very painful, sorry for the overcomplicated code, etc.
Definitely +1ing the table parsing! Does it look like it'll make it into 0.4?
@gfairchild Yeah... I plan to merge the pull request soon and 0.4 should follow shortly after. Admittedly, it should've been merged a couple weeks ago, but I've been taking a break from open source stuff lately due to real life things. Should definitely be done in the next week.
Can't blame you for that. I appreciate the project!
@earwig Have you considered adding lists as well? It's quite an important part of wiki markup.
Lists are already supported.
@earwig Is there any supporting documentation? I tried it this way:
import mwparserfromhell as mwp

text = """
* 2006: Douglas S. Morrow Public Outreach Award
* 2014: [[Kennedy Center Honors]] Medallion
* 2016: [[Presidential Medal of Freedom]]
"""
# Keep only the tags that represent list items.
lists = mwp.parse(text).filter_tags(matches=lambda x: x.tag == "li")
lists[0].tag
# outputs "li"
lists[0].contents
# outputs ""
The wikitext was taken from the Tom Hanks article.
@suhassumukhv You're running into #46: essentially, the content of a list isn't part of the same tag. This was a design decision (according to that issue).
@clokep Then there's certainly no way to access the list content, is there?
I don’t think characterizing it as a design decision is strictly correct; while it was indeed a choice to make the original implementation easier, it’s still a bug I think we should fix.
I had also forgotten the context in that ticket. Reading over it again, I think there might have been a change in how MediaWiki handles some classes of invalid markup since I wrote the original version of the parser. It’s possible that this is easier to solve now than it used to be.
In #46, there is a code snippet that might do what you want. I’m not sure how well it works.
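For anyone landing here in the meantime, a minimal sketch of one possible workaround. This is not the snippet from #46 and it does not use the parser's node tree at all: it simply walks the raw wikitext line by line and collects the text that follows each list marker. The names `LIST_LINE` and `list_items` are made up for illustration, and the sketch assumes each list item sits on a single line (a template or tag spanning several lines would be cut off).

```python
import re

# One or more list markers (*, #, ;, :) at the start of a line,
# optional spaces or tabs, then the item text.
LIST_LINE = re.compile(r"^([*#;:]+)[ \t]*(.*)$")


def list_items(wikitext):
    """Return (markers, text) pairs for every line that starts with list markup.

    Hypothetical helper, not part of mwparserfromhell.
    """
    items = []
    for line in wikitext.splitlines():
        match = LIST_LINE.match(line)
        if match:
            items.append((match.group(1), match.group(2).rstrip()))
    return items


text = """
* 2006: Douglas S. Morrow Public Outreach Award
* 2014: [[Kennedy Center Honors]] Medallion
* 2016: [[Presidential Medal of Freedom]]
"""

for markers, item in list_items(text):
    print(markers, item)
```

Each item string can then be handed to `mwparserfromhell.parse` on its own, for example to pull out wikilinks with `filter_wikilinks()`. The length of `markers` gives the nesting depth, so `**` would be a second-level bullet.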
This is an old issue, but it'd be great to get a node type for redirects at some point.