ascii-tables
Cannot "parse" wikimedia
Summary
If you load the default input by refreshing the page, switch the output style to wikimedia, and hit "parse" on the output, you get an error prompt. When you hit "OK", the input goes blank but the output remains. Switching the output style afterwards affects neither the input nor the output.
Investigation
Current default input (with correct tabs):
Col1	Col2	Col3	Numeric Column
Value 1	Value 2	123	10.0
Separate	cols	with a tab or 4 spaces	-2,027.1
This is a row with only one cell
Current wikitable output of above input:
{| class="wikitable"
! Col1
! Col2
! Col3
! Numeric Column
|-
| Value 1
| Value 2
| 123
| 10.0
|-
| Separate
| cols
| with a tab or 4 spaces
| -2,027.1
|-
| This is a row with only one cell
|
|
|
|}
Prompt
It looks like tables can currently be parsed only under specific circumstances: the first line must contain a distinct character that indicates where the columns are, and all the remaining column separators must line up with the separators in that header line.
A more generalized solution might be to compare the lines and look for (non-alphanumeric?) characters that are the same in a given column all the way from the bottom to the top of the table to be parsed, or that are at least tied for the most common character in that column.
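A minimal sketch of the strict version of that idea, in case it helps the discussion. The function names `findSeparatorColumns` and `splitAtColumns` are made up for illustration and are not part of ascii-tables. It also assumes that whitespace never counts as a separator, and that border rows such as `+----+` have been filtered out beforehand (the "tied for the most" variant would be needed to tolerate them).

```javascript
// Hypothetical helper: returns the column indexes at which every non-blank
// line holds the same non-alphanumeric, non-whitespace character.
function findSeparatorColumns(lines) {
  const rows = lines.filter((line) => line.trim() !== "");
  if (rows.length === 0) return [];

  // Only positions that exist in every row can be shared separators.
  const width = Math.min(...rows.map((row) => row.length));
  const columns = [];

  for (let i = 0; i < width; i++) {
    const ch = rows[0][i];
    // Assumption: letters, digits and whitespace are never separators.
    if (/[A-Za-z0-9\s]/.test(ch)) continue;
    if (rows.every((row) => row[i] === ch)) columns.push(i);
  }
  return columns;
}

// Hypothetical helper: cuts one line into trimmed cells between
// consecutive separator columns.
function splitAtColumns(line, columns) {
  const cells = [];
  for (let k = 0; k < columns.length - 1; k++) {
    cells.push(line.slice(columns[k] + 1, columns[k + 1]).trim());
  }
  return cells;
}
```

Calling `findSeparatorColumns` on the lines of a pipe-delimited table and then `splitAtColumns` on each line would give one array of cells per row. A punctuation character inside the data (a comma or minus sign, say) that happens to sit in the same column on every row would be misread as a separator, so this is only a starting point.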
Dealing with HTML and wikimedia syntax would take a bit more work. There is a JavaScript implementation of an HTML-to-CSV parser here: https://gist.github.com/adilapapaya/9787842 I didn't see a wikimedia table parser written in JavaScript, but I suspect I'm just not using the right search terms.
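For the specific wikitable output shown earlier in this issue (one cell per line, rows separated by `|-`), a parser could be fairly small. The sketch below is an assumption about what such a parser might look like, not existing ascii-tables code, and `parseSimpleWikitable` is a made-up name. It deliberately ignores the rest of the wikitable syntax: inline `||` / `!!` cells, cell attributes, captions (`|+`), and nested tables.

```javascript
// Sketch only: handles the one-cell-per-line wikitable subset shown in this issue.
// Returns an array of rows, each row an array of cell strings.
function parseSimpleWikitable(text) {
  const rows = [];
  let current = null; // cells of the row being collected, or null outside a table

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("{|")) {
      current = []; // table start
    } else if (line.startsWith("|}")) {
      if (current && current.length > 0) rows.push(current); // table end
      current = null;
    } else if (line.startsWith("|-")) {
      if (current && current.length > 0) rows.push(current); // row separator
      current = [];
    } else if (current && (line.startsWith("!") || line.startsWith("|"))) {
      // "!" is a header cell, "|" a data cell; both are kept as plain text.
      current.push(line.slice(1).trim());
    }
  }
  return rows;
}
```

The `|}` and `|-` checks have to come before the plain `|` check, otherwise the table end and row separators would be read as cells. Header cells and data cells are not distinguished here; the first row would simply be treated as the header.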
Regarding the original issue, I don't think parsing wikimedia is actually much of a priority here. Wikimedia supports so many different table formats, and I don't really understand why. It's almost like they had an old way, then changed it, and never removed the first one. But since you are attempting to parse in JavaScript, it can definitely get tricky, as you said. Since the table definitions aren't consistent, I'd just avoid that idea altogether.
I'd honestly disable the parse button when wikimedia is selected for now and just show a message below it letting people know it's not supported. This will keep the site looking good, though I don't know how many people besides myself have tried or would try this.
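Something along these lines might be enough. This is a rough sketch: the style value `"wikimedia"`, the function name `updateParseControls`, and the message text are all guesses on my part, since I haven't looked at how the page is actually wired up.

```javascript
// Hypothetical list of output styles that cannot be parsed back into input.
const UNPARSEABLE_STYLES = new Set(["wikimedia"]);

// Disables the parse button and shows a note when the selected style is
// unsupported. Returns true when parsing is available for the style.
function updateParseControls(style, parseButton, messageElement) {
  const unsupported = UNPARSEABLE_STYLES.has(style);
  parseButton.disabled = unsupported;
  messageElement.textContent = unsupported
    ? "Parsing " + style + " output is not supported yet."
    : "";
  return !unsupported;
}
```

In the browser this would be called from the change handler of the output style selector, passing the button and message elements looked up with `document.getElementById` (whatever their real IDs are). Keeping the unsupported styles in a set means HTML or any other format could be added later without touching the handler.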