wikitable2csv
CSV contains citation link text
Hello,
when a wiki table contains a citation (e.g. [2]), the generated CSV includes the citation marker as plain text. This is probably not desired.
Example: https://de.wikipedia.org/wiki/Liste_traditioneller_Radikale#Tabelle_der_Radikale

Output:
Nr.,Zeichen (Varianten),Pīnyīn,Bedeutung und Anmerkungen,Häufig-keit,Kurz-zeichen,Beispiele
147,.mw-parser-output .Hant{font-size:110%}見,jiàn,sehen,161,见[2],規親覺觀
148,角,jiǎo,"Horn, Ecke",158,,觚解觕觥觸
149,言 (訁 links),yán,"sprechen, Wort",861,讠[2]links,誁詋詔評詗詥試詧
(The [2] is the unwanted text; it is meaningless without the footnote it points to.)
The HTML responsible for this is:
<td>
<link rel="mw-deduplicated-inline-style" href="mw-data:TemplateStyles:r184932629">
<span lang="zh-Hans" class="Hans">见</span>
<sup id="cite_ref-s_2-1" class="reference">
<a href="#cite_note-s-2">[2]</a>
</sup>
</td>
Can the citation links (hyperlinks in square brackets) be removed when generating the CSV?
That is, every <a> tag enclosed in a <sup> tag with class="reference".
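
To illustrate the requested behaviour, here is a minimal sketch in Python using only the standard library. It is not wikitable2csv's actual code; the names `CellTextExtractor` and `cell_text` are hypothetical, and it assumes citations always appear as a <sup> element whose class list contains "reference", as in the HTML above.

```python
from html.parser import HTMLParser


class CellTextExtractor(HTMLParser):
    """Collects the text of an HTML fragment, skipping <sup class="reference">."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside a citation <sup>

    def handle_starttag(self, tag, attrs):
        if tag != "sup":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start skipping at a citation <sup>; also track any <sup> nested in it.
        if self.skip_depth or "reference" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "sup" and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def cell_text(html):
    """Return the cell's text with citation markers removed (hypothetical helper)."""
    parser = CellTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left behind by the removed markup.
    return " ".join("".join(parser.parts).split())


cell = """<td>
<link rel="mw-deduplicated-inline-style" href="mw-data:TemplateStyles:r184932629">
<span lang="zh-Hans" class="Hans">见</span>
<sup id="cite_ref-s_2-1" class="reference">
<a href="#cite_note-s-2">[2]</a>
</sup>
</td>"""
print(cell_text(cell))
```

With this approach an ordinary superscript (a <sup> without class="reference", such as an exponent) would still be kept, so only citation markers are dropped.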