Don't eat the space after a link
A minor problem with text extraction: the space after a link is dropped.
E.g. https://bg.wikipedia.org/w/index.php?title=Джон_Кенеди&action=edit includes:
| description = [[Ich bin ein Berliner|Речта]] от Ратхаус Шьонеберг на Джон Кенеди, 26 юни 1963. Продължителност 9:01.
This is extracted (http://mappings.dbpedia.org/server/extraction/bg/extract?title=Джон+Кенеди&revid=&format=turtle-triples&extractors=custom) as:
description "Речтаот Ратхаус Шьонеберг на Джон Кенеди, 26 юни 1963. Продължителност 9:01."@bg .
Hi, if I understand correctly, all that needs to be done is to output
description "Речта от Ратхаус Шьонеберг на Джон Кенеди, 26 юни 1963. Продължителност 9:01."@bg .
for the example in the summary, with the space before "от" preserved?
Exactly.
All right, working on it :+1:
In what file can I find the logic that generates the extraction?
It should be org.dbpedia.extraction.dataparser.StringParser in the core module.
The nodeToString function should probably take care of LinkNode instances (org/dbpedia/extraction/wikiparser/LinkNode.scala).
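
A minimal sketch of the idea, not the framework's actual code: `Node`, `TextNode` and `LinkNode` below are simplified, hypothetical stand-ins for the classes in org.dbpedia.extraction.wikiparser, and `nodesToString` is an illustrative helper rather than the real nodeToString. It assumes the space is lost because text is trimmed per node (the actual cause in StringParser may differ). Concatenating the child text untrimmed and trimming only once at the end would keep the space before "от":

```scala
// Hypothetical, simplified node types (NOT the real DBpedia classes).
sealed trait Node
final case class TextNode(text: String) extends Node
final case class LinkNode(destination: String, children: List[Node]) extends Node

object LinkSpacing {
  // Flatten a node list to plain text.
  // Whitespace inside and between nodes is preserved; only the
  // final result is trimmed, so the space after a link survives.
  def nodesToString(nodes: List[Node]): String = {
    val sb = new StringBuilder

    def visit(node: Node): Unit = node match {
      case TextNode(text)        => sb.append(text)          // keep text as-is
      case LinkNode(_, children) => children.foreach(visit)  // emit the link label only
    }

    nodes.foreach(visit)
    sb.toString.trim
  }
}
```

With the example from the summary, a `LinkNode` whose label is "Речта" followed by a `TextNode` starting with " от" would yield "Речта от ..." instead of "Речтаот ...".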