`kgtk generate-mediawiki-json` and `kgtk generate-mediawiki-triples` Don't Process Text Strings Properly

Open · CraigMiloRogers opened this issue 3 years ago • 1 comment

On inspection, it appears that `kgtk generate-mediawiki-json` and `kgtk generate-mediawiki-triples` are not processing text strings properly. My concerns include:

  • embedded quotes
  • embedded vertical bars
  • allowable backslash sequences, given the restrictions of the output format

CraigMiloRogers, Jun 10 '21 17:06

From Jerry Qiu:

`'the government\'s openness to analyse their demands.'@en`

would be turned into

`the government\\s openness to analyse their demands.`

in the jsonl file.
Is it possible to preserve the single quotation mark during this processing?

CraigMiloRogers, Aug 03 '21 01:08
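
For reference, here is a minimal sketch of the behavior the report seems to be asking for. It is not KGTK's implementation: `parse_lq_string` and `to_label` are hypothetical helpers, and the sketch assumes that a language-qualified string is wrapped in single quotes, that a backslash escapes the following character (including `'` and `|`), and that the output is a `{"language": ..., "value": ...}` object. The idea is to undo the KGTK escapes first and then let `json.dumps` apply only the escapes that JSON itself requires.

```python
import json
import re

# Hypothetical helpers for illustration only; these are not part of KGTK.

# Assumed shape of a language-qualified string: 'text'@lang, where a
# backslash inside the quotes escapes the character that follows it.
_LQ_STRING = re.compile(r"^'(?P<text>(?:[^'\\]|\\.)*)'@(?P<lang>[A-Za-z-]+)$")

# Escape letters that stand for control characters; any other escaped
# character (quote, vertical bar, backslash) stands for itself.
_CONTROL_ESCAPES = {"n": "\n", "t": "\t", "r": "\r"}


def _unescape(match):
    char = match.group(1)
    return _CONTROL_ESCAPES.get(char, char)


def parse_lq_string(value):
    """Split 'text'@lang into (unescaped text, language code)."""
    match = _LQ_STRING.match(value)
    if match is None:
        raise ValueError(f"not a language-qualified string: {value!r}")
    text = re.sub(r"\\(.)", _unescape, match.group("text"))
    return text, match.group("lang")


def to_label(value):
    """Build a language/value dictionary from a language-qualified string."""
    text, lang = parse_lq_string(value)
    return {"language": lang, "value": text}


if __name__ == "__main__":
    example = r"'the government\'s openness to analyse their demands.'@en"
    # json.dumps adds only the escapes JSON needs, so the apostrophe survives.
    print(json.dumps(to_label(example)))
```

With the example from the comment above, the emitted JSON line keeps the apostrophe and contains no stray backslash. An escaped vertical bar (`\|`) is handled the same way, since `|` needs no escaping in JSON.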