virtuoso-opensource

Procedure dump_one_graph does not export files in UTF8

Open Tomas2D opened this issue 2 years ago • 3 comments

The dump_one_graph stored procedure (described at http://vos.openlinksw.com/owiki/wiki/VOS/VirtRDFDatasetDump) creates files that are reported as us-ascii encoded (according to the command file -I data_000001.ttl).

The output file's encoding should be UTF-8, not ASCII.

Thanks in advance

Tomas2D avatar Mar 24 '22 17:03 Tomas2D

@pkleef @openlink @IvanMikhailov @smalinin -- Please look into this. It seems likely to be a quick fix.

TallTed avatar Mar 24 '22 17:03 TallTed

@Tomas2D The dump contains Unicode escape sequences rather than raw UTF-8 characters; see: Terse RDF Triple Language / string escapes.

/cc @TallTed @pkleef @smalinin HTH

imitko avatar Mar 24 '22 18:03 imitko
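
To illustrate the point above, here is a minimal Python sketch of why a dump that uses Unicode escape sequences is detected as us-ascii. The helper `escape_non_ascii` is hypothetical and is not Virtuoso's actual code; it only mimics the `\uXXXX` / `\UXXXXXXXX` escape forms that Turtle allows, and the sample literal is made up.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a Turtle-style Unicode escape.

    Hypothetical helper for illustration only; the exact escaping rules
    used by dump_one_graph are an assumption here.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")    # 4-digit form
        else:
            out.append(f"\\U{cp:08X}")    # 8-digit form
    return "".join(out)


literal = '"Dvořák"'
escaped = escape_non_ascii(literal)
print(escaped)  # "Dvo\u0159\u00E1k"

# The escaped text contains only ASCII characters ...
assert escaped.isascii()
# ... and ASCII bytes are identical under UTF-8, so the file is valid in both.
assert escaped.encode("ascii") == escaped.encode("utf-8")
```

Because every byte is below 0x80, a tool such as file has nothing to distinguish the dump from plain ASCII, even though the same bytes are also valid UTF-8.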

@imitko

Okay, I see. But if I understand correctly, ASCII is a subset of UTF-8, so converting a file from ASCII to UTF-8 should leave the output unchanged.

Could you please point me to a way to "convert", or rather "decode", the Unicode escape sequences into UTF-8 characters using a widely used library such as iconv? I am currently able to achieve my desired result via https://dencode.com/string/unicode-escape.

Tomas2D avatar Mar 27 '22 10:03 Tomas2D
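
One possible way to do the decoding asked about above is sketched below in Python. iconv converts between character encodings and does not interpret escape sequences, so a small script is likely needed. This is a sketch under assumptions, not an official Virtuoso tool: the function names and the file paths in the usage comment are hypothetical. It decodes only `\uXXXX` and `\UXXXXXXXX`, leaves other escapes such as `\n` or `\\` alone, and keeps escapes for ASCII code points and surrogates so that the result should still parse as Turtle.

```python
import re

# Match any backslash escape so that an escaped backslash ("\\\\") is
# consumed as a unit and a following "u00E9" is not misread as an escape.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|.)", re.DOTALL)


def decode_unicode_escapes(text: str) -> str:
    """Turn \\uXXXX and \\UXXXXXXXX escapes into the characters they name."""

    def replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        if hex_digits is None:
            return match.group(0)      # some other escape: keep unchanged
        cp = int(hex_digits, 16)
        if cp < 0x80:
            return match.group(0)      # keep escaped quotes, newlines, etc.
        if 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
            return match.group(0)      # not encodable as UTF-8: keep as-is
        return chr(cp)

    return _ESCAPE.sub(replace, text)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read an escaped dump and write a copy with raw UTF-8 characters."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(decode_unicode_escapes(line))


# Hypothetical usage; the output file name is made up:
# convert_file("data_000001.ttl", "data_000001.utf8.ttl")
```

Processing line by line keeps memory use flat for large dumps. Escapes for ASCII code points are deliberately preserved, because turning something like `\u0022` into a raw quote could break the surrounding string literal.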