Daniel Lemire
Related: https://github.com/simdutf/simdutf/issues/158
We will not be doing this. It is more complicated than expected and, ultimately, unnecessary. An interested user could write a C wrapper.
@NicolasJiaxin I expect that you can probably 'take a stab' at this... even though I think that we will be able to build on @clausecker's work regarding UTF-16 validation.
> and does so FAST

Proof needed. :-)
cc @NicolasJiaxin
Can you run some (silly) benchmarks comparing your tool to iconv? Could be done on AWS.
We will want to have large file support. So we need to 'eat' the data in blocks... say 4kB (max) at a time. I'd do this after running benchmarks with...
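To make the block idea concrete, here is a minimal sketch of reading a stream in bounded blocks, assuming the input is UTF-8. It is not the tool's actual code: `incomplete_utf8_tail` and `for_each_block` are hypothetical names, and the sketch only inspects lead bytes, it does not validate. The one subtlety it illustrates is that a block boundary can land in the middle of a multi-byte character, so the unfinished bytes are carried over to the next block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this on in-memory data
#include <string>
#include <string_view>

// Hypothetical helper: number of trailing bytes (0 to 3) that begin a UTF-8
// sequence which is still incomplete at the end of `data`.
// Only lead bytes are inspected; this is not a validator.
inline std::size_t incomplete_utf8_tail(std::string_view data) {
  const std::size_t n = data.size();
  for (std::size_t back = 1; back <= 3 && back <= n; ++back) {
    const unsigned char c = static_cast<unsigned char>(data[n - back]);
    if ((c & 0xC0) == 0x80) continue;  // continuation byte, keep looking back
    std::size_t need = 1;              // ASCII (or invalid lead): one byte
    if ((c & 0xE0) == 0xC0) need = 2;
    else if ((c & 0xF0) == 0xE0) need = 3;
    else if ((c & 0xF8) == 0xF0) need = 4;
    return need > back ? back : 0;     // incomplete only if bytes are missing
  }
  return 0;  // three continuation bytes: the lead byte sits further back
}

// Hypothetical driver: reads `in` in blocks of at most `block_size` bytes and
// hands `consume` pieces that end on a character boundary.
// Returns the total number of bytes passed to `consume`.
inline std::size_t for_each_block(
    std::istream& in, std::size_t block_size,
    const std::function<void(std::string_view)>& consume) {
  assert(block_size >= 4);  // room for the carry (at most 3 bytes) plus new data
  std::string buffer(block_size, '\0');
  std::size_t carried = 0;  // unfinished bytes kept from the previous block
  std::size_t total = 0;
  for (;;) {
    in.read(buffer.data() + carried,
            static_cast<std::streamsize>(block_size - carried));
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    if (got == 0) {
      // End of input: flush a truncated sequence as-is so the consumer sees it.
      if (carried > 0) {
        consume(std::string_view(buffer.data(), carried));
        total += carried;
      }
      break;
    }
    const std::size_t filled = carried + got;
    const std::size_t tail =
        incomplete_utf8_tail(std::string_view(buffer.data(), filled));
    const std::size_t complete = filled - tail;
    if (complete > 0) {
      consume(std::string_view(buffer.data(), complete));
      total += complete;
    }
    // Move the unfinished bytes to the front for the next round.
    std::memmove(buffer.data(), buffer.data() + complete, tail);
    carried = tail;
  }
  return total;
}
```

With `block_size` set to 4096, memory use stays constant regardless of file size. A real tool would pass each piece to the transcoder inside `consume`; a UTF-16 input would need the analogous check for a split surrogate pair.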
> On three files, sutf takes about the same time as on one file, so that might mean that a lot of the time is spent on other stuff than...
> So, it seems indeed that the bottleneck is loading the data. I have played around a bit, but I was not able to bring the time down. Any tips?...