High cost of encoding and decoding in the League library
https://github.com/reznikmm/protobuf/blob/a1422f35ca12ad07c7611b4aa2cd86fd01fa8f91/source/runtime/pb_support-io.adb#L746
Similar to #20.
I was going to apply the same solution as in #20. I see two possibilities:
- Use another holder instance in PB_Support.IO.
- Share the holder instance by moving it from PB_Support.Internal to PB_Support (private part) and using it from both packages.
Do you see any problem in sharing this in PB_Support?
Well, I see PB_Support is pure, so maybe it's not a good idea to change that to include the variable. I'll go for the other option, unless you have another proposal, @reznikmm.
I propose to move
Codec : Text_Codec_Holders.Holder;
into the public part of PB_Support.Internal and reuse it in the body of PB_Support.IO, if this works.
It doesn't make much difference after the change this time. Looking more closely at the report, most of the time is spent encoding and decoding the UTF-8 strings.
What I've gathered is that League.Universal_String uses UTF-16 internally, so we need to convert to and from UTF-8, as required by Protobuf.
I was looking at https://web.archive.org/web/20220817170400/https://forge.ada-ru.org/matreshka/wiki/League/Performance which says
- use of special algorithms and utilization of SIMD operations (when available) significantly improve performance.
and I wonder how I could check that it is actually using the SIMD operations in my build. I don't see any clue about it in the Callgrind report.
I was also wondering if there could be a way to speed up conversion if you know that the input is always going to be in the US-ASCII subset, which is my particular case.
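To illustrate the US-ASCII idea, here is a minimal sketch using only the standard Ada library, not League. It assumes the input arrives as a UTF-8 String and the target is a UTF-16 Wide_String. Is_ASCII and To_UTF_16 are hypothetical helper names, and whether League exposes a place to plug in such a shortcut is not something this sketch answers. When every byte is below 128, each byte maps to exactly one UTF-16 code unit with the same value, so the general decoder can be skipped.

```ada
with Ada.Assertions;
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Conversions;

procedure ASCII_Fast_Path_Demo is

   use Ada.Strings.UTF_Encoding;

   --  True when every byte is in the US-ASCII range 0 .. 127.
   function Is_ASCII (Item : UTF_8_String) return Boolean is
     (for all C of Item => Character'Pos (C) < 128);

   --  Hypothetical helper: UTF-8 to UTF-16, with a shortcut for
   --  pure ASCII input and the standard converter as a fallback.
   function To_UTF_16 (Item : UTF_8_String) return UTF_16_Wide_String is
   begin
      if Is_ASCII (Item) then
         --  One byte becomes one 16-bit code unit with the same value.
         return Result : UTF_16_Wide_String (1 .. Item'Length) do
            for J in Result'Range loop
               Result (J) :=
                 Wide_Character'Val
                   (Character'Pos (Item (Item'First + (J - 1))));
            end loop;
         end return;
      else
         --  General case: full UTF-8 decoding from the standard library.
         return Conversions.Convert (Item);
      end if;
   end To_UTF_16;

   Sample : constant UTF_8_String := "protobuf";

begin
   --  The shortcut must agree with the general converter on ASCII input.
   Ada.Assertions.Assert
     (To_UTF_16 (Sample) = Conversions.Convert (Sample),
      "fast path differs from standard conversion");
   Ada.Assertions.Assert (To_UTF_16 ("")'Length = 0, "empty input");
end ASCII_Fast_Path_Demo;
```

The extra pass in Is_ASCII is itself a cost, so this would only pay off if the scan is much cheaper than the general conversion, which would need to be measured with the same Callgrind setup.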
We could try to replace Matreshka with VSS. VSS uses UTF-8 internally and keeps short strings inline...