endo perf(base64): Replace random access with iteration

Description

We recently discovered that in XS, random access to string characters is O(n). This PR updates the JS base64 implementation to instead use iterators. It improves the performance in XS as expected, but irrelevantly so because XS provides a native implementation which is much faster still. There's a slight penalty in Node.js, which may or may not be tolerable (but note that apples-to-apples comparison across implementations is necessary because the JS input is substantially shorter). I'm comfortable with rejection of this PR (or at least everything after the first commit), but felt that opening it was important for transparency.

Security Considerations

n/a

Scaling Considerations

As noted above. Here are the measured effects:

 Node.js
-encodes 10775.74909090909 characters per millisecond
+encodes 9801.535832414553 characters per millisecond
-JS encodes 23130.176470588234 characters per millisecond
+JS encodes 18329.573806881242 characters per millisecond
-decodes 48752.36084452975 bytes per millisecond
+decodes 45848.339350180504 bytes per millisecond
-JS decodes 43007.671875 bytes per millisecond
+JS decodes 45448.94117647059 bytes per millisecond
 XS
-encodes 577272.2727272727 characters per millisecond
+encodes 578303.1160714285 characters per millisecond
-JS encodes 1335.143596377749 characters per millisecond
+JS encodes 1700.2734761120264 characters per millisecond
-decodes 396777.0553505535 bytes per millisecond
+decodes 393870.26373626373 bytes per millisecond
-JS decodes 1093.7360890302066 bytes per millisecond
+JS decodes 1946.3762376237623 bytes per millisecond

Documentation Considerations

n/a

Testing Considerations

n/a

Upgrade Considerations

n/a

Jan 18 '24 03:01 gibson042

Thank you for digging into this and validating the hypothesis that string[i] was O(i) on XS. I’m inclined to take the performance win on Node.js with the existing code and the performance win on XS with the native implementation, archiving this PR as evidence for future reference.

Jan 18 '24 20:01 kriskowal

Something seems fishy with the bench results. From what I gather, under node.js, the encodes and JS encodes should be using the same implementation, yet they have drastically different results.

Updated the benchmark script to clarify with better output:

$ node test/bench-main.js
[pass 1] encodes 8378.88124410933 characters per millisecond
[pass 2] encodes 8523.48322147651 characters per millisecond
[pass 1] JS short-string encodes 18514.513452914798 characters per millisecond
[pass 2] JS short-string encodes 18514.513452914798 characters per millisecond
[pass 1] decodes 45357.107142857145 bytes per millisecond
[pass 2] decodes 51212.96780487805 bytes per millisecond
[pass 1] JS short-string decodes 43996.02797202797 bytes per millisecond
[pass 2] JS short-string decodes 43261.155206286836 bytes per millisecond

I’m inclined to take the performance win on Node.js with the existing code and the performance win on XS with the native implementation, archiving this PR as evidence for future reference.

Makes sense. Should I open a new PR for the benchmarking script improvements?

Jan 18 '24 21:01 gibson042

Makes sense. Should I open a new PR for the benchmarking script improvements?

That would be excellent.

Jan 18 '24 21:01 kriskowal

[pass 1] encodes 8378.88124410933 characters per millisecond
[pass 2] encodes 8523.48322147651 characters per millisecond

I don't understand how the new changes you made dropped the performance in node even more.

Jan 19 '24 02:01 mhofman

Should I open a new PR for the benchmarking script improvements?

#1986

Jan 19 '24 08:01 gibson042

I am confused about the separation between this PR and https://github.com/endojs/endo/pull/1986 , Both contain all three files. I would have expected one (that one) to define the benchmarking framework, and the other (this one) to be staged on it and use it.

Jan 20 '24 00:01 erights

I am confused about the separation between this PR and #1986 , Both contain all three files. I would have expected one (that one) to define the benchmarking framework, and the other (this one) to be staged on it and use it.

@erights Good point; this is now the case.

Jan 22 '24 19:01 gibson042

Now that #1986 is merged, if it is ok with the other reviewers, I'd prefer that this PR be rebased on master if it is not already based on a state that includes #1986

Reviewers, is that ok with you?

Jan 23 '24 23:01 erights

#1986 is already incorporated, but it would be a good idea to land #1991 and then rebase this PR (which already shares its packages/base64/test/test-main.js) on top of the resulting master.

Jan 24 '24 00:01 gibson042

#1986 is already incorporated, but it would be a good idea to land #1991 and then rebase this PR (which already shares its packages/base64/test/test-main.js) on top of the resulting master.

Sounds good to me

Jan 24 '24 00:01 erights

Oops, sorry! I had no idea. Thanks for fixing.

Jan 24 '24 00:01 erights

@gibson042 Are you shooting to improve performance on both Node.js and XS (and assuming this will still fall thru to the native version on XS)?

Jan 24 '24 01:01 kriskowal

@kriskowal Since the two are at odds and the JS implementation isn't even used by XS, I'm comfortable with getting this PR clean and then closing it without merge as you suggested in https://github.com/endojs/endo/pull/1982#issuecomment-1899159764

Thank you for digging into this and validating the hypothesis that string[i] was O(i) on XS. I’m inclined to take the performance win on Node.js with the existing code and the performance win on XS with the native implementation, archiving this PR as evidence for future reference.

Jan 24 '24 16:01 gibson042

endo endo copied to clipboard

perf(base64): Replace random access with iteration

Description

Security Considerations

Scaling Considerations

Documentation Considerations

Testing Considerations

Upgrade Considerations

endo
endo copied to clipboard