istf-spec
Optimization: a Stack/Heap approach
Instead of

```css
body {
  color: red
}
```

producing

```js
[
  [RULE_START, 1],
  [SELECTOR, 'body'],
  [PROPERTY, 'color'],
  [VALUE, 'red'],
  [RULE_END]
]
```
it could instead produce

```js
HEAP = ['body', 'color', 'red']

[
  RULE_START, 0,
  SELECTOR, 0,
  PROPERTY, 1,
  VALUE, 2,
  RULE_END, 2
]
```

where the integers 0, 1, 2 are addresses of the corresponding values in the HEAP.
This way you can use a flat Uint32Array typed array for maximum throughput.
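As a rough sketch of this encoding (the marker values and the `encode` helper are assumptions for illustration, not part of the spec):

```js
// Hypothetical marker values; the actual ISTF spec may assign different ints.
const RULE_START = 1, RULE_END = 2, SELECTOR = 3, PROPERTY = 4, VALUE = 5;

// Encode one rule into a string heap plus a flat [marker, address, ...] array.
function encode(selector, declarations) {
  const heap = [];
  const push = (str) => heap.push(str) - 1; // returns the heap address

  const code = [RULE_START, 0, SELECTOR, push(selector)];
  for (const [prop, value] of declarations) {
    code.push(PROPERTY, push(prop), VALUE, push(value));
  }
  code.push(RULE_END, 0);

  // The flat code array holds only small ints, so it fits a typed array.
  return { heap, code: Uint32Array.from(code) };
}

const { heap, code } = encode('body', [['color', 'red']]);
// heap → ['body', 'color', 'red']
// code → Uint32Array [1, 0, 3, 0, 4, 1, 5, 2, 2, 0]
```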
Ah, I got what you mean - using separate arrays for markers and props/values/selectors! That's an interesting idea, we should add a benchmark for this as well.
We could in theory end up with 3 arrays: function refs, strings and markers. Each of them would be of one consistent type, and the markers would even be integers.
It could also mean we could reuse selectors, properties and property values, e.g.

```css
body {
  color: red;
}

h1 {
  color: red;
}
```

would produce

```js
HEAP = ['body', 'color', 'red', 'h1']

[
  RULE_START, 0,
  SELECTOR, 0,
  PROPERTY, 1,
  VALUE, 2,
  RULE_END, 2,
  RULE_START, 3,
  SELECTOR, 3,
  PROPERTY, 1,
  VALUE, 2,
  RULE_END, 3
]
```
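The deduplication described above amounts to string interning. A minimal sketch, assuming a `Map` from string to heap index (`createHeap` and `intern` are hypothetical names, not part of the spec):

```js
// Each unique string is stored once in the heap; repeated
// selectors/properties/values reuse the same address.
function createHeap() {
  const heap = [];
  const addresses = new Map(); // string -> heap index
  return {
    heap,
    intern(str) {
      if (!addresses.has(str)) addresses.set(str, heap.push(str) - 1);
      return addresses.get(str);
    },
  };
}

const h = createHeap();
h.intern('body');  // → 0
h.intern('color'); // → 1
h.intern('red');   // → 2
h.intern('h1');    // → 3
h.intern('color'); // → 1 again, reused rather than re-added
// h.heap → ['body', 'color', 'red', 'h1']
```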
Oh, so the HEAP array would become a collection of unique strings, neat! Not sure though how much it will bring after gzip, but definitely worth considering!
Yes, gzip might make the KB benefit a hard sell, but it should reduce the memory the runtime JS VM has to allocate and the code it has to parse.
Yeah, if it won't slow down parsing, definitely worth optimizing!
WIP of a compiler that produces this from css strings. https://rawgit.com/thysultan/stylis.js/master/plugins/bytecode/tests/index.html
How is it related to this issue?
Yes, this is not for the bench, just a proof of concept of a compiler that generates this format from a string source, related issue #16
Ok, I have added this approach to the bench (see the fromTwoTyptedArrays fn) https://esbench.com/bench/592d599e99634800a03483d8
I didn't use Uint32Array because it slows it down a lot.
It is not too fast. It is possible that with much larger test data it will show better relative performance, but mostly I think it's a memory optimization, because we add more lookups when using this approach.
The "From Array of strings" test performs best so far.
For the tests to be fair you'd need to exclude the creation of the arrays from the bench and put that into the setup phase; otherwise you're also measuring the creation of the data structure, which in this case is meant to be a preprocess operation instead of a runtime operation.
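The setup/measurement split being suggested could look roughly like this (`bench` is a hypothetical harness sketch, not esbench's actual API):

```js
// Keep data-structure creation out of the timed loop: setup() runs once,
// only op(state) is executed inside the measured iterations.
function bench(name, setup, op, iterations = 10000) {
  const state = setup();                          // "compile time" work, done once
  const start = Date.now();
  for (let i = 0; i < iterations; i++) op(state); // only this loop is timed
  return { name, ms: Date.now() - start };
}

// Usage (buildBytecode / toString are placeholder names):
// bench('fromTypedArray', () => buildBytecode(css), (code) => toString(code));
```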
I do it for other tests as well, so it's comparable.
Since the different methods use very different data structures that are created on each run, it would not be comparable, because the creation of the data structures is meant to be a compile-time operation; this would include fromSingleArrayOfStrings as well.
You're also measuring the cost of creating the function on every operation, i.e. `return (input) => {`.
This could throw out any possibility that the VM could generate optimized code, which means we might just be measuring a baseline (unoptimized) version that the VM could produce.
For example this bench https://rawgit.com/thysultan/stylis.js/master/plugins/bytecode/tests/index.html (*see the console).
It avoids all the points I mentioned above, showing that this bytecode format with a typed array is faster:
1873 ops, from typedArray
87 ops, from stylis
1638 ops, fromSingleArrayOfStrings
> You're also measuring the cost of creating the function on every operation, i.e. `return (input) => {`. This could throw out any possibility that the VM could generate optimized code, which means we might just be measuring a baseline (unoptimized) version that the VM could produce.
Can you elaborate on this specific case?
> For example this bench https://rawgit.com/thysultan/stylis.js/master/plugins/bytecode/tests/index.html, *see the console.
On my machine, in Chrome, "fromSingleArrayOfStrings" is the fastest.
> Can you elaborate on this specific case?
The operation is calling a function that returns another function, then `useIt` calls this function.
So, you're also measuring the cost of creating the function that converts the data structure to a string, which varies across implementations. This, paired with the fact that it's being generated on every operation, means that the function will also never get labeled as hot by the VM to produce optimized code, since every run is both executing and creating a new function.
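The hoisting being argued for can be sketched as follows (the converter and its names are made up for illustration):

```js
// A new closure built inside every measured operation can never be
// marked hot; hoist its creation so the same function object is reused.
function makeConverter(heap) {
  return (code) => code.map((addr) => heap[addr]).join(' ');
}

const heap = ['body', 'color', 'red'];
const convert = makeConverter(heap); // created once, outside the loop

let out = '';
for (let i = 0; i < 3; i++) {
  out = convert([0, 1, 2]); // same function object on every iteration
}
// out → 'body color red'
```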
> On my machine "fromSingleArrayOfStrings" is the fastest.
What were the results?
> The operation is calling a function that returns another function
Did you see that the first function is self-invoking? I scoped the static internals inside of the closure. At bench time there is only one fn call.
> then `useIt` calls this function.
I am not sure it does what it should, but my idea was to avoid smart optimizations which might end up doing nothing. I am still not sure if it does the job, nor if the `useIt` function was actually necessary in this case, but in any case all tests use it, so they are fair.
> So, you're also measuring the cost of creating the function that converts the data-structure to a string, which varies across implementations
I don't think so.
> What were the results?
Just refreshed a couple of times; the results vary between 2500 and 3500, and this time the "bytecode" variant was faster multiple times. I am not sure the way you measure is correct. Did you try benchmark.js? I guess they have bulletproof logic for measuring.
Just noticed you are using "while" in one case and "for" in another; can you try making both implementations as similar as possible?
Regarding not measuring the data generation time: I realize that we would have better parsing results when we don't generate the data for each test, but this would be less realistic, because in practice we would generate the data first, then parse; both steps are involved in the real-world scenario, right?
Regarding warming up and getting optimized hot code: I am not sure this is the right approach here. I am not sure that function will be called often enough in a real-world use case to become hot. It's not a web server performing the same operations all the time. It will parse, let's say, 100 style sheets on average and it's done. Also, if done right, it will never parse all of them at once, but rather only those which are required for the current UI state.
For more realistic statistics we could count amount of css rules on popular websites.
> Did you see that the first function is self-invoking?
I missed that :P
> Just refreshed a couple of times, the results vary between 2500 and 3500
Same here.
> I am not sure the way you measure is correct.
What part is incorrect?
> Regarding not measuring the data generation time: I realize that we would have better parsing results when we don't generate the data for each test, but this would be less realistic, because in practice we would generate the data first, then parse; both steps are involved in the real-world scenario, right?
No, you would not do this on every operation at runtime; the data structure would be created once at runtime. For example:

```js
// data-structure, created once
class Main extends React.Component {}

// X number of operations
ReactDOM.render(<Main />, document.body)
ReactDOM.render(<Main />, document.body)
```
> Just noticed you are using "while" in one case and "for" in another, can you try making both implementations as similar as possible?
Sure, I copied your implementation of `fromSingleArrayOfStrings` in the ESBench to compare with that.
> What part is incorrect?
It seems correct, but very simplistic; I think jsperf did a lot of work to get better/more stable results.
> No, you would not do this on every operation at runtime, the data-structure would be created once at runtime.
Yes, but it would also be parsed just once, I assume. After that it's either something static or some models.
In the bench it's being created on every operation, since the bench runs this operation > 10,000 times it affects the results.
> In the bench it's being created on every operation, since the bench runs this operation > 10,000 times it affects the results
You mean data generation for every test? Yes, that's what I was trying to reproduce. I assume a test case includes data generation + parsing in the real world.
We can for sure make separate tests with data generation and without, just to be sure that data generation doesn't eat up most of the performance budget. But in the end, what is important is a test which has both data-structure creation and parsing. Basically, if data generation costs too much time but parsing is faster it doesn't matter; what matters is what's faster when both are done.
What might be really important is to do tests with more realistic data, meaning instead of 1 rule, using an average rule count from some popular sites. Larger data sets might completely change the benchmark results.
In the real world the data structure should probably be created just once for the lifetime of the page; the toString implementation, on the other hand, could get executed more than once.
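That repeated toString pass over a once-built bytecode array could be sketched like this (marker values mirror the illustrative examples above and are assumptions, not the spec's actual ones):

```js
// Hypothetical marker values, matching the earlier examples.
const RULE_START = 1, RULE_END = 2, SELECTOR = 3, PROPERTY = 4, VALUE = 5;

// Walk the flat [marker, address, ...] pairs and emit a CSS string.
function toCSS(code, heap) {
  let css = '';
  for (let i = 0; i < code.length; i += 2) {
    const marker = code[i], addr = code[i + 1];
    switch (marker) {
      case SELECTOR: css += heap[addr] + '{'; break;
      case PROPERTY: css += heap[addr] + ':'; break;
      case VALUE:    css += heap[addr] + ';'; break;
      case RULE_END: css += '}'; break;
      // RULE_START emits nothing in this sketch
    }
  }
  return css;
}

const heap = ['body', 'color', 'red'];
const code = [RULE_START, 0, SELECTOR, 0, PROPERTY, 1, VALUE, 2, RULE_END, 0];
// toCSS(code, heap) → 'body{color:red;}'
```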
> What might be really important is to do tests with more realistic data, meaning instead of 1 rule, using an average rule amount from some popular sites.
I agree.