Implement `sass --embedded` in pure JS mode
Closes #2325.
Implementation
The actual isolate dispatcher and compilation dispatcher are nearly unchanged. However, I had to replace isolate with worker communication, and mock tons of small things that do not work on node.
Testing
- All Dart embedded tests are passing. - GitHub CI has been updated to run these in this PR.
- All JS API tests are passing. - GitHub CI has been updated to run these in this PR.
- All Ruby API tests are passing.
Some initial thoughts:
-
Are we confident this is simpler than writing an embedded compiler implementation directly in TS using the
sasspackage? That would duplicate some of the dispatcher logic, but it would mitigate a lot of the JS interop overhead. (I'm not saying that's definitely the right way, just that we should at least talk through the possibility.) -
The infrastructure here could make https://github.com/sass/dart-sass/issues/868 easier.
-
Can you provide a list of the "small things that do not work on node"?
-
I can start the process of pulling out
sync-message-port.
Are we confident this is simpler than writing an embedded compiler implementation directly in TS using the sass package?
My answer is yes... To begin with, currently the compilation dispatcher (isolate/worker) cannot be implemented in pure JS due to lack of feature. Specifically, no access to "formatted" log for customized logger and containingUrlWithoutMarking from the JS API. Because of that, we have to use the DartExportedJSFunction or simply just do dart2js.
My first implementation is exact a customized dart exported js function to start the compilation worker, and then a TS implementation for the worker dispatcher. The history still exist on GitHub:
- https://github.com/sass/dart-sass/commit/7084da7fa261bd6d640ed44632e4a81af8d8bc51
- https://github.com/sass/embedded-host-node/commit/ec05f2d7f9d57f987475f6d6345799001fae53b1
It was not difficult to get that to a functional state on basically level, but getting two packages connecting to each other and get testing setup working was extremely painful. The performance was extremely bad, due to lack of a ResourcePool thus allowing unlimited number of threads. Spawning node worker thread itself is of slow, and cost relatively high amount of resources. There is a test in js-api-spec that runs 1000 concurrent compilation, which started nearly 1000 threads and my whole system was suffering to the point it started to cause unrelated tests to timeout. Finally, although the API specs are mostly passing, the dart test was having tons of failures for handling ProtocolError, which is extremely difficult to completely align the behavior.
That's the point I decided to completely rewrite it and just switch to dart2js and reuse the existing logic.
That would duplicate some of the dispatcher logic, but it would mitigate a lot of the JS interop overhead.
To be honest, I don't think JS interop overhead matter that much here, because in the end what's the actual compute heavy Sass compilation happens on dart2js code anyways. The only thing that might have a meaningful performance impact is protobuf-es vs protobuf.dart running on dart2js, but protobuf-es is known to be slow: https://github.com/bufbuild/protobuf-es/issues/333, so probably also doesn't matter.
Finally, from a long term code maintainability perspective, it is easier to maintain one implementation instead of two.
Can you provide a list of the "small things that do not work on node"?
By "do not work on node" I meant the dart code do not work out of box without an alternative implementation. Basically every thing in lib/src/embedded/js/ folder:
stdin/stdout->process.stdin/process.stdoutwrapped withStreamControllerIsolate->Worker- The way they spawn are different: Isolates can partially share code memory stack within the same isolate group, they are launched with a entrypoint function that does not need to re-JIT all the code, see: https://github.com/dart-lang/sdk/issues/36097. On the other hand, workers do not share code memory, that each worker load a entrypoint script - the sass.js and its dependencies are quite heavy, thus spawning worker is a pretty high cost. This cost may potentially be reduced a tiny bit if we can split out main and worker entrypoints, lazily load dependencies, and bypass not used ones in specific code path. However, given
sass.jsis several megabytes in size, it will not help much because it will only make the main thread load faster, but the worker thread will still suffer slow start.
- The way they spawn are different: Isolates can partially share code memory stack within the same isolate group, they are launched with a entrypoint function that does not need to re-JIT all the code, see: https://github.com/dart-lang/sdk/issues/36097. On the other hand, workers do not share code memory, that each worker load a entrypoint script - the sass.js and its dependencies are quite heavy, thus spawning worker is a pretty high cost. This cost may potentially be reduced a tiny bit if we can split out main and worker entrypoints, lazily load dependencies, and bypass not used ones in specific code path. However, given
Isolate.exit->process.exitexitCode->process.exitCodeexitCodein dart is a shared variable across all isolates, but inprocess.exitCodein node is worker thread local. To get anyexitCodefrom worker on main thread we have to listen on worker's exit event, this means: in dart we can eagerly exit on failure, but on node we have to wait for worker to transfer their exitCode to main thread and then set the exit code in main thread.
- A wrapper interface for Mailbox and SyncMessagePort to look the same for compilation isolate.
Here are some benchmark results on 1.80.5 native binary vs pkg-npm-release (node v22.9.0).
I tested three cases. All of them on a single persisted compiler, lines that marked with /MT means multi-threaded (10 threads).
- Minimal Sass Input - compile
a {b: c}10000 times. - High Round Trips - invoke host function 10 times per compilation, and run 1000 compilations.
- Bootstrap - compile bootstrap 50 times.
On single-threaded test Dart is much faster than pure JS, especially in the compute heavy bootstrap test. However, in nearly all multi-threaded test, pure JS is not far from Dart, which is actually great.
sass-embedded 1.80.5 (Embedded Host) [Ruby]
dart-sass 1.80.5 (Sass Compiler) [JavaScript]
--- Minimal Sass Input ---
user system total real
sass-embedded 0.304688 0.081796 0.386484 ( 2.195485)
sass-embedded/MT 0.546339 0.260653 0.806992 ( 1.260589)
--- High Round Trips ---
user system total real
sass-embedded 0.268202 0.084800 0.353002 ( 2.135437)
sass-embedded/MT 0.316647 0.207332 0.523979 ( 0.558180)
--- Bootstrap ---
user system total real
sass-embedded 0.013900 0.006100 0.020000 ( 27.595710)
sass-embedded/MT 0.023358 0.012144 0.035502 ( 3.378303)
sass-embedded 1.80.5 (Embedded Host) [Ruby]
dart-sass 1.80.5 (Sass Compiler) [Dart]
--- Minimal Sass Input ---
user system total real
sass-embedded 0.255831 0.066576 0.322407 ( 1.207566)
sass-embedded/MT 0.323200 0.218337 0.541537 ( 0.578477)
--- High Round Trips ---
user system total real
sass-embedded 0.197848 0.069268 0.267116 ( 0.840963)
sass-embedded/MT 0.256294 0.226492 0.482786 ( 0.521811)
--- Bootstrap ---
user system total real
sass-embedded 0.013402 0.005420 0.018822 ( 7.295685)
sass-embedded/MT 0.015503 0.008968 0.024471 ( 2.538047)
Benchmark code
require 'benchmark'
require 'sass-embedded'
puts Sass.info
BOOTSTRAP = Gem::Specification.find_by_name('bootstrap').gem_dir + '/assets/stylesheets/_bootstrap.scss'
def run(n:, thread: 10, use_thread: false)
if use_thread
queue = Queue.new
n.times do
queue << nil
end
list = []
thread.times do
list << Thread.new do
loop do
queue.pop(true)
yield
rescue ThreadError
break
end
end
end
list.each(&:join)
queue.close
else
n.times do
yield
end
end
end
def sass_embedded_minimal(use_thread: false)
run(n: 10000, use_thread:) do
Sass.compile_string('a {b: c}', style: :compressed, logger: Sass::Logger.silent).css
end
end
def sass_embedded_round_trips(use_thread: false)
run(n: 1000, use_thread:) do
Sass.compile_string('@for $i from 1 through 10 { a:nth-child(#{$i}) { b: foo(); } }', functions: { 'foo()' => ->(_) { Sass::Value::String.new('test') } }, style: :compressed, logger: Sass::Logger.silent).css
end
end
def sass_embedded_bootstrap(use_thread: false)
run(n: 50, use_thread:) do
Sass.compile(BOOTSTRAP, style: :compressed, logger: Sass::Logger.silent).css
end
end
puts '--- Minimal Sass Input ---'
Benchmark.bm(16) do |x|
GC.start
x.report('sass-embedded') do
sass_embedded_minimal
end
GC.start
x.report('sass-embedded/MT') do
sass_embedded_minimal(use_thread: true)
end
end
puts '--- High Round Trips ---'
Benchmark.bm(16) do |x|
GC.start
x.report('sass-embedded') do
sass_embedded_round_trips
end
GC.start
x.report('sass-embedded/MT') do
sass_embedded_round_trips(use_thread: true)
end
end
puts '--- Bootstrap ---'
Benchmark.bm(16) do |x|
GC.start
x.report('sass-embedded') do
sass_embedded_bootstrap
end
GC.start
x.report('sass-embedded/MT') do
sass_embedded_bootstrap(use_thread: true)
end
end
This PR is on my radar, it's just been a very review-heavy few weeks for me and this is unfortunately on the low end of the priority spectrum.
GitHub Actions revealed an interesting race condition - running postMessage() immediately followed by process.exit() in the worker may not exit immediately, and while the worker has not finished exiting, the main thread could call worker.terminate() based on the received message, which will cause the worker to have exitCode = 1 instead of whatever exitCode was set by worker.
To overcome this issue, I decided to use the first byte we added to the message buffer to explicitly send the exitCode to the main thread, so we don't need to wait for the unreliable async exit event from worker.
Ok, one more quirk for node worker. The way stdout/stderr stream in the worker thread works is that the worker posts message to the main thread and then the main thread writes the message asynchronously. So we cannot eagerly exit the main isolate, or otherwise the error message that the worker printed to stderr might get lost.
@nex3 I can see that the team is focusing most of the time on the postcss sass parser and this pull request has low priority.
However, I would like to get this done in next few month before dart 3.8 become stable so that we can offer this as a replacement for dart ia32. Please take a look when you get a chance and let me know if you have any questions.
Would really be great to see this merged to have embedded support on FreeBSD even if in pure JS mode. Thank you!
Applied most of the feedbacks. However, I have a awkward feeling about mixing classic js interop and new js interop in the same file, which is why I have lots of js specific things with new interop are only in embedded folder instead of in the generic folder.
@nex3 I haven't changed these yet, as I want to hear what's your thought on this.
Another note: The reason I tried to avoid classic interop and node_introp is that I couldn't get channel.port2 passed to Worker() constructor correctly with node_interop. Neither allowInterop() nor explicit casting worked, and it was a nightmare trying to debug what went wrong. I looked at node_interop repo trying to see if there is any example for it and there is no actual usage of transferList anywhere.
The new dart 3.3 js interop on the other hands was way more straight forward to use and debug. I got Worker() working on my first attempt.
@nex3 Any feedbacks on my last two comments above?
@nex3 Pinging again regarding how to deal with legacy js package vs new js_introp:
- The legacy js package has been officially marked discontinued.
- Both the legacy js package and new js_introp provides
@JSannotations. This issue is addressed in 0.7.0 of legacy js:Breaking Change: Moved annotations to single location in dart:_js_annotations. This makes it easier to avoid collisions with dart:js_interop's
@JS, as you will now see a static error instead of it resolving in favor of the package definition. However, since this may result in breakages in previously working code, this is treated as a breaking change. - Ideally
node_interopshould be rewritten with newjs_interopbut nothing is happening. On the other hand,node_interopcannot be easily upgraded to js 0.7.x due to js 0.7.x requires dart >=3.1.x, whilenode_interop's dev dependenciesbuild_node_compilersdo not support dart 3.x at all. This meansnode_interopis effectively stuck at js 0.6.x as of now, which means we have@JSconflict that we cannot mix old js and new js_interop code in the same file.
So honestly I think the way js interop is handled in this PR is already at best effort. - That new code only uses js_interop and it's currently isolated from the legacy js code.
Any more thoughts? Would you mind take another look at this?
Sorry that this has taken me so long to finish reviewing, there have been a lot of things to juggle this year.
I'm working on a major change (https://github.com/sass/dart-sass/tree/js-interop) to move Dart Sass to the new JS interop system. This includes creating a new package for defining typings and wrappers for the JS core library (as well as possibly pushing some changes upstream to dart:js_interop) as well as updating node_interop which will probably include dropping its compiler dependency. You can see the work in progress on each of these at https://github.com/sass/dart_js_core and https://github.com/pulyaevskiy/node-interop/tree/js_interop.
At this point it may make sense to wait on landing this PR until that work is finished. We're already closing in on that, so it shouldn't be too much more time.
Rooting for this, as I can't use embedded sass on FreeBSD.
Rooting for this, as I can't use embedded sass on FreeBSD.
Me too as I use it on FreeBSD and it works great.
The JS interop work is close to being done, at which point this will be unblocked.
Would just like to add on top of using this to get Discourse on FreeBSD I just got Vikunja working. Both work flawlessly thank you for this change it's been a huge help.
@nex3 Any updates on the js_interop migration? I saw that you opened quite a lot issues in dart-sdk in this area - is there any blocker?
The only real blocker is merging a bunch of my JS type definitions which are currently living in a private package into Dart team owned sources. I'm in discussion with @srujzs about the best way to do that, and I may pitch in to help land it. If you'd like to help as well, we could probably loop you in.
Hi @nex. Do you have an ETA for this PR?