dotty-feature-requests
Make the Scala runtime independent of the standard library
It would be a nice achievement if "vanilla"(*) compiler-generated code was not dependent on the full standard library. Benefits include:
- easier interop with other languages,
- possibly smaller footprint,
- fewer upgrade and migration headaches due to better separation of concerns,
- easier to use alternative libraries.
This would require at least the following changes:
- Create a small, self-contained kernel library. This would have to contain tuple and function types, StringContext, Option, and what else? These classes would need to be refactored so that they have no dependencies on the standard library at large.
- Add an immutable array type to the runtime library, which erases to standard arrays, but is immutable.
- Make vararg parameters take an immutable array instead of a Seq. This has the potential of simplifying Java interop at the price of complicating interop with 2.13.
- Revise the compiler-generated Enum methods to not use Seqs or Maps, but to use immutable arrays instead.
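As a rough illustration of the immutable array item above, here is a minimal Scala 3 sketch using an opaque type alias, so that the type erases to a plain array at runtime while exposing no mutating operations. The names `kernelArrays`, `ImmArray`, and `fromArray` are hypothetical and chosen only for this example. The sketch is not dependency-free as written, since generic array access still goes through the existing Scala runtime.

```scala
object kernelArrays:
  // Hypothetical name. An opaque alias has no wrapper object at runtime,
  // so an ImmArray[T] is represented by an ordinary array.
  opaque type ImmArray[+T] = Array[? <: T]

  object ImmArray:
    // Copy the input so that later writes to `xs` cannot be observed.
    def fromArray[T](xs: Array[T]): ImmArray[T] = xs.clone()

    // Only read operations are exposed; there is no `update`.
    extension [T](arr: ImmArray[T])
      def length: Int = arr.asInstanceOf[Array[T]].length
      def apply(i: Int): T = arr.asInstanceOf[Array[T]].apply(i)
```

Outside `kernelArrays` the alias is opaque, so callers cannot reach the underlying array's `update` method without an explicit cast.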
(*) "Vanilla" means that there are some parts of the language that will require rich library support. E.g. a quoted expression will necessarily pull in the full tasty library. But it would be good to find a natural subset of the language that is guaranteed to work with only the kernel library.
Questions: Is this feasible? Are there issues I have overlooked so far? Should we make the effort?
Will the classes of the kernel library be visible or not? AFAIK, sbt itself lets us filter out scala-library with autoScalaLibrary := false. I think that not requiring the scala-library dependency would make Scala much more useful in many situations. For example, a very good recent use case is how to write a library in Scala that does not itself depend on scala-library: https://github.com/lightbend/config/pull/600#issuecomment-443565128
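For reference, a minimal sketch of how the `autoScalaLibrary := false` setting mentioned above might look in a build definition. The project name `core` and its directory are hypothetical. Note that this only stops sbt from adding scala-library as a managed dependency; it does not guarantee that the compiled classes are free of references to `scala.*`.

```scala
// build.sbt (sketch; the project name `core` is hypothetical)
lazy val core = (project in file("core"))
  .settings(
    // Do not add scala-library as a managed dependency of this project.
    autoScalaLibrary := false,
    // Publish without the Scala version suffix, as for a plain Java library.
    crossPaths := false
  )
```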
My first reaction is to the memory footprint of an application. Most of my Scala apps depend on some Java library. Therefore, the idea of an independent runtime makes me wonder if the memory footprint will increase given that Java’s runtime would be linked.
Related: I think that the Scala and JVM ecosystems at large need to be sensitive to their memory footprint, particularly due to the rise of Go and Rust.
@hepin1989 The classes of kernel library would be visible. TupleN, FunctionN, StringContext are all programmer accessible.
@huntc The Java libraries would still be included (on JVM at least, would be different for Scala.js and Scala.native). The only thing that would change is that you could write Scala programs that do not rely on the Scala library, if you are careful.
What else? Symbols and boxed types spring to mind.
Extension methods defined in the standard library could even provide most of the (currently familiar) methods of these "kernel" classes so they wouldn't need to be "fully featured" implementations.
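A minimal Scala 3 sketch of that idea, assuming a stripped-down option type in the kernel and the familiar combinators supplied separately as extension methods. All names here (`kernelOpt`, `stdlibOpt`, `Opt`, `Present`, `Absent`) are hypothetical and are not part of any existing library.

```scala
// Hypothetical "kernel" part: only the data representation, no combinators.
object kernelOpt:
  sealed abstract class Opt[+A]
  final class Present[+A](val value: A) extends Opt[A]
  object Absent extends Opt[Nothing]

// Hypothetical "standard library" part: the familiar methods,
// added from the outside as extension methods.
object stdlibOpt:
  import kernelOpt.*

  extension [A](o: Opt[A])
    def map[B](f: A => B): Opt[B] = o match
      case p: Present[A] => Present(f(p.value))
      case _             => Absent

    def getOrElse[B >: A](default: => B): B = o match
      case p: Present[A] => p.value
      case _             => default
```

Code that imports only `kernelOpt` sees a bare data type; code that also imports `stdlibOpt.*` gets `map` and `getOrElse` with the usual call syntax.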
I think this would make a big difference for Scala.js; currently it's not feasible to use Scala.js for tiny scripts because it has to pull in and optimize the entire std lib, meaning a de-facto minimum size of 100kb or so. Having std-lib-free Scala code would both make the output executables tiny for the cases where that matters (bookmarklets, microcontrollers, etc) and also make the output code much easier to skim (since you don't need to mentally filter out the leftover bits of the entire collections library). It also means you would be able to write tiny Scala.js libraries that would fit in much better with the non-Scala.js package and module systems (which normally assume lots of tiny individual modules).
This would also make it more feasible to use Scala for tiny bootstrap scripts; currently, Mill's launcher is written in Java because the added classloading needed to use scala.Predef (even just println) easily adds 200-400ms of initialization overhead. It would be nice to be able to write it in Scala, even if that means avoiding all those collections.
@lihaoyi Scala.JS already does all the DCE it can do, so in theory you should already be able to achieve this just by carefully avoiding using too much stuff from the standard library (which would probably require developing your own mini standard library), I don't think the split proposed here would make a difference (/cc @sjrd who can correct me if I'm wrong here).
> What else?
See this fantastic SO answer by Seb: https://stackoverflow.com/questions/40076047/what-types-are-special-to-the-scala-compiler/40083398#40083398
> Scala.JS already does all the DCE it can do, so in theory you should already be able to achieve this just by carefully avoiding using too much stuff from the standard library
Yes, if you stick to JS types, never use Scala functions or collections, then Scala.js will dce the standard library away. Once fullOpt'ed, the smallest Scala.js program (only a main with js.Dynamic.global.console.log("hello"), not println) weighs 8 KB in Scala.js 1.x master. Most of the code in there is about the implementation of Longs and some metadata for boxed classes and java.lang.Class. Not a single Scala class in sight (except scala.runtime.BoxedUnit, but that could change).
I agree it would be a wonderful achievement (one I've even pushed for at one point, using the nomenclature "core" rather than "kernel").
But I think, even if the compiler's generated code doesn't depend on it, the standard library defines the language just as much as the compiler does. I mention that because I'm not sure I understand the benefits, aside from the footprint one.
The one benefit that I would love to see is a kernel library that is backwards binary-compatible across major releases, so scala-kernel.jar, not scala-kernel_3.0.jar. That would allow creating libraries in Scala that are:
- cross-platform (JVM/Scala.js/ScalaNative), and
- Java-friendly, by not pulling in the problematic scala library
> Scala.JS already does all the DCE it can do, so in theory you should already be able to achieve this just by carefully avoiding using too much stuff from the standard library (which would probably require developing your own mini standard library)

Well, that's exactly what this proposal is, isn't it? Having everyone agree on a mini standard library ("kernel") so people can know what to use to avoid using stuff from the standard library.
> Yes, if you stick to JS types, never use Scala functions or collections, then Scala.js will dce the standard library away.
One advantage of having an explicit Scala mini standard library is that you can write cross-platform code without a std lib dependency. JS types don't work on the JVM, JVM types don't work on JS, whereas the proposed Scala mini-standard-library/kernel would work on both.
> Make vararg parameters take an immutable array instead of a Seq. This has the potential of simplifying Java interop at the price of complicating interop with 2.13.
Maybe another option is to introduce an abstraction for repeated parameters, i.e. a trait Repeated[+A]? It would have a minimal, array-like interface so that it would not require collections library. At the same time Seq would inherit from Repeated so that collections could be passed into varargs methods without copying them into immutable arrays.
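A minimal sketch of what such a `Repeated[+A]` abstraction could look like, assuming an interface of just `length` and `apply`. The implementations `ArrayRepeated` and `Countdown` and the helper `sum` are hypothetical; `Countdown` stands in for a collection type that extends `Repeated` directly and can therefore be passed without being copied into an array.

```scala
object varargsSketch:
  // Minimal, array-like interface with no dependency on a collections library.
  trait Repeated[+A]:
    def length: Int
    def apply(i: Int): A

  // Array-backed implementation, as the compiler might use for literal arguments.
  final class ArrayRepeated[A](elems: Array[A]) extends Repeated[A]:
    def length: Int = elems.length
    def apply(i: Int): A = elems(i)

  // Stand-in for a collection that implements Repeated itself:
  // yields from, from - 1, ..., 1 without allocating an array.
  final class Countdown(from: Int) extends Repeated[Int]:
    def length: Int = from
    def apply(i: Int): Int = from - i

  // A "varargs-style" consumer that needs nothing beyond the Repeated interface.
  def sum(xs: Repeated[Int]): Int =
    var acc = 0
    var i = 0
    while i < xs.length do
      acc += xs(i)
      i += 1
    acc
```

Both implementations can be handed to `sum` unchanged, which is the property the proposal relies on for passing collections to varargs methods.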
I have a use case for this feature which could help provide an extra motivation on top of architectural elegance.
I am helping a project that is developing a Categorical Database transformation tool based on work on Functorial Databases (databases understood as functors from small categories to Set). They produce a jar that is 14MB in size.
I needed to write a bit of Java to parse RDF Nodes (URIs and Literals) that can be loaded from the class path. Initially I took some code from banana-rdf and quickly got a very flexible prototype that way. I tried packaging it in one jar to put it in the class path and found that jar to be 35MB in size, which is ok for a prototype, but a bit problematic if there is a bug (which there was) as it makes it difficult to tell if the bug is in my code or theirs. It is also a bit awkward to have what was meant to be a plugin be twice as large as the product. That was not going to help me convince them to allow me to use Scala to program...
So I decided to extract the parser from banana-rdf, adapt it a bit, and compile it. I had written it to use only types like Int and Char for efficiency, and all I had to do was remove some of the nice features of Scala I had used. The aim was to not depend on the Scala libraries, as that still adds a 5MB jar. So I wrote up a question on Scala Users, "Writing Java in Scala", and got some very helpful feedback, which also pointed me to this issue.
In the end I got the code to produce a 14K jar. But I can't easily tell if I still have some dependency on the Scala library (through boxing, for example).
In short, there are four use cases: 1. testing, 2. interacting with Java, 3. having a good opening when introducing Scala to a team using Java, and 4. allowing people like me to not have to remember the muscle memory of Java syntax.