Proposal: Interop 2.0
Interop 2.0
Scala Native 0.4 is going to contain a revamped interop layer based on the feedback we've gathered so far. Here is a rough overview of the updated features:
Extern members

- `@extern object` no more.
- `def $name(..$args): $T = extern` can appear in any class, module or trait.
- `val $name: $T = extern` can appear in any class, module or trait.
- `var $name: $T = extern` can appear in any class, module or trait.
- Extern members must provide an explicit return type (enforced by the compiler).
- Binds to the native symbol `$name`, or to a custom name given via `@name("...")`.
- `@link("...")` can appear on any class, module, trait, method or field. If the given definition is reachable at link time, the linker will link with the corresponding native library automatically.
Structs

- `CStructN` no more.
- `@struct class $name(...$params) extends ..$traits { ..$members }`
- Can have an arbitrary number of methods and fields, both mutable and immutable.
- Can extend traits.
- Can be a case class.
- Implicitly stack-allocated:

  ```scala
  @struct class Point(var x: Int, var y: Int)

  val point = new Point(10, 20) // implicitly allocates a stack slot
  point.x = 20                  // updates stack-allocated memory
  point.y = 30
  ```

- Resides in unboxed form on the stack, inside managed objects, and inside managed and fixed arrays and other structs:

  ```scala
  val array = new Array[Point](10)
  array(0).x = 10 // updates array-allocated memory
  array(0).y = 10
  ```

- Boxes whenever passed to a generic erased context (i.e. upcast to `Any`).
- Transparently copied by value across different storage locations.
- Passing them to extern methods respects the C ABI.
Fixed-size arrays

- `CArray` no more. TODO: needs a new name.
- Has a standard literal syntax for initialization.
- New type for fixed-size arrays.
- Implicitly stack-allocated for local variables.
- Resides in unboxed form on the stack, inside managed objects, and inside managed and fixed arrays and other structs.
- Boxes whenever passed to a generic erased context (i.e. upcast to `Any`).
- Transparently copied by value across different storage locations.
- Passing them to extern methods respects the C ABI.
Pointers

- `Ptr[T]` with most operations unchanged.
- `_N` methods are gone. A new implicit conversion `implicit def inner[T <: AnyRef](ptr: Ptr[T]): T`, together with extra checks in the compiler to make sure it only applies to structs, enables calling methods and reading fields directly through a pointer:

  ```scala
  var points: Ptr[Point] = stackalloc[Point](100)
  points.x = 10 // updates points(0).x
  points.y = 20 // updates points(0).y
  points += 1
  points.x = 30 // updates points(1).x
  points.y = 40 // updates points(1).y
  ```

- Boxes whenever passed to a generic context but remains unboxed inside arrays.
Function pointers

- `CFunctionPtr` as before, with known bugs fixed.
C numeric types

- Type aliases in `scalanative.native.*` as before.
Unsigned numeric types

- They become `@struct` classes instead of scalac-style value classes. This enables storing them in unboxed form inside arrays.
- Otherwise unchanged.
Varargs

- Similar to what we've had before, only `CVararg` becomes a first-class object and can be abstracted over.
Generic intrinsic methods

- Methods that take `scalanative.native.Tag` can be abstracted over generically by providing the correct implicits. This may incur performance overhead due to boxing. Specialized code remains the recommended option for performance-sensitive code.
TODO

- TODO: `@union` with similar semantics to `@struct`
> Structs ... Can have an arbitrary number of methods and fields, both mutable and immutable ... Transparently copied by value across different storage locations.
Would mutating a field of an unboxed struct also change all boxed representations of the same struct, and vice versa? Or would they contain different values that are hard to debug due to the transparency of copying?
They are not objects, therefore you are not guaranteed to get the same box every time. This is by design.
@densh, I get this; this is also true for non-small integers in HotSpot. What I'm asking about is this:
```scala
@struct class Struct(var x: Int)

val p = new Struct(0)
val array = new Array[Any](1)
array(0) = p
p.x = 1
println(array(0).asInstanceOf[Struct].x)
```
What is this supposed to print under proposed semantics?
The expected semantics for your example are as follows:
```scala
@struct class Struct(var x: Int)

val p = new Struct(0)         // p is initialized with 0 on the stack
val array = new Array[Any](1) // array is initialized with null
array(0) = p                  // creates a value copy of p in a new box, puts the box into the array
p.x = 1                       // updates p to contain 1; has no effect on the box inside the array
println(array(0).asInstanceOf[Struct].x) // prints the value inside the array, which is 0 at this point
```
@densh, thanks for the clarification!
Can we represent unions in a similar way? I would love a syntax like this:
```scala
@struct class Struct {
  val x: CInt = 0
  val y = Union {
    val a: CInt = 0
    val b: CLong = 0
  }
}
```
But we can't do that when we want to use constructor parameters I guess:
```scala
@struct class Struct(val x: CInt, val y: Union[CInt, CLong] ???)
```
Plus we would be back to the problem of 22 parameters. Any ideas?
Any ideas for improving interoperability and type safety concerning Scala enums (ADT or scala.Enumeration based) and C enums? For example, by introducing an @enum annotation and/or a trait for extracting the C enum value.
Am I correct to think that all parameters are always passed by value? (Unless I pass a Ptr[T] by value, i.e., a reference.)
Any support in the compiler for immutability, such as const struct* and const struct const *? Not sure if that would be useful at all, or whether the compiler could provide additional guarantees from that sort of information.
Is it still possible to allocate a struct somewhere other than on the stack?
I think the proposal looks great but have a few questions and concerns.
Currently we have stack allocation as follows:

```scala
def stackalloc[T](implicit tag: Tag[T]): Ptr[T] = undefined
def stackalloc[T](n: CSize)(implicit tag: Tag[T]): Ptr[T] = undefined
```

which conceivably could be as follows, if the default did not create an array:

```scala
def stackalloc[T](n: CSize = 1)(implicit tag: Tag[T]): Ptr[T] = undefined
```
In C and C++ the second version would be equivalent to creating an array on the stack, where the array decays to a pointer to its element type:

```c
int arr[3];
int* pa = arr;
```
This seems great so far, even though stackalloc is perhaps a bit verbose. In C, getting a pointer to an individual object on the stack requires taking the object's address, which is like the first variant of the stackalloc function above. Example:

```c
int i = 2;
int* pi = &i;
```
In C/C++, creating a struct and allocating it on the stack would look like the following, assuming we typedef the struct or name it, which is similar to what we are used to in Scala. Example of a typedef'd and named struct with stack allocation:

```c
typedef struct A {
    int x;
} A;

A a;
a.x = 42;
```
My first concern is that using the new keyword to create the struct on the stack is a little confusing, especially because in C++ the new keyword would then need a matching delete call. Even though we would like seamless integration, we also need to consider whether a context shift is desirable ("I'm dealing with C now"), especially with the new boxing rules, which require thought. I'm not sure if it is possible, but what about using an annotation rather than new?
I'm not sure what is wrong with the name CArray. Is there a plan for deprecation, or can we just break the code from 0.3.x to 0.4.x? That is not as nice to the users, but it is early times. If so, the current stackalloc could work if we had a version that explicitly took a pointer to return a pointer to an object or array:
```scala
val a = stackalloc[A]            // stack-allocated A struct
val pa = stackalloc[Ptr[A]]      // stack-allocated pointer to an A struct
val parr = stackalloc[Ptr[A]](2) // stack-allocated pointer to an array of A of length 2
```
The above leaves off the initialization of C arrays in the proposal.
It would make sense to possibly leave off the new portion altogether, but since Scala has companion objects with apply methods, and the proposal allows @struct case class, this is too similar to calling apply on the companion object, which hides the new anyway, so this doesn't help. We could use the companion to create a value on the stack too, I assume.
To sum up I'd like to point out my assumptions again.
- We want really easy access to C, with additional safety provided by `Zone`s and `Resource` blocks and some good compiler checks and messages. This includes `struct`s and `union`s.
- There is a need to know C if using C, and we should keep some delineation that we are working with C; otherwise it may be hard to make a context switch from C to Scala and back.
- In practice C usage will be wrapped by Scala and provided to users so the users do not need to worry or need to know C. This is similar to the base Java libraries that use C in Scala Native now.
- Scala Native will have great optimizations that make code run super fast, so in most cases wrapping C libraries is for reuse of the C, not for additional performance.
- We don't want to introduce tooling issues so the code should be pure Scala that tools understand.
Please consider making stack allocation of C objects and arrays similar to, yet slightly distinct from, Scala. The ability to easily distinguish a C pointer or value created on the stack would also be helpful. If we have something good currently in the platform but changing it would break backward source compatibility, I personally think that is okay at this early stage. As the platform matures, deprecation takes on more meaning given a larger body of code.
@ekrich I understand your concern about stack allocation using 'new' being unexpected for C++ developers, but think it might be a little overstated. Hotspot already does stack allocation, transparently, when deemed desirable. I concede that this is not identical—Hotspot will always preserve original semantics, whereas SN @struct would alter semantics—and yet, Scala users already have no expectation of following a new by an explicit delete. C++ developers adopting Scala Native will find no delete to call, and will have to eventually grok Scala, so a headscratch and 5 minutes of fumbling is probably the worst case scenario here ;-)
@densh Will this resolve #555?
Just a quick followup. There are two different types of stack allocation in C: objects and arrays. It can be shown as follows in Scala-like syntax, as in the proposal (one-dimensional):

```scala
// T needs to be a C type
val t: T = new T
// uninitialized array with n elements
val pt: Ptr[T] = new CArray[T](n: CInt)
// with an initialization list of n elements
val pt: Ptr[T] = new CArray[T] { T1, T2, ... Tn }
```
A lot of the discussion I've seen around C interop seems to me to consider two perspectives:

- native bindings to be made available via the standard library
- ad-hoc native bindings as part of Scala Native projects

I use the term 'ad-hoc' above because the needs of binding libraries present additional considerations.
I'm working on a pet project for which I need a simple 2d physics engine. I decided to go for a library binding for Chipmunk Physics. Since there is no vanilla Scala library binding for Chipmunk, I figured I might as well target Scala & SN both. Keeping both bindings usable while minimizing platform shims is more painful than it has to be (though changes listed in this issue are a big improvement). I don't think my situation is very unique, so I assume there's a case to be made for adopters interested in implementing library bindings, who would also like to reuse the resulting codebase for vanilla Scala bindings, all while minimizing platform-specific shims.
Once the interop API stabilizes, this need can be met by adding a supported scala-native-compat library for vanilla Scala, aiming at including a sensible subset of SN FFI features. Such a library can be implemented on top of an existing JVM FFI library—for example: @struct classes can be rewritten by macro to a jnr-ffi Struct; Ptr[T] can be adapted to jnr-ffi Pointer. Making the interop API vanilla-friendly will help drive adoption for Scala Native: since SN is about to have the best C interop API available in the Scala ecosystem, it would encourage developers to produce library bindings for SN even when they originally intend to only target vanilla Scala.
So, I propose that a scala-native-compat vanilla library be added for future consideration, and that some thought be given to gap analysis between it and the evolving interop API.
@asoltysik Added a TODO for unions.
@jonas C enums are just weakly-typed named numeric constants. The most direct mapping to Scala is to represent them as numeric constants. We don't have to add a completely new feature to support this.
@frgomes Parameters are pass-by-value. Lack of support for C-style const hasn't been a problem so far. We'll probably keep it this way.
@muxanick Yep, you can allocate structs anywhere and then access their state by accessing fields and methods directly on the pointer. This is going to be possible thanks to the implicit conversion mentioned above.
@ekrich We can't add new syntax in Scala Native, we can only overload existing language features to work differently for interop types. This might lead to some confusion but there is nothing we can do about it.
@ekrich Fixed-size arrays have different semantics from the CArray. We'll do a deprecation cycle for 1 release for all old-style interop features which have been superseded by newer counterparts.
@nadavwr Yep, this resolves #555 and a few more issues. It looks like we'll be able to close most of interop-related bugs and feature requests once the PR for this materializes.
@nadavwr Making our interop layer work on Scala JVM would be really cool, but it's out of scope for Scala Native.
@nadavwr: Regarding interop with the JVM, maybe scala-bindgen could provide and/or integrate some sort of glue layer. Some sources of inspiration, without any target date:
https://github.com/frgomes/scala-bindgen/issues/3
https://github.com/frgomes/scala-bindgen/issues/5
Maybe you could open a new issue describing your idea in more detail? Thanks
@densh Since we are going to have implicit stack allocation via new for structs, new C arrays, etc., could you consider removing stackalloc altogether?
```scala
// currently
var points: Ptr[Point] = stackalloc[Point](100)
// proposed
var points: Ptr[Point] = new Ptr[Point](100)
```
Basically, every C construct is stack allocated via new so everything is consistent.
When dealing with heap allocation maybe we could do the same as stack allocation but inside a Zone.
```scala
Zone { implicit z =>
  // current
  var points: Ptr[Point] = z.alloc[Point](100) // returns a pointer
  // proposed
  var points: Ptr[Point] = new Ptr[Point](100)
}
```
This would pretty much hide the implementation behind syntax like a for comprehension.
Edit: I guess I didn't think about cases where you stackalloc and alloc in a Zone. Not sure how you would distinguish between the two cases.
Here is an issue I submitted to utest where Scala Native is not super friendly to the library. This may help influence design decisions: https://github.com/lihaoyi/utest/issues/118
Could CFunctionPtr simply become FPtr, which is a much shorter name?
It looks like we can start implementing this in 0.3.x and slowly deprecate the older versions of the APIs which are going to go away in 0.4. This will grant a bigger window of compatibility for library authors and a smoother migration. 0.4 is going to just be a "remove all deprecations" release.
Given the scope of breaking changes here, it seems #1197 fits right in.
Nesting of extern can be inside an object but not a class. See scala.scalanative.unsafe.ExternTest for an example. Switching to JUnit required moving them to the top scope.
Here is a little sandbox code for a @struct test that passes and receives a struct, which could be used for testing:
```scala
import scalanative.unsafe.extern
import scalanative.unsafe.CFloat
import scala.scalanative.runtime.struct
import complex.cacosf

@struct class FloatComplex(var re: CFloat, var im: CFloat)

@extern object complex {
  def cacosf(complex: FloatComplex): FloatComplex = extern
}

object Test {
  def main(args: Array[String]): Unit = {
    println("Hello, World!")
    val fcin = new FloatComplex(1.0f, 1.0f)
    // println(fcin) // won't work, would have to box
    val res = cacosf(fcin)
    // println(res) // same as above
  }
}
```