NDArray
[RFC] NDArray Protocol
NDArrayProtocol
For a multidimensional array library it's extremely important to be able to write specialized implementations of its operations for a variety of cases while still being able to use them interchangeably. In Swift this is done via protocols, which is usually straightforward. For protocols with an associatedtype, however, it is a bit trickier because the compiler rejects their use as plain types:
protocol '<SomeProtocolWithAssociatedType>' can only be used as a generic constraint because it has Self or associated type requirements
This happens when the protocol is the return type of a function or, similarly, when you create an array of values of that protocol type. Yet hiding implementation details and being able to hold an array of NDArrays (for example, to implement concatenation) is a must-have for NDArray.
Type Erasure
The solution commonly used in Swift is to implement type-erasing types, often called Any*, such as AnySequence and AnyIterator in the standard library or AnyView in SwiftUI. There seem to be various strategies to achieve type erasure, most (all?) relying on the wrapper storing all relevant operations of the underlying type as closures. In this way it can indirectly reference the value it contains while keeping the compiler happy. The first immediate downside of this strategy is that it requires a lot of boilerplate.
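As a minimal sketch of the closure-based strategy described above, the following example erases a small protocol with an associated type. Container, AnyContainer, ArrayContainer, and RepeatedContainer are hypothetical names made up for illustration; they are not part of NDArray or the standard library.

```swift
// Hypothetical protocol with an associated type (illustration only).
protocol Container {
    associatedtype Element
    var count: Int { get }
    func element(at index: Int) -> Element
}

// Type-erasing wrapper: it keeps the wrapped value's operations as closures,
// so the concrete type never appears in AnyContainer's own signature.
struct AnyContainer<Element>: Container {
    private let _count: () -> Int
    private let _element: (Int) -> Element

    init<C: Container>(_ base: C) where C.Element == Element {
        _count = { base.count }
        _element = { base.element(at: $0) }
    }

    var count: Int { _count() }
    func element(at index: Int) -> Element { _element(index) }
}

// Two unrelated concrete implementations.
struct ArrayContainer<Element>: Container {
    let items: [Element]
    var count: Int { items.count }
    func element(at index: Int) -> Element { items[index] }
}

struct RepeatedContainer<Element>: Container {
    let value: Element
    let count: Int
    func element(at index: Int) -> Element { value }
}

// Both implementations can now live in the same array.
let containers: [AnyContainer<Int>] = [
    AnyContainer(ArrayContainer(items: [1, 2, 3])),
    AnyContainer(RepeatedContainer(value: 7, count: 2)),
]
let counts = containers.map { $0.count }                  // [3, 2]
let firstElements = containers.map { $0.element(at: 0) }  // [1, 7]
```

Every requirement of the protocol needs its own stored closure and forwarding member, which is where the boilerplate mentioned above comes from.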
Implementation Strategy
The strategy is to make the following changes:
- Define the NDArrayProtocol
- Rename the current NDArray struct as BaseNDArray
- Implement NDArray as a type-erasing struct
The idea is that the user only handles NDArrays while the implementer can create highly optimized types for certain operations.
POC
Next is a POC of how the general architecture might look. Note the following:
- In the current proposal subscript is not part of the NDArrayProtocol; instead the subscript_get and subscript_set methods are defined, and it's the NDArray type that uses these to implement subscript.
- Without loss of generality, BaseNDArray is defined as only containing scalar data.
- All the _* constant closures on NDArray wrap an operation of the underlying type.
- DoubleNDArray is just a simple struct that doubles its given internal value; it's defined just to test the proposed strategy.
protocol NDArrayProtocol {
    associatedtype Scalar

    func subscript_get(_: Int) -> Self
    mutating func subscript_set(_: Int, _: NDArray<Scalar>)
    var first: Scalar { get }
}

// Type-erasing wrapper: the only type the user handles directly.
struct NDArray<Scalar> : NDArrayProtocol, CustomStringConvertible {
    // Each closure wraps one operation of the underlying type.
    let _subscript_get: (Int) -> Self
    let _subscript_set: (Int, Self) -> Void
    let _first: () -> Scalar

    var first: Scalar { _first() }

    init(_ v: Scalar) {
        self.init(ndarray: BaseNDArray(v))
    }

    init(ndarray: Self) {
        self = ndarray
    }

    init<N: NDArrayProtocol>(ndarray: N) where N.Scalar == Scalar {
        // Local mutable copy captured by all three closures.
        var ndarray = ndarray
        _subscript_get = { NDArray(ndarray: ndarray.subscript_get($0)) }
        _subscript_set = { ndarray.subscript_set($0, $1) }
        _first = { ndarray.first }
    }

    func subscript_get(_ i: Int) -> NDArray<Scalar> {
        _subscript_get(i)
    }

    mutating func subscript_set(_ i: Int, _ v: NDArray<Scalar>) {
        _subscript_set(i, v)
    }

    // subscript is implemented here in terms of the wrapped get/set closures.
    subscript(r: Int) -> NDArray<Scalar> {
        get { _subscript_get(r) }
        mutating set(v) {
            _subscript_set(r, v)
        }
    }

    var description: String {
        "NDArray<\(Scalar.self)>(\(_first()))"
    }
}

extension NDArray where Scalar: Numeric {
    func getDoubler() -> Self {
        NDArray(ndarray: DoubleNDArray(first))
    }
}

// Formerly NDArray; holds only scalar data in this POC.
struct BaseNDArray<Scalar>: NDArrayProtocol {
    var data: Scalar
    var first: Scalar { data }

    init(_ v: Scalar) {
        data = v
    }

    func subscript_get(_: Int) -> Self {
        self
    }

    mutating func subscript_set(_: Int, _ v: NDArray<Scalar>) {
        data = v.first
    }
}

// Test-only implementation that doubles whatever value it is given.
struct DoubleNDArray<Scalar>: NDArrayProtocol where Scalar: Numeric {
    var data: Scalar
    var first: Scalar { data }

    init(_ v: Scalar) {
        data = v * 2
    }

    func subscript_get(_: Int) -> Self {
        self
    }

    mutating func subscript_set(_: Int, _ v: NDArray<Scalar>) {
        data = v.first * 2
    }
}
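To see how the POC might be used, the sketch below repeats a trimmed copy of the types above (description and the init(ndarray: Self) overload are left out, and the subscript setter uses the implicit newValue) so that it compiles on its own. The usage lines at the end are an illustration of the intended behavior, not part of the proposal itself.

```swift
// Trimmed copy of the POC types so the snippet is self-contained.
protocol NDArrayProtocol {
    associatedtype Scalar
    func subscript_get(_: Int) -> Self
    mutating func subscript_set(_: Int, _: NDArray<Scalar>)
    var first: Scalar { get }
}

struct NDArray<Scalar>: NDArrayProtocol {
    let _subscript_get: (Int) -> NDArray<Scalar>
    let _subscript_set: (Int, NDArray<Scalar>) -> Void
    let _first: () -> Scalar

    var first: Scalar { _first() }

    init(_ v: Scalar) { self.init(ndarray: BaseNDArray(v)) }

    init<N: NDArrayProtocol>(ndarray: N) where N.Scalar == Scalar {
        var ndarray = ndarray
        _subscript_get = { NDArray(ndarray: ndarray.subscript_get($0)) }
        _subscript_set = { ndarray.subscript_set($0, $1) }
        _first = { ndarray.first }
    }

    func subscript_get(_ i: Int) -> NDArray<Scalar> { _subscript_get(i) }
    mutating func subscript_set(_ i: Int, _ v: NDArray<Scalar>) { _subscript_set(i, v) }

    subscript(i: Int) -> NDArray<Scalar> {
        get { _subscript_get(i) }
        set { _subscript_set(i, newValue) }
    }
}

extension NDArray where Scalar: Numeric {
    func getDoubler() -> NDArray<Scalar> { NDArray(ndarray: DoubleNDArray(first)) }
}

struct BaseNDArray<Scalar>: NDArrayProtocol {
    var data: Scalar
    var first: Scalar { data }
    init(_ v: Scalar) { data = v }
    func subscript_get(_: Int) -> BaseNDArray<Scalar> { self }
    mutating func subscript_set(_: Int, _ v: NDArray<Scalar>) { data = v.first }
}

struct DoubleNDArray<Scalar: Numeric>: NDArrayProtocol {
    var data: Scalar
    var first: Scalar { data }
    init(_ v: Scalar) { data = v * 2 }
    func subscript_get(_: Int) -> DoubleNDArray<Scalar> { self }
    mutating func subscript_set(_: Int, _ v: NDArray<Scalar>) { data = v.first * 2 }
}

// Usage: two different implementations behind the same NDArray<Int> type.
let plain = NDArray(3)                    // backed by BaseNDArray
let doubled = NDArray(3).getDoubler()     // backed by DoubleNDArray
let arrays: [NDArray<Int>] = [plain, doubled]
let firsts = arrays.map { $0.first }      // [3, 6]

// Writes go through the wrapped subscript_set of each implementation.
var a = NDArray(1)
a[0] = NDArray(10)                        // BaseNDArray stores 10
var d = NDArray(1).getDoubler()
d[0] = NDArray(10)                        // DoubleNDArray stores 10 * 2
```

The caller only ever names NDArray<Int>, yet the base and the doubling implementation sit in the same array, which is the interchangeability the strategy is after.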