soa-derive Allow to nest structs of arrays

Basic idea

This would be using a new helper attribute to indicate that you'd like to add the fields as part of the SoA:

#[derive(StructOfArray)]
struct Point {
    x: f64,
    y: f64,
}

// Proposed added behaviour: generates a `PointVec` and a `Vec<f64>`
#[derive(StructOfArray)]
struct Raster {
    #[nested_soa]
    coords: Point,
    value: f64,
}

This would also effect pointers, references and slices. The idea is that the structure of the data is kept but it allows to use SoA all the way down.

Advantages

It's an opt-in, so it's not a breaking change in any way, shape or form. It uses an attribute, which means that it can be easily changed (nice to have to measure if it affects performance).

Also this would give this project an edge because it's very hard to implement in a lot of other languages, specially in a strongly typed, low-level language like Rust (where performance really matters).

The implementation should be straightforward and rely on having the same interface as Vec. As such the generated code will call the functions that just so happens to exist in both Vec and PointVec.

How would it work

First, we need to add a few traits similar to StructOfArray but for the pointers, references and slices. The attribute works as a marker. We generate the types based on it. For instance, if a field has the attribute we generate <Example as ::soa_derive::StructOfArray>::Type instead of the regular Vec<Example>. Pointers, references and slices follow a similar pattern with the other traits. And that's it. It should just work because we have the same interfaces :)

Oct 16 '20 10:10 mangelats

I realized this while working on the single length. I think this is more useful so I'll make it before the single length :)

Oct 16 '20 10:10 mangelats

This sounds like a good idea, however it might not be possible to do it for slices without GAT, cf https://github.com/lumol-org/soa-derive/pull/25#issuecomment-674342046.

Oct 16 '20 10:10 Luthaf

That's the reason it's necessary to use multiple traits. We need lifetime templated traits. Here a simple example in the playground

Oct 16 '20 10:10 mangelats

Well, that sounds good then =)

Oct 16 '20 12:10 Luthaf

Hey! Sorry for the inactivity these days. Here is an update on what I'm working right now.

I'm using the discussed traits that define the nested fields' types and it's working wonderfully :)

The current problem is how to handle the difference in what kind of expression we need depending of the type, for example particle_ptr.mass as *mut f64 vs particle_ptr.as_mut_ptr(). I'm in my third iteration and this is what I can say about them:

Using a helper trait to "inject" a function only to the non-nested field. It's how I added support for indexing. The main problem is that it doesn't work properly if we try to do so to a type that already has a method named the same. It can still be used with standard types like pointers because we can check if we do so (even though it may broke if a method with the same name is added in the future, which can cause issues with index and we may have to change it).
Using a trait to add a static method to the generated SoAs and the standard types. This avoids any name collision to the expense of having to generate a lot of traits implementations for every generated SoA vector (probably increases build times). Also there are times that it's not possible to use it. As an example as_ref on the base type (Particle).
(Just started trying it out) Generate every expression depending on the underlying. This makes the generated code straightforward, cleaner and should work everywhere. But doing that we get worse generating code because we can no longer simply use a single quote! (making some kind of helper function or macros may help with this problem).

If you have any thoughts or ideas, let me know. I'll probably add a comment after the third version is done or I find a roadblock.

Nov 28 '20 14:11 mangelats

I'm not quite sure I understand exactly how every one of your points work, could you add some basic example code?

The main problem is that it doesn't work properly if we try to do so to a type that already has a method named the same

If we are using a trait, don't inherent methods (the one defined on the type) take priority? And we can (and should) generate the code to always use the full path of the trait (soa_derive::IndexVec::index(&vec, i) instead of vec.index(i)). Or I am misunderstanding something here?

Nov 30 '20 09:11 Luthaf

Here is what the code looks like (simplified): 1. Use traits to add functionality to the primitive type

trait PtrExt {
    type PtrMut;
    fn as_mut_ptr(&self) -> Self::PtrMut;
}
impl<T> PtrExt for *const T {
    type PtrMut = *mut T;
    fn as_mut_ptr(&self) -> Self::PtrMut {
        self as *const T
    }
}
impl ParticlePtr {
    fn as_mut_ptr(&self) -> ParticleMutPtr {
        use PtrExt;
        ParticleMutPtr {
            name: self.name.as_mut_ptr(),
            mass: self.mass.as_mut_ptr(),
            nested: self.nested.as_mut_ptr(),
        }
    }
    // [...]
}

2. Use traits static methods

trait SoaPtr {
    type PtrMut;
    fn as_mut_ptr(&self) -> Self::PtrMut;
}
impl<T> SoaPtr for *const T {
    type PtrMut = *mut T;
    fn as_mut_ptr(&self) -> Self::PtrMut {
        self as *const T
    }
}
impl SoaPtr for ParticlePtr {
    type PtrMut = ParticleMutPtr;
    fn as_mut_ptr(&self) -> Self::PtrMut {
        self.as_mut_ptr()
    }
}
impl ParticlePtr {
    fn as_mut_ptr(&self) -> ParticleMutPtr {
        ParticleMutPtr {
            name: SoaPtr::as_mut_ptr(self.name),
            mass: SoaPtr::as_mut_ptr(self.mass),
            nested: SoaPtr::as_mut_ptr(self.nested),
        }
    }
    // [...]
}

3. Generating the proper expressions

impl ParticlePtr {
    fn as_mut_ptr(&self) -> ParticleMutPtr {
        ParticleMutPtr {
            name: self.name as *mut String,
            mass: self.mass as *mut f64,
            nested: self.nested.as_mut_ptr(),
        }
    }
    // [...]
}

Nov 30 '20 11:11 mangelats

In the first and second case I'm abusing traits to force overriding (in another language would simply be an overrided function for each type). Like you said, the first case may break with the addition of the wrong method into a type (would take preference). If we want to add a full path to the Trait we need every generated type to implement it. That's case 2.

Defining the correct lifetimes explicitly can be a nightmare when using traits, so generating the precise expression (case 3) is turning out to be the simplest and most reliable solution.

Nov 30 '20 11:11 mangelats

If we are using a trait, don't inherent methods (the one defined on the type) take priority? And we can (and should) generate the code to always use the full path of the trait (soa_derive::IndexVec::index(&vec, i) instead of vec.index(i)). Or I am misunderstanding something here?

Index is a bit different. We do not (and cannot) control the traits (Index and IndexMut). The only problem we may have is that they add index or index_mut to slice or Vec and make them do something different than the traits.

We could probably add the primitive types into our custom trait and then forward to the standard ones for the primitive type and to the custom implementation for generated ones. This would make the code even longer (take a look at index.rs, it's already massive). For now I'd keep it as it is, maybe add a comment explaining this in the code.

Nov 30 '20 12:11 mangelats

soa-derive soa-derive copied to clipboard

Allow to nest structs of arrays

Basic idea

Advantages

How would it work

soa-derive
soa-derive copied to clipboard