sphericart icon indicating copy to clipboard operation
sphericart copied to clipboard

Add Julia bindings

Open Luthaf opened this issue 2 years ago • 18 comments

This will require multiple steps:

  • [ ] Register the C package with Yggdrasil, creating a Sphericart_jll package. This will require making sure we can cross-compile the C++ code from Linux to all supported architectures and OS;
  • [ ] Write the pure Julia bindings, re-exposing the functions from Sphericart_jll with a nicer Julia API

Luthaf avatar Apr 26 '23 09:04 Luthaf

I have a prototype implementation, which - on small basis sizes - is about a factor 2-3 faster than my previous one based on spherical coords (for large bases they are about the same) and will make this public soon once I also have the rrules done.

One could argue having the pure julia implementation here in this repo instead of a separate Julia repo. I don't particularly mind either way. We will most likely want to maintain our own implementation with the interfaces we need but if that can be made compatible with what is here then all the better.

For sure I wouldn't mind getting input on performance optimizations from people who actually understand this rather than just trial-and-error.

I also don't plan on a GPU implementation immediately so that would be another thing where I'd be interested in following somebody else.

cortner avatar Oct 26 '23 05:10 cortner

An interesting complication is that a @generated julia implementation is really extremely fast, but then I can't exploit batched evaluation via AVX or SIMD. So I'm planning to provide two codes, one that is @generated for a single input and outputs an SVector and a second one that takes the loop of inputs inside and SIMD or AVXs it (to be tested which is faster on which hardware, I'm only working with M1 right now ...)

cortner avatar Oct 26 '23 05:10 cortner

If you can / are willing to "harmonize" the API, I think it'd be interesting to have it in the same repo. The idea here is to re-use as much code as possible (so that later on, if there's a need) one can get crazy with optimizations, and/or provide more hardware-specific back-ends, but I can see the argument for being somehow less tightly bound for Julia.

ceriottm avatar Oct 26 '23 05:10 ceriottm

OkSo the procedure would be have a Julia package live in a subfolder I think that will then be registered with the Julia General registry. I haven't done this before but maybe it's obvious and if not maybe @Luthaf can help if he is interested in having this?

One other caveat is that the Julia code should probably be MIT licensed.

Regarding different hardware -- In Julia that would probably be achievable via KernelAbstractions.jl later.

cortner avatar Oct 26 '23 06:10 cortner

Yeah let's hear from @Luthaf and @frostedoyster . I think it'd be really good to consolidate the infrastructure we all use, and this is an easy starting point. When I was mentioning "harmonizing API" I meant just making sure function/class arguments and defaults are consistent, so users don't get too confused.

We have an ongoing offline discussion as to whether we should default to solid harmonics or normalize-by-default.

ceriottm avatar Oct 26 '23 06:10 ceriottm

We have an ongoing offline discussion as to whether we should default to solid harmonics or normalize-by-default

I guess for large maxL it won't matter anyhow but for small maxL it might have an impact on performance.

For modelling I would want to provide both options - it's just and easy extra transformation layer. For performance optimization if you already have a splined Rnl basis then it seems pretty obvious?

cortner avatar Oct 26 '23 06:10 cortner

So the procedure would be have a Julia package live in a subfolder I think that will then be registered with the Julia General registry.

This should be fine, the web interface in https://juliahub.com/ui/Packages can take a directory for packages that live in sub-directories. I'm not sure if the @JuliaRegistrator bot can also do this, but there should be a way to register the package.

One other caveat is that the Julia code should probably be MIT licensed.

Sure, we could also dual license the whole thing Apache/MIT (I know this is popular in the Rust ecosystem, but I don't remember why, I'll check).

One could argue having the pure julia implementation here in this repo instead of a separate Julia repo.

Especially if you already have a prototype, I'll be happy with also having a pure Julia implementation. If we also do bindings to the native implementation, we can have something like this in the long term

sph = spherical_harmonics(xyz; backend=:Julia) # pure julia, support autodiff & multiple dispatch


sph = spherical_harmonics(xyz; backend=:Native)  # native code, potentially faster, support fewer features

Luthaf avatar Oct 26 '23 12:10 Luthaf

I guess for large maxL it won't matter anyhow but for small maxL it might have an impact on performance.

For modelling I would want to provide both options - it's just and easy extra transformation layer. For performance optimization if you already have a splined Rnl basis then it seems pretty obvious?

I agree entirely that for performance there's no question it should be the solid harmonics, and that it's no biggie to provide both options. the discussion was about the default. I feel quite strongly that the splid harmonics are actually the more natural things and it's ok to push for the better option as default, but @frostedoyster has a (good) argument that probably most users will expect the normalized version. Anyway, for the moment it should def be default normalize=False for consistency - I was just curious to hear your opinion.

ceriottm avatar Oct 27 '23 05:10 ceriottm

I feel quite strongly that the splid harmonics are actually the more natural things and it's ok to push for the better option as default

I'm coming around to that perspective as well.

but @frostedoyster has a (good) argument that probably most users will expect the normalized version.

In my Julia codes I always provide solid_harmonics and spherical_harmonics (named a bit differently but that's the gist). I'm not sure why you wouldn't want to just supply both? Let people choose which one to use?

EDIT: I looked at your Python docs again. Why not create two classes, one called SphericalHarmonics the other SolidHarmonics - that would make it crystal clear. The keyword normalized=true is actually very confusing. It could be an option to normalize the basis in different ways (there are different conventions in different fields...)

cortner avatar Oct 27 '23 05:10 cortner

I didn't yet test on the same CPU so maybe premature, but I think the Julia codes might be in a similar ballpark as yours. I am guessing I'm somewhere in-between your general-purpose and hard-coded. This is on a 2-year old M1. I tried to run the benchmarks as described on your readme but got no useful output, only red ...

Generated: (this is just basic meta-programming, no computer-algebra optimized thing)

[ Info: static_solid_harmonics
L = 1
  2.208 ns (0 allocations: 0 bytes)
L = 2
  4.708 ns (0 allocations: 0 bytes)
L = 3
  7.925 ns (0 allocations: 0 bytes)
L = 4
  11.970 ns (0 allocations: 0 bytes)
L = 5
  34.073 ns (0 allocations: 0 bytes)
L = 6
  40.575 ns (0 allocations: 0 bytes)

And the dynamic but batched implementation: (vectorized, but no multi-threading yet)

[ Info: batched evaluation vs broadcast
[ Info: nX = 32 (try a nice number)
L = 3
    single:   7.758 ns (0 allocations: 0 bytes)
broadcast!:   248.021 ns (0 allocations: 0 bytes)
   batched:   396.455 ns (0 allocations: 0 bytes)
L = 6
    single:   39.060 ns (0 allocations: 0 bytes)
broadcast!:   887.146 ns (0 allocations: 0 bytes)
   batched:   823.037 ns (0 allocations: 0 bytes)
L = 9
    single:   67.092 ns (0 allocations: 0 bytes)
broadcast!:   1.692 μs (0 allocations: 0 bytes)
   batched:   1.479 μs (0 allocations: 0 bytes)
L = 12
    single:   82.467 ns (0 allocations: 0 bytes)
broadcast!:   2.708 μs (0 allocations: 0 bytes)
   batched:   2.329 μs (0 allocations: 0 bytes)
L = 15
    single:   149.455 ns (0 allocations: 0 bytes)
broadcast!:   4.786 μs (0 allocations: 0 bytes)
   batched:   3.615 μs (0 allocations: 0 bytes)

TBH - I wouldn't mind feedback if anybody in your group is willing to look at this. I could make WIP-PR before I have the gradients finished.

cortner avatar Oct 27 '23 05:10 cortner

A WIP PR sounds like a good idea! I don't have a lot of time to look at it over the next 2 weeks, but @frostedoyster can also give the code a look!

Luthaf avatar Oct 27 '23 12:10 Luthaf

I don't have much experience with Julia, but I'd be happy to take a look

frostedoyster avatar Oct 27 '23 12:10 frostedoyster

Thanks - I'll prepare the PR. I propose to call the subfolder SpheriCart.jl which is the Julia naming convention since SpheriCart will be the name of the package in the Julia package registry.

If you have any concerns about it let me know.

cortner avatar Oct 27 '23 15:10 cortner

Perfect! I'm looking forward to the PR. Thank you also for the feedback on normalized. That's something we took from e3nn, but perhaps it isn't the best long-term solution. What are your feelings @ceriottm @Luthaf?

frostedoyster avatar Oct 27 '23 16:10 frostedoyster

Hm - maybe one more question. So far I haven't looked at your code at all to avoid license problems. Going forward, if we were to think about optimizing maybe this would become useful. This is especially true to extend the Julia codes to GPU. Does Apache-2 allow me to produce a derived product and license it under MIT? If not, then I would need your explicit permission to do so.

cortner avatar Oct 27 '23 20:10 cortner

Re license I don't mind dual license. Also better to do it now than after we have a bunch of contributors we can't track.

Re the normalization, I suggest that for the moment we adapt the Julia API to what we have, and open an issue to refactor. As @frostedoyster says, that's a lot of refactoring and re documenting so maybe not top priority.

ceriottm avatar Oct 27 '23 23:10 ceriottm

I suggest that for the moment we adapt the Julia API to what we have

Initially I'm just going to do SolidHarmonics. I'll add the wrapper for SphericalHarmonics asap. I'm not convinced the two APIs need to be identical. The two languages are too different.

cortner avatar Oct 28 '23 05:10 cortner

Re license I don't mind dual license. Also better to do it now than after we have a bunch of contributors we can't track.

My question was more specific : I thought I'd need explicit permission from the copyright holder if I look at your code, "transpile" it and release as new code under a different license. But it seems Apache-2 is sufficiently permissive that it doesn't matter:

The Apache 2.0 License is permissive. It allows you to use, modify, and distribute the licensed software, including creating derivative works, without requiring those derivative works to be licensed under the same terms. You can release the modified parts of the code under any license you prefer

But let me know if I've misunderstood it.

cortner avatar Oct 28 '23 05:10 cortner

@Luthaf -- unless you still plan to provide sphericart also via binaries, I think this can now be closed?

cortner avatar May 01 '25 02:05 cortner

yes, this can be closed for now! The only reason to do this someday would be if the GPU performance of hand-written kernel is a lot better than what the pure Julia code can do, but this would require extensive benchmarking first!

Luthaf avatar May 01 '25 07:05 Luthaf