
design direction: "long" vector operations?

Open aardappel opened this issue 3 years ago • 7 comments

Lobster has a lot of support for built-in 2/3/4D vector operations, both at the language level with all the operators (+ etc), and with a lot of built-in functions (dot() etc). You should be able to write any function you can write with scalars just as easily with these vectors.

These vectors have intentionally been limited to 4 elements, because that allows us to make them really fast (they're by-value, inline structs), and it covers all typical uses of vectors used in games/graphics.

If you want to use longer vectors, you can make bigger structs or use Lobster's variable length vectors, but there are no operations to support them, so you'd have to write your own loops in Lobster, which is both slow and cumbersome. Should Lobster support longer vectors more directly, and if so, what form should that take?

We can't simply scale up our current fixed struct support, especially since many uses of longer vectors do not have compile-time-constant sizes. These operations should probably work on vectors of arbitrary length, at least.

We could extend all our current vector ops to include overloads for variable size vectors of int and float. That would be nice, but it is a LOT of new code, so we had better be sure it is useful.

Who would use these longer vectors? People interested in more scientific, simulation-style algorithms. Machine learning, maybe. Generally, it is a way to write algorithms in a different style: rather than a loop of scalar operations over a vector, a sequence of whole-vector operations.

Some open questions I can think of (feel free to add your own):

  • Variable sized vectors are dynamically allocated, so should operations preferably work in place (+= style instead of +)? Should we support both (yet more code explosion)? If only +=, is needing to explicitly copy vectors going to get annoying? Or should we do it automatically (overwrite if refc == 1).. though this may need language lifetime analysis support.
  • Should this extend to the language's existing operators? That would be nicest, but it certainly makes this an even heavier feature to support. If not, we'd need add() builtins for long vectors, etc.
  • If we're going to cater to scientific/ML style uses, shouldn't we have a more "numpy" style interface, giving users wider flexibility in terms of the number of dimensions and the sizes of elements? For example, ML may prefer 16-bit floats over Lobster's 64-bit floats as elements. On the other hand, something like numpy is even more enormous in terms of code size, and even harder to make consistently fast. You could argue that Lobster should instead focus on providing the few 1D vector ops that multi-dimensional numpy-style functions can be built on top of in Lobster user libraries, while still being fast. For example, a long vector version of dot() is enough to build a fast large matrix multiply (assuming one of the matrices is already transposed!). The advantage of this approach is that you end up with types the language directly understands ([[float]] instead of an opaque numpy object); the disadvantage is that you won't get 16-bit floats until Lobster supports those at the language level.
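To illustrate the last point (a Python sketch, not Lobster code): given only a fast 1D dot() primitive, a full matrix multiply reduces to row-by-row inner products, provided the right-hand matrix is stored transposed so each dot runs over contiguous data. The function names here are illustrative, not existing builtins.

```python
# Sketch: building matrix multiply out of a 1D dot() primitive.
# Bt is B stored transposed, so every inner product is a row-times-row
# dot over contiguous data, which a fast built-in dot() could handle.

def dot(a, b):
    # Stand-in for a hypothetical fast built-in long-vector dot().
    return sum(x * y for x, y in zip(a, b))

def matmul(A, Bt):
    # A is m x k (list of rows), Bt is n x k (B transposed, list of rows).
    return [[dot(row, col) for col in Bt] for row in A]

A  = [[1.0, 2.0],
      [3.0, 4.0]]
Bt = [[5.0, 7.0],   # B = [[5, 6], [7, 8]] stored transposed
      [6.0, 8.0]]

print(matmul(A, Bt))  # → [[19.0, 22.0], [43.0, 50.0]]
```

Everything outside dot() is plain nested lists ([[float]] in Lobster terms), so only the innermost loop needs to be a fast builtin.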

Can you think of other considerations? Other use cases that are important? Performance considerations?

Oh yes, there's the elephant in the room.. shouldn't such computations run on a GPU/TPU/.. anyway? Yes, probably; if you must have absolutely the most ops/watt, you wouldn't be writing this in any CPU-targeting language. Maybe one day Lobster can run (partially) on the GPU? That is not a problem we're trying to solve today, though. Assume you have a reason to like using Lobster, and want to write some higher-dimension vector code, for simulation or otherwise.

A further thing you could argue is that all of this functionality should just be written directly in Lobster, and that we should instead be working towards making Lobster as fast as C++. That is certainly a long-term goal, but it will take a while, and it is hard to be exactly as fast as C++. Also, we'd like to keep supporting running in an interpreter (or simple JIT) for as long as possible during development, which really benefits from having inner-most loop functions in C++ so they are not too slow.

aardappel avatar Jul 28 '20 16:07 aardappel

Uses for long vectors.. arrays of matrices (4x4, 3x3, 4x3) is what comes to mind.

bartwe avatar Jul 28 '20 18:07 bartwe

Ideally we could use the whole set of operators on numeric vectors, but I have no idea how much of a 'code explosion' and extra effort this would mean. If I can't have all operators, then the /= *= += -= %= etc. subset would still be very useful, especially if I could also do if any(x > 3): with x being [float]. However, here again you'd allocate a new vector for x > 3, something that you'd like to avoid. Alternatively, a small set of array functions such as dot(), sum(), add(), mult(), .. could be a start.
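The allocation concern above can be made concrete with a Python sketch (hypothetical names, illustration only): an operator-style x > 3 materializes a whole temporary vector before reducing it, while a fused reduction builtin would short-circuit and allocate nothing.

```python
# Sketch of the temporary-vector concern (hypothetical function names).

def gt_vec(x, s):
    # Operator-style: x > s materializes a whole temporary vector.
    return [e > s for e in x]

def any_greater(x, s):
    # Fused builtin-style: short-circuits, allocates no temporary.
    return any(e > s for e in x)

x = [1.0, 2.5, 9.0]
print(any(gt_vec(x, 3)))   # temporary list, then reduce → True
print(any_greater(x, 3))   # same result, no temporary → True
```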

stefandd avatar Jul 28 '20 18:07 stefandd

@bartwe those are exactly the kind of uses that this would not address, i.e. more fixed size game/graphics oriented types.

But you're right, so far we have avoided having those types in the language at all, since the graphics API emulates a transform stack, and keeps all actual matrices internal. Any matrix math beyond that tends to happen in shaders.

Would be nice to at some point have these types, but hopefully not built-in to the language.

aardappel avatar Jul 28 '20 19:07 aardappel

@stefandd as for a code explosion, we currently already have several hundred VM ops dealing just with these fixed int/float vectors. It's int vs float, vector vs scalar, lvalue vs rvalue, different kinds of rvalues, etc.

Then look at the amount of builtins that take vector arguments.

To do this to where it is as well supported as the current types would indeed be quite a code explosion.

We could start with a small set of functions, but then anyone wanting to use them immediately runs into limitations, and has to write manual loops for a lot of things themselves.

aardappel avatar Jul 28 '20 19:07 aardappel

@aardappel I agree that the main goal -- attaining more speed -- is definitely more important, and eventually Lobster will be able to implement matrix operations etc. directly in Lobster. In LuaJIT, naively written matrix multiplications are as fast as NumPy (the non-Intel-MKL version) once the FFI interface is used, and maybe some day Lobster will get into similar territory.

stefandd avatar Jul 29 '20 10:07 stefandd

I'm skeptical about the need for it, to be honest. For a use case like ML, I'd probably explore implementing the computationally heavy stuff (using mature libs) on the C++ side, and registering those native functions for lobster. I haven't actually tried it since I don't have a need, but looking at the docs it seems straightforward.

arvyy avatar Jul 29 '20 20:07 arvyy

@arvyy yes, I can see that.. maybe it's an idea that needs to simmer a bit longer :)

aardappel avatar Jul 29 '20 20:07 aardappel

Since this discussion:

  • Fixed vector support for 2..4 elements was extended to work on 1..N elements, but still based on fixed-size structs of a single element type.
  • There are some specialized 4x4 matrix operations that work on a [float] of 16 elements currently, see matrix_ ops.
  • Extending built-in vector operators generally to dynamically sized vectors is not planned, this is likely to be a bit too high-level to be directly part of the language. Instead, built-in functions that operate on [float] like the existing matrix ones could be extended with support for e.g. addition of arbitrary such vectors.
  • Given that Lobster now has operator overloading, it would be easy to wrap the above operations in a little struct that makes using them feel more like a built-in type, e.g.
     struct matrix:
        v:[float]
        def operator*(o:matrix):
            return matrix { matrix_multiply(v, o.v) }
    

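For reference, a sketch in Python of what a 4x4 multiply over flat 16-element vectors (the [float] representation mentioned above) computes. Row-major layout is an assumption here, and matmul4x4 is an illustrative stand-in, not Lobster's matrix_multiply implementation.

```python
# Sketch: a 4x4 multiply over flat 16-float lists, the representation the
# matrix_ builtins above operate on. Row-major layout is an assumption;
# this illustrates the operation, it is not Lobster's implementation.

def matmul4x4(a, b):
    out = [0.0] * 16
    for i in range(4):
        for j in range(4):
            out[i * 4 + j] = sum(a[i * 4 + k] * b[k * 4 + j] for k in range(4))
    return out

identity = [1.0 if i % 5 == 0 else 0.0 for i in range(16)]
m = [float(i) for i in range(16)]
print(matmul4x4(identity, m))  # identity leaves m unchanged
```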
aardappel avatar Apr 22 '23 15:04 aardappel