
Missing useful functions

Open · edubart opened this issue 6 years ago • 9 comments

I will list here some functions that I miss in the API, and I will keep updating this as I need them. To start, here are some useful functions that I will potentially need; their naming is open for discussion, as is whether they should be in the core. A short usage sketch follows each list below for illustration.

I am going to mark really important ones for now in bold.

General:

  • [x] at(tensor, i, j, ...) => access subdimensional tensors #55
  • [x] squeeze(tensor, [dim]) => remove dimension from tensor
  • [x] unsqueeze(tensor, dim) => add dimension to tensor
  • [x] size(tensor) => return number of elements in the tensor
  • [x] copy(tensor) => copy the contents of a tensor; this is different from assignment (e.g. we may want to copy into a view), and the number of elements in both tensors must match
  • [x] flatten(tensor) => convert a tensor to a vector
  • [x] fill(tensor, value) => fill tensor elements with the given value
  • [x] save(tensor, filename) => save tensor to file
  • [x] load(filename) => load tensor from file
  • [x] stack(tensors) => combine an array of tensors into a tensor with one extra rank
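
A minimal usage sketch of a few of the items above (proc names follow this checklist; exact signatures are assumed):

```nim
import arraymancer

# Sketch only: squeeze/unsqueeze/size as named in the checklist above.
let t = @[1.0, 2.0, 3.0, 4.0, 5.0, 6.0].toTensor.reshape(2, 3)  # shape [2, 3]
let u = t.unsqueeze(0)   # add a leading dimension   -> shape [1, 2, 3]
let s = u.squeeze(0)     # remove that dimension     -> shape [2, 3]
echo t.size              # total number of elements  -> 6
```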

No copies operations:

  • [x] unsafeToTensor(seq) => convert a seq to a tensor without copying, useful when loading without seq duplication
  • [x] unsafeToTensorReshape(seq) => same as above, but does reshape
  • [x] unsafeAt(tensor, i, j, ...) => like at, but no copies
  • [x] unsafeSqueeze(tensor, [dim]) => remove dimension from tensor, but no copies
  • [x] unsafeUnsqueeze(tensor, dim) => add dimension to tensor, but no copies
  • [x] unsafeTranspose => transpose with no copies when possible
  • [x] unsafePermute => permute with no copies when possible
  • [x] unsafeFlatten => convert a tensor to a vector with no copies when possible
  • [x] unsafeBroadcast => broadcasting with no copies
  • [x] unsafeAsContiguous => like asContiguous, but no copy when the tensor is already contiguous
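
Illustration only, using the names from this checklist: the unsafe* variants return views over the same buffer instead of copying (signatures assumed):

```nim
import arraymancer

# Sketch: no-copy views sharing t's storage (proc names from the checklist above).
let t = @[1.0, 2.0, 3.0, 4.0].toTensor.reshape(2, 2)
let viewT = t.unsafeTranspose()  # metadata-only transpose, shares t's data
let flat  = t.unsafeFlatten()    # rank-1 view over the same 4 elements
```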

Simple Math:

  • [x] randomNormalTensor(tensor) => returns a random tensor in the standard normal distribution
  • [x] abs(tensor) => returns abs on all elements
  • [ ] sum(tensor, [axes]) => sum over many axes
  • [x] min(tensor, [axis]) => returns minimum in the given axis
  • [x] max(tensor, [axis]) => returns maximum in the given axis
  • [x] std(tensor, [axis]) => returns standard deviation in the given axis
  • [x] var(tensor, [axis]) => returns variance in the given axis
  • [ ] prod(tensor, [axis]) => returns product of elements in the given axis
  • [ ] norm(tensor, axis) => returns the 2-norm of a 2d tensor in the given axis
  • [ ] pnorm(tensor, axis, p) => returns the p-norm of a 2d tensor in the given axis
  • [x] pow(tensor, v)
  • [x] square => pow with exponent 2; is it faster to just do x*x instead of pow(x, 2.0f)?
  • [x] clamp(tensor, a, b) => clamp values of tensor between a and b
  • [ ] mclamp(tensor, a, b) => like above, but in-place
  • [ ] msqrt, mln, msin, mround, ... all the common element-wise math functions but in-place
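
A minimal sketch of the reductions and element-wise ops above (sum along an axis, abs, max, clamp); names follow the checklist and signatures are assumed:

```nim
import arraymancer

let t = @[[1.0, -2.0, 3.0], [-4.0, 5.0, -6.0]].toTensor  # shape [2, 3]
echo t.sum               # scalar sum of all elements -> -3.0
echo t.sum(0)            # sum along axis 0, shape [1, 3]
echo t.abs.max           # largest magnitude -> 6.0
echo t.clamp(-1.0, 1.0)  # every element limited to the range [-1, 1]
```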

Linear algebra:

  • [ ] eye(n, [m]) => returns NxM identity matrix, a must for doing linear algebra in general
  • [ ] batchMatmul(tensor, tensor) => batch matrix multiplication for tensors with rank >= 3, useful for doing batches of convolution for example
  • [ ] inverse(tensor) => inverse of a matrix, useful for doing closed form of linear regression with few features for example
  • [ ] svd(tensor) => singular value decomposition, useful for doing PCA (principal component analysis) and dimensionality reduction of features for example
  • [ ] eig(tensor) => compute eigenvalues and eigenvectors of a square matrix, also useful for dimensionality reduction
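
Roughly the call shapes I have in mind for these (an illustrative sketch only, not an existing API):

```nim
import arraymancer

# Wished-for usage; names as in the list above, signatures are just what I'd expect.
let a = @[[2.0, 0.0], [0.0, 3.0]].toTensor
let id = eye[float](2, 2)    # 2x2 identity matrix
let (u, s, vh) = svd(a)      # singular value decomposition
echo s                       # singular values of a -> [3, 2]
```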

edubart avatar Sep 21 '17 20:09 edubart

einsum would be very nice too for concise, ergonomic and less error-prone aggregations.

szabi avatar Oct 17 '17 13:10 szabi

Einsum will be there at some point; I talked about it in my design document here. Not sure when, because it's quite a difficult function to implement well and optimize.

Probably at first the only backend that will have it will be Cuda, thanks to the gemmStridedBatched function, which makes tensor contractions easier to implement.
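
For intuition, a batched matmul/contraction is essentially a loop of plain 2D GEMMs, which gemmStridedBatched fuses into a single call. A naive CPU-side sketch (illustrative only, not Arraymancer's actual API):

```nim
import arraymancer

proc naiveBatchMatmul[T: SomeFloat](a, b: Tensor[T]): Tensor[T] =
  ## a: [batch, n, k], b: [batch, k, m] -> [batch, n, m]
  ## One ordinary 2D matmul (`*`) per batch entry, stacked back together.
  var slices = newSeq[Tensor[T]]()
  for i in 0 ..< a.shape[0]:
    slices.add a[i..i, _, _].squeeze(0) * b[i..i, _, _].squeeze(0)
  result = stack(slices, axis = 0)
```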

mratsim avatar Oct 17 '17 17:10 mratsim

But can't the (algorithmic) implementation (i.e. implementing it "well and optimize[d]") be mostly lifted over from numpy or tensorflow? I'd assume they put some thought into that already?

(No, I did not look at the actual code of einsum over there).

szabi avatar Oct 22 '17 10:10 szabi

Let's move the conversation about einsum to #124. It's a function which is much more complex than the others listed here.

mratsim avatar Oct 22 '17 23:10 mratsim

just for another 0.02 of a data-point... having PCA available would open up some use-cases for Arraymancer for my use.

brentp avatar Feb 09 '18 16:02 brentp

Yeah I will definitely add PCA and SVD.

mratsim avatar Feb 09 '18 17:02 mratsim

@brentp PCA is done! It requires LAPACK on the machine; you can see example usage here:

https://github.com/mratsim/Arraymancer/blob/64d014900cf16f92e8945509bbe881373bbd5f31/tests/ml/test_dimensionality_reduction.nim#L8-L49

And "documentation" here: https://github.com/mratsim/Arraymancer/blob/64d014900cf16f92e8945509bbe881373bbd5f31/src/ml/dimensionality_reduction/pca.nim#L9-L16

mratsim avatar May 03 '18 20:05 mratsim

this is great! I will definitely be making use of this. thanks for your work on this project.

brentp avatar May 03 '18 20:05 brentp

Many of the items that were missing are now available: svd, pinv (i.e. pseudo-inverse), product along an axis, mclamp, eye (and several related functions such as identity and diag) and symeig (to calculate eigenvalues and eigenvectors of symmetric matrices).
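
For example, a rough sketch of a couple of those helpers (exact signatures assumed):

```nim
import arraymancer

let id = identity[float](3)               # 3x3 identity matrix
let d  = diag(@[1.0, 2.0, 3.0].toTensor)  # diagonal matrix built from a vector
echo d * id                               # matrix product, still diag(1, 2, 3)
```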

I personally miss (1D) convolution (which might perhaps be taken from the cnn code?), some other basic signal processing functions (filtering and FFT, for example), and a few numpy convenience functions such as resize, insert, append, delete, roll and friends.

AngelEzquerra avatar Dec 29 '23 13:12 AngelEzquerra