
feat: add `blas/base/cscal`

Open aman-095 opened this issue 1 year ago • 12 comments

Description

What is the purpose of this pull request?

This RFC proposes adding a routine that scales the values of a complex single-precision floating-point vector by a complex single-precision floating-point constant, as defined in the BLAS Level 1 routines. Specifically, it proposes adding @stdlib/blas/base/cscal.

Related Issues

Does this pull request have any related issues? None.

Questions

Any questions for reviewers of this pull request?

No.

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.

Checklist

Please ensure the following tasks are completed before submitting this pull request.


@stdlib-js/reviewers

aman-095 avatar Apr 01 '24 06:04 aman-095

@kgryte @Pranavchiku The benchmarks for this routine are failing, and I don't know why NaN is appearing in the output. Could you please review it?

aman-095 avatar Apr 01 '24 06:04 aman-095

Sure @kgryte, I will work on this and resolve the issues.

aman-095 avatar Apr 01 '24 18:04 aman-095

Thanks for the updates. Feel free to add the C and Fortran implementations.

kgryte avatar Apr 05 '24 06:04 kgryte

@kgryte, in order to add the C implementation and match it with the JS implementation: we have C implementations for the cmulf function and for Complex64, but not for a Complex64 array. How should I proceed? Should I make changes to the C implementation? I was thinking of taking the input as a float array and then constructing a Complex64 value for each pair using stdlib_complex64_t cx.

aman-095 avatar Apr 06 '24 09:04 aman-095

You should be able to interpret every two elements in the float array as complex. You just need to increment your loop pointer by two elements, rather than one. Something along the lines of

stdlib_complex64_t z;
int64_t i; // <= loop counter

uint8_t *ip1 = (uint8_t *)cx; // <= define a pointer and reinterpret as a sequence of bytes
int64_t is1 = 8 * strideX; // <= define the pointer stride (8 bytes per complex float32)

for ( i = 0; i < N; i++, ip1 += is1 ) { // <= increment the pointer according to the byte stride
	z = *(stdlib_complex64_t *)ip1; // <= deref pointer and interpret as complex64
	*(stdlib_complex64_t *)ip1 = stdlib_base_cmulf( ca, z ); // <= compute result and assign back to pointer
}

kgryte avatar Apr 06 '24 10:04 kgryte

This is similar to what we've done in our vectorized loops elsewhere. E.g., https://github.com/stdlib-js/stdlib/blob/develop/lib/node_modules/%40stdlib/strided/base/binary/include/stdlib/strided/base/binary/macros.h

kgryte avatar Apr 06 '24 10:04 kgryte

Note that, in the above code sample, I interpreted the array as a sequence of bytes. You could also interpret it as a sequence of floats and then adjust your stride accordingly (two floats per complex element).
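
As a rough illustration of the float-based alternative, here is a minimal self-contained sketch. The function name `cscal_floats` and the choice to pass the constant as separate real and imaginary parts are assumptions made for this example; they are not taken from the PR or from stdlib. The sketch also assumes a non-negative stride and inlines the complex multiplication instead of calling `stdlib_base_cmulf`.

```c
#include <assert.h>
#include <stdint.h>

/*
* Hypothetical helper (not part of stdlib): scales N complex values stored as
* interleaved (re, im) float pairs by the complex constant ca_re + ca_im*i.
*
* strideX is expressed in complex elements, so the pointer advances by
* 2*strideX floats per iteration. Assumes strideX >= 0.
*/
static void cscal_floats( const int64_t N, const float ca_re, const float ca_im, float *cx, const int64_t strideX ) {
	int64_t is1 = 2 * strideX; // pointer stride in floats (2 floats per complex float32)
	float *ip1 = cx;           // view the array as a sequence of floats
	float re;
	float im;
	int64_t i;

	assert( strideX >= 0 );    // negative strides are out of scope for this sketch

	for ( i = 0; i < N; i++, ip1 += is1 ) {
		re = ip1[ 0 ];
		im = ip1[ 1 ];

		// (ca_re + ca_im*i) * (re + im*i)
		ip1[ 0 ] = ( ca_re * re ) - ( ca_im * im );
		ip1[ 1 ] = ( ca_re * im ) + ( ca_im * re );
	}
}
```

For example, calling `cscal_floats( 2, 2.0f, 1.0f, x, 2 )` on `x = { 1, 2, 3, 4, 5, 6 }` scales the first and third complex elements by `2 + 1i` and leaves the second untouched, giving `{ 0, 5, 3, 4, 4, 17 }`.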

kgryte avatar Apr 06 '24 10:04 kgryte

@kgryte, in the addon.c file, we need an N-API helper for complex64 (a single-precision complex floating-point constant), but it does not exist yet. Should I take the constant as a Float32 array input instead?

aman-095 avatar Apr 07 '24 10:04 aman-095

@aman-095 Yes, we needed this PR to get over the finish line for passing down a complex number: https://github.com/stdlib-js/stdlib/pull/1760

In the interim, you can see https://github.com/stdlib-js/stdlib/blob/4b7bda7cf10fd6dfdd4246152a120da2860893c7/lib/node_modules/%40stdlib/math/base/napi/unary/src/main.c#L330 for an example of how to unwrap a complex number object passed down from JS to C.

kgryte avatar Apr 07 '24 10:04 kgryte

So, until the above PR is merged, I will look at the link you shared and, after that, maybe start working on something else. CC @kgryte

aman-095 avatar Apr 07 '24 10:04 aman-095

Yeah, you don't need to wait for that PR to be merged to continue with this PR. You can manually handle the complex number object when it's passed through to the addon, rather than use a macro.

kgryte avatar Apr 07 '24 10:04 kgryte

@aman-095 @stdlib/napi/argv-complex64 has now been added, so you can use the STDLIB_NAPI_ARGV_COMPLEX64 macro to resolve a single-precision complex floating-point number. You'll just need to merge in the latest develop to this branch.

kgryte avatar Apr 17 '24 06:04 kgryte