stdlib
[RFC]: add `blas/base/zdotu`
Description
This RFC proposes adding the package `@stdlib/blas/base/zdotu` for computing the dot product of two complex double-precision vectors:
void c_zdotu( const int *N, const void *X, const int *strideX, const void *Y, const int *strideY);
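For reference, the unconjugated dot product that `zdotu` conventionally computes in BLAS can be written as follows (the 0-based indexing is only a notational choice here):

$$
\operatorname{zdotu}(N, x, y) = \sum_{i=0}^{N-1} x_i \, y_i
$$

Unlike `zdotc`, neither vector is conjugated. For example, with $x = (1+2i,\ 3+4i)$ and $y = (5+6i,\ 7+8i)$, the result is $(-7+16i) + (-11+52i) = -18+68i$.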
Related Issues
Related Issue: Tracker
Questions
No.
Other
No.
Checklist
- [X] I have read and understood the Code of Conduct.
- [X] Searched for existing issues and pull requests.
- [X] The issue name begins with `RFC:`.
Hi, I'd like to work on this issue!
@performant23 That'd be great! Thanks for volunteering to work on this!
Perfect! Also, I had a few questions as I move on to the C implementation.
First, building on the previous PR review, we replaced `complex(kind=8)` with `complex(kind=kind(0.0d0))`. In the first case, we hard-code the kind value 8 (on most compilers, 8 bytes per component), while in the second case, `kind(0.0d0)` returns the kind of a double-precision real literal, and the compiler uses that kind for the complex type. I think this is mainly because different machines/compilers may define kinds and precisions differently. So, just to confirm, we'd be declaring array arguments this way in all our complex double-precision routines, right? E.g.
(A). Array Arguments:
complex(kind=kind(0.0d0)), intent(in) :: zx(*), zy(*)
Also importantly, what happens to the routines with return values that are complex double-precision for JS, C, and Fortran?
E.g.
(B). Function signatures:
complex(kind=kind(0.0d0)) function zdotu( N, zx, strideX, zy, strideY )
.
.
.
So far, I haven't found any double-precision complex implementations for JS (including blasjs). CBLAS also doesn't help much, since it uses a wrapper to call the Fortran implementation, so I'm not sure how we'd proceed with actually implementing this in C.
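As a rough illustration of what a pure C version could look like, here is a minimal sketch that reads both vectors as interleaved real and imaginary doubles. It is not the stdlib implementation: the name `zdotu_sketch`, the by-value integer arguments, and the `dot` output pointer are all assumptions made for this example.

```c
/*
* Hypothetical sketch, not the stdlib implementation: the name `zdotu_sketch`,
* the by-value integer arguments, and the `dot` output pointer are assumptions.
*/
static void zdotu_sketch( const int N, const void *X, const int strideX, const void *Y, const int strideY, void *dot ) {
	const double *x = (const double *)X; // interleaved: re0, im0, re1, im1, ...
	const double *y = (const double *)Y;
	double *out = (double *)dot;
	double re = 0.0;
	double im = 0.0;
	int ix;
	int iy;
	int i;

	// As in reference BLAS, a negative stride starts from the far end of the vector:
	ix = ( strideX < 0 ) ? ( 1 - N ) * strideX : 0;
	iy = ( strideY < 0 ) ? ( 1 - N ) * strideY : 0;

	for ( i = 0; i < N; i++ ) {
		// Strides count complex elements, so each element spans two doubles:
		const double xr = x[ 2*ix ];
		const double xi = x[ (2*ix) + 1 ];
		const double yr = y[ 2*iy ];
		const double yi = y[ (2*iy) + 1 ];

		// (xr + i*xi) * (yr + i*yi), with no conjugation:
		re += ( xr*yr ) - ( xi*yi );
		im += ( xr*yi ) + ( xi*yr );

		ix += strideX;
		iy += strideY;
	}
	// Write the result as a (real, imaginary) pair:
	out[ 0 ] = re;
	out[ 1 ] = im;
}
```

Writing the result through a pointer to two doubles is one way to avoid returning a complex value by value across language boundaries; whether the package should do this is a design decision for the maintainers.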
(C). Second, we're returning a double-precision complex number, but Node.js doesn't seem to have built-in support for complex numbers.
So, for these cases, would we use something like `napi_create_object` (in `addon.c`)?
For example:
// Extract the real and imaginary components of the result:
double real_part = stdlib_real(dot);
double imag_part = stdlib_imag(dot);
napi_value real, imag, result;
// Convert each component to a JS number:
status = napi_create_double(env, real_part, &real);
assert(status == napi_ok);
status = napi_create_double(env, imag_part, &imag);
assert(status == napi_ok);
// Create a plain JS object and attach both components as properties:
status = napi_create_object(env, &result);
assert(status == napi_ok);
status = napi_set_named_property(env, result, "real", real);
assert(status == napi_ok);
status = napi_set_named_property(env, result, "imag", imag);
assert(status == napi_ok);
return result;
where `dot` is the result we get from calling `c_zdotu`.
Hi @kgryte, hope you had a restful weekend! I was wondering if you have any comments on my questions above. I'm currently blocked on them, so your clarification would let me continue development.
Thanks!
C) Returning an object with real and imaginary components, yes, that should work, and it's what we do for unary math functions. See https://github.com/stdlib-js/stdlib/blob/1fb4994e369f396c81b96787e89cb379c015ab29/lib/node_modules/%40stdlib/math/base/napi/unary/src/main.c#L209. Note, however, you should follow our `re` and `im` property name conventions.
B) That looks fine.
A) I went with `kind(0.0d0)`, as specifying `8` is not necessarily portable across compilers or in older Fortran versions. `kind(0.0d0)` queries the kind of a double-precision real, which on most compilers equals its byte size.
Thanks, @kgryte! This is really helpful for adding this and all subsequent L1 `z*` packages which return double-precision complex values!
@performant23 FYI: for returning a complex number-like object, I added macro support to simplify JS value creation: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/napi/create-complex-like. You'll need to merge in the latest `develop` to use it.
You can see something similar at work for returning a JS double: https://github.com/stdlib-js/stdlib/commit/e96e8887ddeb1d83001c281c9946a89321da0d9b
This should hopefully eliminate the need for (C), as you can use the macro instead.