jsource
jsource copied to clipboard
Refactor `zeroioni`
In jsrc/verbs/monadic/tally.cpp:
// TODO: refactor me
[[nodiscard]] auto
refactorme_zeroionei(int64_t n) {
return reinterpret_cast<array>(Bnum + (n));
}
Just to save the next person some time: this code is currently in jsrc/array.hpp.
To add a bit of documentation: the current implementation (without the typecase) is:
Bnum + (n)
The definition of Bnum can be found in j.c:
I Bnum[22][8] =
{ // the numbers we keep at hand. 0 and 1 are B01, the rest INT; but the first 2 are integer forms of 0 and 1
CBAIVAL(INT, 0), CBAIVAL(INT, 1), CBAIVAL(INT, -10), CBAIVAL(INT, -9), CBAIVAL(INT, -8), CBAIVAL(INT, -7),
CBAIVAL(INT, -6), CBAIVAL(INT, -5), CBAIVAL(INT, -4), CBAIVAL(INT, -3), CBAIVAL(INT, -2), CBAIVAL(INT, -1),
CBAIVAL(B01, 0), CBAIVAL(B01, 1), CBAIVAL(INT, 2), CBAIVAL(INT, 3), CBAIVAL(INT, 4), CBAIVAL(INT, 5),
CBAIVAL(INT, 6), CBAIVAL(INT, 7), CBAIVAL(INT, 8), CBAIVAL(INT, 9)};
Related to this we have two constants in j.h:
#define NUMMAX 9 // largest number represented in num[]
#define NUMMIN (~NUMMAX) // smallest number represented in num[]
The value of NUMMIN is actually -10 (see Godbolt, but there must be a better way to visualize this). This matches the min/max values of Bnum.
This looks to me like these are some kind of static/singleton values: instead of creating a new A every time you need these number, one could just get a pointer to a static value.
Possible refactors I could think of:
- The parentheses around
nare copied from the original C macro, and are not needed here. Bnumis an array,Bnum + nis actually an array index, writing it likeBnum[n]makes that a bit cleaner- We might want to do a boundary check: there are 22 items, so
nshould never be bigger than21 - The index is based on the internal bit representation of J. To get the value of
2, you'd have to input14. - Do we want to combine this with
refactorme_num? Our caller written asreturn !zero_or_one(k) ? refactorme_num(k) : zeroionei(k), which means we useBnumfor0and1, but not for the other integers available inBnum. And we would have to do this at every call site. It might be easier to makerefactorme_numcheck if the input is in the min/max boundaries ofBnumand return these staticAs, or create a newAotherwise. - It kind of looks to me like
Bnumhas an incorrect type, and should actually beA[22]instead ofI[22][8]. This would get rid of the casting, but we would have to validate all users ofBnum.
After reading some more code: the definition of num and refactorme_num is:
#define num(n) ((A)(Bnum + 2 + (n)-NUMMIN))
[[nodiscard]] inline auto
refactorme_num(int64_t n) {
return reinterpret_cast<array>(Bnum + 2 + n - NUMMIN);
}
These two definitions behave the same, I'm going to use num from here, that's easier to type, but this would apply to our C++ function as well.
Our caller is (slightly rewritten to show the relevant lines more clearly):
// this is for "creating an integer atom with value k"
template <typename T>
auto make_scalar_integer(J jt, T k) -> array {
if (xor_replicate_sign(k) <= NUMMAX) {
return !zero_or_one(k) ? num(k) : zeroionei(k);
}
array z = make_array<T, copy_shape_0>(jt, 1, 0);
set_value_at(z, 0, k);
return z;
}
As we've seen before, we can only use num and zeroionei if we're in the range of [-10, 9], or [NUMMIN, NUMMAX]. The line with xor_replicate_sign decides whether we use a value from Bnum, or create a new A. This means that check can only mean "is k in the range [NUMMIN, NUMMAX]. If it isn't, we have to create the scalar ourself.
It looks like num(k) does not work with the values 0 and 1, so we have to check that as well, and decide whether we need to call zeroionei or num. This also means the boundary check of zeroionei described above is incorrect, it should only accept two values (which is kind of implied by the name, which I guess is supposed to be read as zero I one I).
The lookup of num now actually starts to make sense. Some examples:
#define num(n) ((A)(Bnum + 2 + (n)-NUMMIN))
#define num(n) ((A)(Bnum + 2 + (n) + 10)) // We've already established that `NUMMIN` equals `-10`
#define num(n) ((A)(Bnum + 12 + (n)) // The most simplified version
The negative numbers are -10 to -1 map to [2, 11], that fits the indexes of Bnum. Values 0 and 1 are special, those are the first two items, the values at index 12 and 13 are the booleans instead (there is probably a reason for that, but I can't think of anything that would make sense). Last are the positive values 2 to 9, mapped to indexes [14, 21], and once again we have a match.
Refactors I would suggest:
- In
make_scalar_integer, replace the checkxor_replicate_sign(k) <= NUMMAXwith something more descriptive. The check itself can be reduced toNUMMIN <= k && k <= NUMMAX, this can be changed into a function if needed - Move the logic of the next line to
refactorme_num, callers should not have to worry about that weird boolean exception - Check the code for calls with
num(0)andnum(1), those are actually booleans. I would prefer to move those out ofBnumand intoBbool, but this would require us to update all current C callers as well. I would define a separate macro to get those booleans. - If that boolean change is feasible, we could update
numto work with the values 0 and 1. - Once we've finished that, calls to
zeroioneican simply be updated to usenuminstead. - Then apply the ultimate refactoring: Remove Dead Code
should actually be `A[22]` instead of `I[22][8]`
A is typedef for AD*, so this would be array of 22 pointers, without room for the data.
The correct type would be AD[22]
Also, just a note that this relies on sizeof(AD) == sizeof(I[8])
I'd guess one reason for I[8] instead of AD has been the initialisation: first has always 8 members to initialise, the latter has 10 now, but one seems to be 64 bit specific filler, so it would've been 9 on some platforms.
Now, that 64 bit is only supported platform the initialisation of the struct would be just fine, although it needs some more braces around the subobjects as well.
Yeah, I messed up AD and A in that one.
I think an issue here is the values are generated by a macro:
#define CBAIVAL(t, v) \
{ 7 * SZI, (t)&TRAVERSIBLE, 0, (t), ACPERMANENT, 1, 0, (v) }
I guess we're using this to construct the AD ourselves, I guess this one could directly produce an AD as well.
I guess this case is done, that function no longer exists. The case does include some scribbles of documentation that may be worth saving, but I guess the wiki would be a better place to save that (but the wiki is currently disabled, I'm not sure whether that's on purpose)