Integer array index improvement
Currently gawk (v5.0.1) is about twice as fast as mawk (v1.3.4) for:
mawk 'BEGIN{ while(i<1000000){ x[""i]=""i++;} print(x["0"])}'
When forced to use integer indexes, mawk is faster than (or comparable to) gawk:
mawk 'BEGIN{i=0; while(i<1000000){ x[i]=i++;} print(x[0])}'
The first example does two "wrong" things:
- variable i is not initialised as an integer, so it initially defaults to an uninitialized type (internally calls the slower find_by_sval() function)
- indexes of variable x are forced to be strings, which makes it slow (internally calls the slower find_by_sval() function)
The code can be made consistently faster by modifying the default case of array_find() to identify "string" indexes that are actually integer values, and to call find_by_ival() rather than find_by_sval().
e.g. (array.c)
    case C_NOINIT:
        ap = find_by_sval(A, &null_str, create_flag, &redid);
        break;
    default:
        {
            double d = strtod(string(cp)->str, (char **) 0);
            Int ival = d_to_I(d);
            if ((double) ival == d) {
                if (A->type == AY_SPLIT) {
                    if (ival >= 1 && ival <= (int) A->size)
                        return (CELL *) A->ptr + (ival - 1);
                    if (!create_flag)
                        return (CELL *) 0;
                    convert_split_array_to_table(A);
                } else if (A->type == AY_NULL)
                    make_empty_table(A, AY_INT);
                ap = find_by_ival(A, ival, create_flag, &redid);
            } else
                ap = find_by_sval(A, string(cp), create_flag, &redid);
        }
        break;
In minimal testing it works OK, but there may be a case in
    double d = strtod(string(cp)->str, (char **) 0);
    Int ival = d_to_I(d);
where the string value of the cell cp converts to a zero value d (strtod() returns 0 when no conversion can be performed, e.g. for a non-numeric string), in which case a string index would be incorrectly treated as an integer index.
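One way to narrow that risk might be to take the integer path only when the string is the canonical decimal spelling of an integer, so that "abc", "", "1abc" and "01" all stay string indexes. The following is a minimal sketch of that idea, not tested against mawk itself: canonical_int_index() is a hypothetical helper name, and long stands in for mawk's Int type as an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of mawk): returns 1 and stores the value
 * in *out only when s is the canonical decimal spelling of an integer,
 * e.g. "0", "7", "-12".  Returns 0 for anything else, including "",
 * "abc", "1abc", "01", "+1", " 1", "-0", "1e3" and "1.0", so those
 * would keep going through the string lookup.
 * Assumption: long is wide enough to stand in for mawk's Int.
 */
static int
canonical_int_index(const char *s, long *out)
{
    const char *p = s;
    char *end;
    long v;

    assert(s != NULL && out != NULL);

    if (*p == '-')
        p++;
    /* the first character after an optional '-' must be a digit */
    if (*p < '0' || *p > '9')
        return 0;
    /* reject leading zeros and "-0", but accept a plain "0" */
    if (*p == '0' && (p[1] != '\0' || p != s))
        return 0;

    errno = 0;
    v = strtol(s, &end, 10);
    /* reject out-of-range values and any trailing characters */
    if (errno == ERANGE || *end != '\0')
        return 0;

    *out = v;
    return 1;
}
```

In the default case above, a check like this would replace the strtod()/d_to_I() comparison: call find_by_ival() when it returns 1, otherwise fall back to find_by_sval(). Rejecting "01" matters because awk treats x["01"] and x[1] as different elements, while x["1"] and x[1] are the same one.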
Anyway, hope this helps.
thanks (I'll check on this when I get back to mawk - currently on xterm...)
In a quick check, the suggested change results in some test failures. I'll come back to this when I can spend a day or two (at the moment I am just working on simple changes for a maintenance release).