blis icon indicating copy to clipboard operation
blis copied to clipboard

Refactor the control tree infrastructure

Open devinamatthews opened this issue 2 years ago • 22 comments

This is a major refactor that places the vast majority of the logic, physical block sizes, preferences, kernel/variant pointers, etc. under the direct purview of the control tree. This refactor also included a cleanup of the thrinfo_t code such that building a thread control tree can be done in a completely generic fashion, including user-customized control trees.

devinamatthews avatar Jan 13 '23 22:01 devinamatthews

@fgvanzee this PR is phase 1: it puts all of the logic and parameters into the control tree but does not yet allow for user customization. It also does not cleanly handle induced methods and mixed-domain computation (actually the latter is broken but I'm not concerned ATM). This PR shouldn't be merged in its current form but I wanted to give you a chance to review the changes so far and make any necessary changes.

devinamatthews avatar Jan 13 '23 22:01 devinamatthews

Looks like it needs to symbol filed updated too. Ugh.

devinamatthews avatar Jan 13 '23 22:01 devinamatthews

@fgvanzee ready to look over. Still 181 files changed :)

devinamatthews avatar May 16 '23 20:05 devinamatthews

@fgvanzee ready to look over. Still 181 files changed :)

Thanks, will do.

Can you remind me if the CI failures are still expected at this point?

fgvanzee avatar May 16 '23 21:05 fgvanzee

Yes it will fail be cause mixed-precision/mixed-domain is broken.

devinamatthews avatar May 16 '23 21:05 devinamatthews

Appveyor must be something else.

devinamatthews avatar May 16 '23 21:05 devinamatthews

Yes it will fail be cause mixed-precision/mixed-domain is broken.

That's not entirely consistent with what I'm seeing. For example, the second Travis CI test:

OOT=0 TEST=FAST SDE=0 THR="openmp" CONF="auto" = PACKAGES="gcc-9 binutils"

uses the "fast" input files, which disable mixed-datatype altogether. And yet that test still fails.

Am I missing something?

fgvanzee avatar May 16 '23 21:05 fgvanzee

Fixed the Appveyor problem (probably).

devinamatthews avatar May 16 '23 21:05 devinamatthews

Looks like there are some more little things that didn't show up here for some reason. I'll iterate a few times.

devinamatthews avatar May 16 '23 21:05 devinamatthews

@fgvanzee looks OK now

devinamatthews avatar May 16 '23 22:05 devinamatthews

@fgvanzee well, the wrong one failed :)

devinamatthews avatar Jun 06 '23 02:06 devinamatthews

@fgvanzee well, the wrong one failed :)

Halp. Why does macOS not like:


rval_fail=$(grep -q 'FAILURE' $1)
rval_exit=$(grep -q 'Exiting normally' $1)

# If FAILURE was found (exit code 0), or if the termination tag was not
# found (not exit code 0), print an error message.
if [[ ${rval_fail} -eq 0 ||
      ${rval_exit} -ne 0 ]]; then
 

fgvanzee avatar Jun 06 '23 02:06 fgvanzee

Just do:

if grep -q 'FAILURE' $1 2>/dev/null ||
   grep -qv 'Exiting normally' $1 2>/dev/null; then

devinamatthews avatar Jun 06 '23 02:06 devinamatthews

grep -q doesn't output 0 or 1, you would have to use $? to get the return code afterwards.

devinamatthews avatar Jun 06 '23 02:06 devinamatthews

grep -q doesn't output 0 or 1, you would have to use $? to get the return code afterwards.

Okay, thanks.

fgvanzee avatar Jun 06 '23 02:06 fgvanzee

Just do:

if grep -q 'FAILURE' $1 2>/dev/null ||
   grep -qv 'Exiting normally' $1 2>/dev/null; then

This... did not help. 😂

fgvanzee avatar Jun 06 '23 02:06 fgvanzee

There. I think that finally does it.

fgvanzee avatar Jun 06 '23 03:06 fgvanzee

@fgvanzee I'll need to merge the result of #782 but otherwise it's ready to go over in full.

devinamatthews avatar Oct 12 '23 01:10 devinamatthews

@Aaron-Hutchinson can you check the changes I made to the SiFive x280 packing kernels in this PR? The Travis test passed but I wanted to make sure it looks OK.

devinamatthews avatar Feb 15 '24 17:02 devinamatthews

@fgvanzee this should be ready to merge now unless @Aaron-Hutchinson has any comments on SiFive x280 packing.

devinamatthews avatar Feb 16 '24 19:02 devinamatthews

@devinamatthews Very sorry that I didn't see this sooner. I believe @myeh01 is far more familiar with these packing kernels, so I defer judgment to him. It looks ok to me.

CC: @nick-knight

Aaron-Hutchinson avatar Feb 26 '24 20:02 Aaron-Hutchinson

@devinamatthews Thanks for the hard work! I saw only one thing, which is to call the vectorized packing kernels when cdim <= cdim_max (and not just when cdim == cdim_max). Otherwise, LGTM!

myeh01 avatar Feb 26 '24 22:02 myeh01