
Build failure on ppc64: `{standard input}: Invalid mnemonic 'lxv'` etc.

Open barracuda156 opened this issue 8 months ago • 7 comments

The build on ppc64 fails because the assembler rejects a number of instructions as invalid:

[ 47%] Building CXX object CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f32d.cc.o
/opt/local/bin/g++-mp-14 -DHWY_SHARED_DEFINE -DTOOLCHAIN_MISS_ASM_HWCAP_H -DTOOLCHAIN_MISS_SYS_AUXV_H -Dhwy_contrib_EXPORTS -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_highway/highway/work/highway-1.2.0 -pipe -Os -D_GLIBCXX_USE_CXX11_ABI=0 -DNDEBUG -std=c++17 -arch ppc64 -mmacosx-version-min=10.5 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -Wcast-align -fmath-errno -fno-exceptions -Wno-psabi -MD -MT CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f32d.cc.o -MF CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f32d.cc.o.d -o CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f32d.cc.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_highway/highway/work/highway-1.2.0/hwy/contrib/sort/vqsort_f32d.cc
[ 50%] Building CXX object CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f64a.cc.o
/opt/local/bin/g++-mp-14 -DHWY_SHARED_DEFINE -DTOOLCHAIN_MISS_ASM_HWCAP_H -DTOOLCHAIN_MISS_SYS_AUXV_H -Dhwy_contrib_EXPORTS -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_highway/highway/work/highway-1.2.0 -pipe -Os -D_GLIBCXX_USE_CXX11_ABI=0 -DNDEBUG -std=c++17 -arch ppc64 -mmacosx-version-min=10.5 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -Wcast-align -fmath-errno -fno-exceptions -Wno-psabi -MD -MT CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f64a.cc.o -MF CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f64a.cc.o.d -o CMakeFiles/hwy_contrib.dir/hwy/contrib/sort/vqsort_f64a.cc.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_highway/highway/work/highway-1.2.0/hwy/contrib/sort/vqsort_f64a.cc
{standard input}:66:Invalid mnemonic 'lxv'
{standard input}:81:Invalid mnemonic 'fctiduz'
{standard input}:82:Invalid mnemonic 'mfvsrd'
{standard input}:89:Invalid mnemonic 'lxvx'
{standard input}:91:Invalid mnemonic 'xvsubsp'
{standard input}:92:Invalid mnemonic 'stxvx'
{standard input}:93:Invalid mnemonic 'xxlor'
{standard input}:105:Invalid mnemonic 'xxspltib'
{standard input}:110:Invalid mnemonic 'xxlor'
{standard input}:111:Invalid mnemonic 'xxlor'
{standard input}:112:Invalid mnemonic 'xxlor'
{standard input}:121:Invalid mnemonic 'xvaddsp'
{standard input}:122:Invalid mnemonic 'xvaddsp'
{standard input}:123:Invalid mnemonic 'xxspltib'
{standard input}:125:Invalid mnemonic 'vextsb2d'
{standard input}:126:Invalid mnemonic 'xvaddsp'
{standard input}:127:Invalid mnemonic 'xxbrq'
{standard input}:128:Invalid mnemonic 'xxbrw'
{standard input}:129:Invalid mnemonic 'xvaddsp'
{standard input}:130:Invalid mnemonic 'vrld'
{standard input}:131:Invalid mnemonic 'xvaddsp'
{standard input}:132:Invalid mnemonic 'xscvspdp'
{standard input}:133:Invalid mnemonic 'xscvdpuxds'
{standard input}:134:Invalid mnemonic 'stxssp'
{standard input}:135:Invalid mnemonic 'mfvsrd'
{standard input}:139:Invalid mnemonic 'lxv'
{standard input}:140:Invalid mnemonic 'lxv'
{standard input}:143:Invalid mnemonic 'xvmaddasp'
{standard input}:144:Invalid mnemonic 'lxv'
{standard input}:145:Invalid mnemonic 'lxv'
{standard input}:146:Invalid mnemonic 'xvmaddasp'
{standard input}:147:Invalid mnemonic 'lxv'
{standard input}:148:Invalid mnemonic 'lxv'
{standard input}:149:Invalid mnemonic 'xvmaddasp'
{standard input}:150:Invalid mnemonic 'lxv'
{standard input}:151:Invalid mnemonic 'lxv'
{standard input}:152:Invalid mnemonic 'xvmaddasp'
{standard input}:182:Invalid mnemonic 'lxv'
{standard input}:197:Invalid mnemonic 'fctiduz'
{standard input}:198:Invalid mnemonic 'mfvsrd'
{standard input}:205:Invalid mnemonic 'lxvx'
{standard input}:207:Invalid mnemonic 'xvsubsp'
{standard input}:208:Invalid mnemonic 'stxvx'
{standard input}:209:Invalid mnemonic 'xxlor'
{standard input}:221:Invalid mnemonic 'xxspltib'
{standard input}:226:Invalid mnemonic 'xxlor'
{standard input}:228:Invalid mnemonic 'xxlor'
{standard input}:229:Invalid mnemonic 'xxlor'
{standard input}:237:Invalid mnemonic 'xvaddsp'
{standard input}:239:Invalid mnemonic 'xvaddsp'
{standard input}:240:Invalid mnemonic 'xxspltib'
{standard input}:241:Invalid mnemonic 'vextsb2d'
{standard input}:242:Invalid mnemonic 'xvaddsp'
{standard input}:243:Invalid mnemonic 'xxbrq'
{standard input}:244:Invalid mnemonic 'xxbrw'
{standard input}:245:Invalid mnemonic 'xvaddsp'
{standard input}:246:Invalid mnemonic 'vrld'
{standard input}:247:Invalid mnemonic 'xvaddsp'
{standard input}:248:Invalid mnemonic 'xscvspdp'
{standard input}:249:Invalid mnemonic 'xscvdpuxds'
{standard input}:250:Invalid mnemonic 'stxssp'
{standard input}:251:Invalid mnemonic 'mfvsrd'
{standard input}:255:Invalid mnemonic 'lxv'
{standard input}:256:Invalid mnemonic 'lxv'
{standard input}:259:Invalid mnemonic 'xvmaddasp'
{standard input}:260:Invalid mnemonic 'lxv'
{standard input}:261:Invalid mnemonic 'lxv'
{standard input}:262:Invalid mnemonic 'xvmaddasp'
{standard input}:263:Invalid mnemonic 'lxv'
{standard input}:264:Invalid mnemonic 'lxv'
{standard input}:265:Invalid mnemonic 'xvmaddasp'
{standard input}:266:Invalid mnemonic 'lxv'
{standard input}:267:Invalid mnemonic 'lxv'
{standard input}:268:Invalid mnemonic 'xvmaddasp'
{standard input}:298:Invalid mnemonic 'lxvd2x'
{standard input}:313:Invalid mnemonic 'fctiduz'
{standard input}:314:Invalid mnemonic 'mfvsrd'
{standard input}:321:Invalid mnemonic 'lxvd2x'
{standard input}:323:Invalid mnemonic 'xvsubsp'
{standard input}:324:Invalid mnemonic 'stxvd2x'
{standard input}:325:Invalid mnemonic 'xxlor'
{standard input}:350:Invalid mnemonic 'xxlor'
{standard input}:352:Invalid mnemonic 'xxlor'
{standard input}:354:Invalid mnemonic 'xxlor'
{standard input}:362:Invalid mnemonic 'xvaddsp'
{standard input}:365:Invalid mnemonic 'xvaddsp'
{standard input}:367:Invalid mnemonic 'lxvw4x'
{standard input}:370:Invalid mnemonic 'xvaddsp'
{standard input}:372:Invalid mnemonic 'xvaddsp'
{standard input}:373:Invalid mnemonic 'lxvd2x'
{standard input}:375:Invalid mnemonic 'vrld'
{standard input}:376:Invalid mnemonic 'xvaddsp'
{standard input}:377:Invalid mnemonic 'xscvspdp'
{standard input}:378:Invalid mnemonic 'xscvdpuxds'
{standard input}:379:Invalid mnemonic 'stxsspx'
{standard input}:380:Invalid mnemonic 'mfvsrd'
{standard input}:385:Invalid mnemonic 'lxvd2x'
{standard input}:386:Invalid mnemonic 'lxvd2x'
{standard input}:387:Invalid mnemonic 'xvmaddasp'
{standard input}:388:Invalid mnemonic 'lxvd2x'
{standard input}:389:Invalid mnemonic 'lxvd2x'
{standard input}:390:Invalid mnemonic 'xvmaddasp'
{standard input}:391:Invalid mnemonic 'lxvd2x'
{standard input}:392:Invalid mnemonic 'lxvd2x'
{standard input}:393:Invalid mnemonic 'xvmaddasp'
{standard input}:394:Invalid mnemonic 'lxvd2x'
{standard input}:395:Invalid mnemonic 'lxvd2x'
{standard input}:398:Invalid mnemonic 'xvmaddasp'
{standard input}:425:Invalid mnemonic 'lxvd2x'
{standard input}:426:Invalid mnemonic 'xvmulsp'
{standard input}:427:Invalid mnemonic 'xvaddsp'
{standard input}:428:Invalid mnemonic 'stxvd2x'
{standard input}:438:Invalid mnemonic 'mfvsrd'
{standard input}:440:Invalid mnemonic 'mfvsrd'
{standard input}:459:Invalid mnemonic 'lxvx'
{standard input}:460:Invalid mnemonic 'xvmulsp'
{standard input}:461:Invalid mnemonic 'xvaddsp'
{standard input}:462:Invalid mnemonic 'stxvx'

barracuda156 avatar Apr 20 '25 06:04 barracuda156

Well, no surprise: the check in the header is wrong. It tests whether __ALTIVEC__ is defined, but the code it enables uses VSX instructions, not Altivec ones.
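To illustrate the distinction, here is a minimal sketch that reports which of the two predefined compiler macros is set for the current translation unit. The enum and function names (`PpcVector`, `DetectPpcVector`, `Name`, `PrintPpcVector`) are hypothetical and are not part of Highway; only `__ALTIVEC__` and `__VSX__` are real compiler-defined macros.

```cpp
#include <cstdio>

// Hypothetical classification of what the compiler flags enable.
enum class PpcVector { kNone, kAltivecOnly, kVsx };

// __VSX__ is predefined only when VSX code generation is enabled;
// __ALTIVEC__ alone says nothing about VSX availability.
constexpr PpcVector DetectPpcVector() {
#if defined(__VSX__)
  return PpcVector::kVsx;
#elif defined(__ALTIVEC__)
  return PpcVector::kAltivecOnly;
#else
  return PpcVector::kNone;
#endif
}

// Human-readable name for each classification.
inline const char* Name(PpcVector v) {
  switch (v) {
    case PpcVector::kVsx:
      return "VSX";
    case PpcVector::kAltivecOnly:
      return "Altivec only";
    case PpcVector::kNone:
      break;
  }
  return "none";
}

// Prints the result for the flags this file was compiled with.
inline void PrintPpcVector() {
  std::printf("vector support: %s\n", Name(DetectPpcVector()));
}
```

On a G5 build with Altivec enabled but no VSX flags, one would expect the "Altivec only" branch to be taken, which is exactly the case where a check on __ALTIVEC__ alone is too permissive.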

barracuda156 avatar Apr 20 '25 06:04 barracuda156

This part breaks the build: https://github.com/google/highway/blob/457c891775a7397bdb0376bb1031e6e027af1c48/hwy/detect_targets.h#L653

barracuda156 avatar Apr 20 '25 07:04 barracuda156

Google Highway needs to be built with -DHWY_COMPILE_ONLY_EMU128=1 here, because none of the PowerPC-based Macs support VSX: POWER8 CPUs became available almost 8 years after the Power Mac G5 was discontinued.

The build errors are also caused by an older version of the GNU Assembler (as) that cannot assemble instructions added in POWER7 or POWER8 (including VSX).

johnplatts avatar Apr 20 '25 23:04 johnplatts

@johnplatts Thank you for responding. Perhaps all that is needed is a macro check for VSX, as is done above? Altivec is supported as far back as the G4, whatever ISA version that used. IMO, this is a case of checking the wrong macro. Of course, it can be avoided with a configure flag, as you suggest, but that is not obvious to a user. If checking for VSX there is undesirable for some reason, adding !defined(__APPLE__) will do, since we know for sure that macOS cannot run on CPUs supporting ISA 2.06. I can make a PR with a single-line change for that.
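A rough sketch of what such a guard could look like follows. All `DEMO_*` macros and the `PpcVectorTargetsEnabled` function are hypothetical names made up for illustration; this is not the actual condition in hwy/detect_targets.h, only the shape of the proposed `!defined(__APPLE__)` exclusion.

```cpp
// Hypothetical OS flag; stands in for an Apple check in this sketch.
#if defined(__APPLE__)
#define DEMO_OS_APPLE 1
#else
#define DEMO_OS_APPLE 0
#endif

// Hypothetical architecture flag covering common PowerPC predefined macros.
#if defined(__powerpc64__) || defined(__ppc64__) || defined(__powerpc__) || \
    defined(__ppc__)
#define DEMO_ARCH_PPC 1
#else
#define DEMO_ARCH_PPC 0
#endif

// Offer the VSX-based targets only on PowerPC, only when the compiler
// advertises Altivec, and never on Apple systems, whose PowerPC hardware
// predates VSX.
#if DEMO_ARCH_PPC && defined(__ALTIVEC__) && !DEMO_OS_APPLE
#define DEMO_PPC_VECTOR_TARGETS 1
#else
#define DEMO_PPC_VECTOR_TARGETS 0
#endif

// True if this build would compile the PowerPC vector targets.
constexpr bool PpcVectorTargetsEnabled() {
  return DEMO_PPC_VECTOR_TARGETS != 0;
}
```

With this shape, a ppc64 macOS build would fall through to whatever non-vector fallback the library provides without the user having to pass a configure flag.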

barracuda156 avatar Apr 21 '25 01:04 barracuda156

Thanks @johnplatts. Would you agree it makes sense to exclude Apple (preferably via HWY_OS_APPLE)? It seems very unlikely that Apple will go back to POWER, and it would help for such cases of old binutils.

@barracuda156 note that we don't want the macro checks to be too specific, because of runtime dispatch: "attainable" means "can be generated by the compiler", which is a looser check than "actually enabled via -m flags".
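The attainable-versus-enabled distinction can be modelled with a small bitmask sketch. The target constants and `ChooseTarget` below are hypothetical and only loosely inspired by the idea of runtime dispatch; they are not Highway's real target bits or API. The sketch assumes that a lower bit means a better target.

```cpp
#include <cstdint>

// Hypothetical target bits; in this sketch a lower bit is a better target.
constexpr uint32_t kDemoPpc9 = 1u << 0;
constexpr uint32_t kDemoPpc8 = 1u << 1;
constexpr uint32_t kDemoEmu128 = 1u << 2;  // portable fallback

// "Attainable": everything the compiler is able to generate code for,
// regardless of which -m flags were passed.
constexpr uint32_t kDemoAttainable = kDemoPpc9 | kDemoPpc8 | kDemoEmu128;

// Picks the best target that is both compiled in and supported by the CPU.
// Returns 0 if the two sets do not overlap.
constexpr uint32_t ChooseTarget(uint32_t attainable, uint32_t cpu_supported) {
  // Isolate the lowest set bit of the intersection (x & -x on unsigned).
  return (attainable & cpu_supported) & (~(attainable & cpu_supported) + 1u);
}
```

Under this model, a CPU that reports only the fallback bit would select `kDemoEmu128` at run time, yet code for every attainable target still has to be compiled and assembled, which is where an assembler without VSX support would fail.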

jan-wassenberg avatar Apr 22 '25 09:04 jan-wassenberg

@johnplatts Just to check: an Altivec-compatible implementation of this never existed? It would be nice to have it working instead of just disabling it. (Yes, I understand that nobody is going to bother writing that from scratch, and I have no experience with Altivec code myself, but if it is a matter of restoring an earlier version, I can do it locally.)

barracuda156 avatar Apr 22 '25 10:04 barracuda156

I believe it is correct that there was never a working version using only Altivec. Some of the required ops were implemented using VSX.

jan-wassenberg avatar Apr 22 '25 13:04 jan-wassenberg

@barracuda156 are you fine with the current status or should we still exclude Apple from the PPC targets?

jan-wassenberg avatar Jul 04 '25 09:07 jan-wassenberg

I encountered this error while trying to build version 1.0.7 for 32-bit PowerPC a few weeks ago. However, the then-current Git HEAD (6cdcbfd0fed5b0e695ab279571580749f6832664) works fine. Thank you for fixing this! 👍

vidraj avatar Aug 27 '25 08:08 vidraj

@jan-wassenberg Sorry for the delayed response. I think we have no problem with ppc-specific code now, though this problem exists in highwayhash (I guess it is a related project, but the repo has been archived recently, so I can’t submit a fix for it).

I think there is an unsolved issue with futexes being used where they cannot be supported, but it is unrelated to ppc.

barracuda156 avatar Aug 28 '25 18:08 barracuda156

:) I also worked on HighwayHash and its runtime dispatch mechanism was an early prototype of what's now in Highway, but otherwise there is no overlap.

What's the issue with futex?

jan-wassenberg avatar Aug 29 '25 11:08 jan-wassenberg

> and its runtime dispatch mechanism was an early prototype of what's now in Highway

IC. zeek still uses a bundled copy of highwayhash, that’s how I bumped into it.

> What's the issue with futex?

The current implementation relies on features which do not exist in macOS before 10.12. I added this patch to fix the build: https://github.com/macos-powerpc/powerpc-ports/blob/292ce14e6a6e0ced04024874ab66b63d668cbb42/devel/highway/files/patch-futex.diff

MacPorts used a configure arg instead: https://github.com/macports/macports-ports/commit/8f035fc71e7a2995d10fd3959d44e996c29c3919
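As a rough idea of how such a version gate might look, here is a sketch that detects a deployment target below 10.12 and provides a portable polling fallback. This is not the content of the linked patch; `DEMO_HAVE_NATIVE_WAIT` and `DemoWaitUntilDifferent` are hypothetical names, and the yield-based loop is just one possible substitute for a native wait primitive. The only real identifier assumed here is the compiler-predefined `__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__`.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical feature flag: 0 when targeting macOS older than 10.12.
// The predefined macro is 1050 for 10.5 and 101200 for 10.12.
#if defined(__APPLE__) &&                                      \
    defined(__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__) && \
    __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ < 101200
#define DEMO_HAVE_NATIVE_WAIT 0
#else
#define DEMO_HAVE_NATIVE_WAIT 1
#endif

// Portable fallback: polls until `word` differs from `expected`, yielding
// the CPU between checks. Returns the new value that was observed.
inline uint32_t DemoWaitUntilDifferent(const std::atomic<uint32_t>& word,
                                       uint32_t expected) {
  for (;;) {
    const uint32_t now = word.load(std::memory_order_acquire);
    if (now != expected) return now;
    std::this_thread::yield();
  }
}
```

A real implementation would select between the native primitive and the fallback based on the flag; polling trades some CPU time for portability to the older systems.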

barracuda156 avatar Aug 29 '25 12:08 barracuda156

The HighwayHash algorithm is still good, but I'd advise all users to switch to johnplatts' excellent implementation which uses Highway for the SIMD part.

Thanks, we can upstream something similar :)

jan-wassenberg avatar Aug 29 '25 12:08 jan-wassenberg