simde
3DNow! functions
DOSBox Staging plans to use SIMDe, and it would benefit if SIMDe could execute 3DNow! instructions on modern x86 CPUs (which no longer have 3DNow!), ARM64, etc.
3DNow! emulation code is available in:
- 86box
- QEMU patch 3dnow.diff
- Bochs patch 3dnow.cc
- FEX-Emu (x86 emulator that can be used by Wine) supports all 3DNow! instructions, including the Extended and Geode-specific ones
This is relevant for many games and other software that run in 3DNow! mode; see https://github.com/joncampbell123/dosbox-x/issues/3217
3DNow! floating-point instructions
- [ ] PI2FD – Packed 32-bit integer to floating-point conversion
- [ ] PF2ID – Packed floating-point to 32-bit integer conversion
- [ ] PFCMPGE – Packed floating-point comparison, greater or equal
- [ ] PFCMPGT – Packed floating-point comparison, greater
- [ ] PFCMPEQ – Packed floating-point comparison, equal
- [ ] PFACC – Packed floating-point accumulate
- [ ] PFADD – Packed floating-point addition
- [ ] PFSUB – Packed floating-point subtraction
- [ ] PFSUBR – Packed floating-point reverse subtraction
- [ ] PFMIN – Packed floating-point minimum
- [ ] PFMAX – Packed floating-point maximum
- [ ] PFMUL – Packed floating-point multiplication
- [ ] PFRCP – Packed floating-point reciprocal approximation
- [ ] PFRSQRT – Packed floating-point reciprocal square root approximation
- [ ] PFRCPIT1 – Packed floating-point reciprocal, first iteration step
- [ ] PFRSQIT1 – Packed floating-point reciprocal square root, first iteration step
- [ ] PFRCPIT2 – Packed floating-point reciprocal/reciprocal square root, second iteration step
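For anyone picking these up, here is a rough scalar sketch of what a portable fallback for a few of the operations above might look like. The `sketch_*` names and the two-float struct are made up for illustration and are not SIMDe API; the semantics follow the usual description of each instruction (destination register as first operand, source as second).

```c
#include <stdint.h>

/* Hypothetical stand-ins for a 64-bit MMX register (not SIMDe types). */
typedef struct { float lo, hi; } sketch_f32x2;
typedef struct { uint32_t lo, hi; } sketch_u32x2;

/* PFADD: lane-wise addition. */
static inline sketch_f32x2 sketch_pfadd(sketch_f32x2 dst, sketch_f32x2 src) {
  sketch_f32x2 r = { dst.lo + src.lo, dst.hi + src.hi };
  return r;
}

/* PFSUBR: reverse subtraction, i.e. source minus destination. */
static inline sketch_f32x2 sketch_pfsubr(sketch_f32x2 dst, sketch_f32x2 src) {
  sketch_f32x2 r = { src.lo - dst.lo, src.hi - dst.hi };
  return r;
}

/* PFACC: horizontal add. Low lane sums the destination pair,
 * high lane sums the source pair. */
static inline sketch_f32x2 sketch_pfacc(sketch_f32x2 dst, sketch_f32x2 src) {
  sketch_f32x2 r = { dst.lo + dst.hi, src.lo + src.hi };
  return r;
}

/* PFCMPGE: each lane becomes all ones if dst >= src, otherwise zero. */
static inline sketch_u32x2 sketch_pfcmpge(sketch_f32x2 dst, sketch_f32x2 src) {
  sketch_u32x2 r = {
    dst.lo >= src.lo ? UINT32_C(0xFFFFFFFF) : UINT32_C(0),
    dst.hi >= src.hi ? UINT32_C(0xFFFFFFFF) : UINT32_C(0)
  };
  return r;
}
```

A real implementation would presumably dispatch to native vector types where available; this only pins down the per-lane behaviour.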
3DNow! integer instructions
- [ ] PAVGUSB – Packed 8-bit unsigned integer averaging
- [ ] PMULHRWA (PMULHRW) – Packed 16-bit integer multiply with rounding
- [ ] PSWAPW mm,mm/m64 (0F 0F /r BB) – Undocumented AMD 3DNow! instruction on K6-2 and K6-3. Swaps 16-bit words within a 64-bit MMX register. Known to be recognized by MASM 6.13 and 6.14. The opcode was reused for the documented PSWAPD instruction from AMD K7 onwards.
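The two documented integer operations are simple enough to sketch in scalar C. The helper names below are hypothetical (not SIMDe API), and the rounding follows the commonly documented behaviour: PAVGUSB computes `(a + b + 1) >> 1` per unsigned byte, and PMULHRW adds `0x8000` to the 32-bit product before taking the high 16 bits.

```c
#include <stdint.h>

/* PAVGUSB, one lane: rounded average of two unsigned bytes. */
static inline uint8_t sketch_pavgusb_lane(uint8_t a, uint8_t b) {
  return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* PAVGUSB over all eight byte lanes of a 64-bit value. */
static inline uint64_t sketch_pavgusb(uint64_t a, uint64_t b) {
  uint64_t r = 0;
  for (int i = 0; i < 8; i++) {
    uint8_t x = (uint8_t)(a >> (8 * i));
    uint8_t y = (uint8_t)(b >> (8 * i));
    r |= (uint64_t)sketch_pavgusb_lane(x, y) << (8 * i);
  }
  return r;
}

/* PMULHRW, one lane: signed 16x16 multiply, round, keep the high word.
 * Assumes >> on a negative int32_t is an arithmetic shift, which holds
 * on the usual compilers but is implementation-defined in ISO C. */
static inline int16_t sketch_pmulhrw_lane(int16_t a, int16_t b) {
  int32_t product = (int32_t)a * (int32_t)b;
  return (int16_t)((product + 0x8000) >> 16);
}
```

PSWAPW is left out on purpose, since the description above does not say exactly how the words are permuted.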
3DNow! performance-enhancement instructions
- [ ] FEMMS – Faster entry/exit of the MMX or floating-point state
- [ ] PREFETCH m8 (0F 0D /0) – Prefetch cache line. Prefetches at least a 32-byte line into the L1 data cache. See the next item for how, confusingly, _mm_prefetch is used for PREFETCHW.
- [x] PREFETCHW m8 (0F 0D /1) – Prefetch cache line with intent to write. Prefetches at least a 32-byte line into the L1 data cache. Implemented as _mm_prefetch (listed under the heading PRFCHW).
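As a point of comparison, a portable read/write prefetch hint could be sketched with the GCC/Clang builtin `__builtin_prefetch`. The wrapper names here are hypothetical and this is not how SIMDe currently implements it; since a prefetch is only a hint, doing nothing on other compilers is a valid fallback.

```c
/* Hypothetical wrappers, not SIMDe API. */

/* PREFETCH: hint that the line at addr will be read soon. */
static inline void sketch_prefetch(const void *addr) {
#if defined(__GNUC__) /* GCC and Clang */
  __builtin_prefetch(addr, 0, 3); /* rw = 0 (read), locality = 3 (high) */
#else
  (void)addr; /* no-op fallback: prefetching never changes program results */
#endif
}

/* PREFETCHW: hint that the line at addr will be written soon. */
static inline void sketch_prefetchw(const void *addr) {
#if defined(__GNUC__) /* GCC and Clang */
  __builtin_prefetch(addr, 1, 3); /* rw = 1 (write), locality = 3 (high) */
#else
  (void)addr; /* no-op fallback */
#endif
}
```

The second argument is the only difference between the two hints, which lines up with the `/0` versus `/1` encoding in the list above.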
3DNow!+ DSP instructions
- [ ] PF2IW – Packed floating-point to integer word conversion with sign extend
- [ ] PI2FW – Packed integer word to floating-point conversion
- [ ] PFNACC – Packed floating-point negative accumulate
- [ ] PFPNACC – Packed floating-point mixed positive-negative accumulate
- [ ] PSWAPD – Packed swap doubleword
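A few of the 3DNow!+ operations can be sketched in scalar C the same way. The `sketch_*` names and struct types are hypothetical (not SIMDe API). PF2IW is omitted because its saturation behaviour would need checking against the AMD manual first.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a 64-bit MMX register (not SIMDe types). */
typedef struct { float lo, hi; } sketch_f32x2;
typedef struct { int32_t lo, hi; } sketch_i32x2;

/* PFNACC: low lane = dst.lo - dst.hi, high lane = src.lo - src.hi. */
static inline sketch_f32x2 sketch_pfnacc(sketch_f32x2 dst, sketch_f32x2 src) {
  sketch_f32x2 r = { dst.lo - dst.hi, src.lo - src.hi };
  return r;
}

/* PFPNACC: low lane = dst.lo - dst.hi, high lane = src.lo + src.hi. */
static inline sketch_f32x2 sketch_pfpnacc(sketch_f32x2 dst, sketch_f32x2 src) {
  sketch_f32x2 r = { dst.lo - dst.hi, src.lo + src.hi };
  return r;
}

/* PSWAPD: exchange the two doublewords of the source. */
static inline sketch_f32x2 sketch_pswapd(sketch_f32x2 src) {
  sketch_f32x2 r = { src.hi, src.lo };
  return r;
}

/* PI2FW: convert the low 16 bits of each doubleword, read as a signed
 * word, to float. The uint16_t -> int16_t cast assumes two's complement
 * wraparound, which the usual compilers provide. */
static inline sketch_f32x2 sketch_pi2fw(sketch_i32x2 src) {
  sketch_f32x2 r = {
    (float)(int16_t)(uint16_t)((uint32_t)src.lo & 0xFFFFu),
    (float)(int16_t)(uint16_t)((uint32_t)src.hi & 0xFFFFu)
  };
  return r;
}
```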
3DNow! Professional Geode instructions
- [ ] PFRSQRTV – Reciprocal square root approximation for a pair of 32-bit floats
- [ ] PFRCPV – Reciprocal approximation for a pair of 32-bit floats
Hey @Torinde,
I would welcome any contributions that add 3DNow! functions to SIMDe. If anyone is interested in helping with this, please comment and we can schedule a video call to get you up to speed.
Could you please add the "instruction-set-support" label?
@Torinde done 👍