feat(vfpu): add missing instructions + prefix support (`C000[X,Y,Z]`)
I generated most of the missing instructions with a tool I wrote, and I've confirmed the accuracy of the codegen via tests I'm writing for a vector math library. It would still be wise to regression-test this commit against the rust-psp samples. 🙂
This table lists every VFPU instruction and whether it is supported, along with its operand and encoding formats for convenience. Anything still unsupported is probably best hand-rolled from this point on.
Instruction Support (99/113)
| Supported | Instruction | Operands | Encoding |
|---|---|---|---|
| :white_large_square: | bvf | vfpu-branch | vfpu-branch |
| :white_large_square: | bvfl | vfpu-branch | vfpu-branch |
| :white_large_square: | bvt | vfpu-branch | vfpu-branch |
| :white_large_square: | bvtl | vfpu-branch | vfpu-branch |
| :white_check_mark: | lv.q | vfpu-load16 | vfpu-memory-quad |
| :white_check_mark: | lv.s | vfpu-load4 | vfpu-memory |
| :white_large_square: | lvl.q | vfpu-load16 | vfpu-memory-quad |
| :white_large_square: | lvr.q | vfpu-load16 | vfpu-memory-quad |
| :white_check_mark: | mfvc | vfpu-control-gpr | vfpu-gpr-control |
| :white_check_mark: | mtvc | vfpu-control-gpr | vfpu-gpr-control |
| :white_check_mark: | sv.q | vfpu-store16 | vfpu-memory-quad |
| :white_check_mark: | sv.s | vfpu-store4 | vfpu-memory |
| :white_large_square: | svl.q | vfpu-store16 | vfpu-memory-quad |
| :white_large_square: | svr.q | vfpu-store16 | vfpu-memory-quad |
| :white_check_mark: | vabs | vector-unary | vfpu-alu |
| :white_check_mark: | vadd | vector-binary | vfpu-alu |
| :white_check_mark: | vasin | vector-unary | vfpu-alu |
| :white_check_mark: | vavg | vector-unary-reduce | vfpu-alu |
| :white_check_mark: | vbfy1 | vector-unary | vfpu-alu |
| :white_check_mark: | vbfy2 | vector-unary | vfpu-alu |
| :white_check_mark: | vc2i | vector-unary-expand4 | vfpu-alu |
| :white_large_square: | vcmovf | vfpu-condmove | vfpu-condmove |
| :white_large_square: | vcmovt | vfpu-condmove | vfpu-condmove |
| :white_large_square: | vcmp | vfpu-compare | vfpu-alu-compare |
| :white_check_mark: | vcos | vector-unary | vfpu-alu |
| :white_check_mark: | vcrs | vector-binary | vfpu-alu |
| :white_check_mark: | vcrsp | vector-binary | vfpu-alu |
| :white_check_mark: | vcst | vector-nullary-cst | vector-imm5 |
| :white_check_mark: | vdet | vector-binary-reduce | vfpu-alu |
| :white_check_mark: | vdiv | vector-binary | vfpu-alu |
| :white_check_mark: | vdot | vector-binary-reduce | vfpu-alu |
| :white_check_mark: | vexp2 | vector-unary | vfpu-alu |
| :white_check_mark: | vf2h | vector-unary-reduce2 | vfpu-alu |
| :white_check_mark: | vf2id | vector-unary-scale | vector-imm5 |
| :white_check_mark: | vf2in | vector-unary-scale | vector-imm5 |
| :white_check_mark: | vf2iu | vector-unary-scale | vector-imm5 |
| :white_check_mark: | vf2iz | vector-unary-scale | vector-imm5 |
| :white_check_mark: | vfad | vector-unary-reduce | vfpu-alu |
| :white_check_mark: | vfim | vector-nullary-uimm16 | vector-imm16 |
| :white_check_mark: | vflush | vfpu-static | vfpu-fixedop |
| :white_check_mark: | vh2f | vector-unary-expand2 | vfpu-alu |
| :white_check_mark: | vhdp | vector-binary-reduce | vfpu-alu |
| :white_check_mark: | vhtfm2 | vector-matrix-transform | vfpu-alu-m1 |
| :white_check_mark: | vhtfm3 | vector-matrix-transform | vfpu-alu-m1 |
| :white_check_mark: | vhtfm4 | vector-matrix-transform | vfpu-alu-m1 |
| :white_check_mark: | vi2c | vector-unary-reduce | vfpu-alu |
| :white_check_mark: | vi2f | vector-unary-scale | vector-imm5 |
| :white_check_mark: | vi2s | vector-unary-reduce2 | vfpu-alu |
| :white_check_mark: | vi2uc | vector-unary-reduce | vfpu-alu |
| :white_check_mark: | vi2us | vector-unary-reduce2 | vfpu-alu |
| :white_check_mark: | vidt | vector-nullary | vfpu-alu |
| :white_check_mark: | viim | vector-nullary-uimm16 | vector-imm16 |
| :white_check_mark: | vlgb | vector-unary | vfpu-alu |
| :white_check_mark: | vlog2 | vector-unary | vfpu-alu |
| :white_check_mark: | vmax | vector-binary | vfpu-alu |
| :white_large_square: | vmfvc | vfpu-control-read | vfpu-read-control |
| :white_check_mark: | vmidt | matrix-nullary | vfpu-alu |
| :white_check_mark: | vmin | vector-binary | vfpu-alu |
| :white_check_mark: | vmmov | matrix-unary | vfpu-alu |
| :white_check_mark: | vmmul | matrix-binary | vfpu-alu |
| :white_check_mark: | vmone | matrix-nullary | vfpu-alu |
| :white_check_mark: | vmov | vector-unary | vfpu-alu |
| :white_check_mark: | vmscl | matrix-binary-scale | vfpu-alu |
| :white_large_square: | vmtvc | vfpu-control-write | vfpu-write-control |
| :white_check_mark: | vmul | vector-binary | vfpu-alu |
| :white_check_mark: | vmzero | matrix-nullary | vfpu-alu |
| :white_check_mark: | vneg | vector-unary | vfpu-alu |
| :white_check_mark: | vnop | vfpu-static | vfpu-fixedop |
| :white_check_mark: | vnrcp | vector-unary | vfpu-alu |
| :white_check_mark: | vnsin | vector-unary | vfpu-alu |
| :white_check_mark: | vocp | vector-unary | vfpu-alu |
| :white_check_mark: | vone | vector-nullary | vfpu-alu |
| :white_check_mark: | vpfxd | vfpu-prefix | vfpu-prefix |
| :white_check_mark: | vpfxs | vfpu-prefix | vfpu-prefix |
| :white_check_mark: | vpfxt | vfpu-prefix | vfpu-prefix |
| :white_check_mark: | vqmul | vector-binary | vfpu-alu |
| :white_check_mark: | vrcp | vector-unary | vfpu-alu |
| :white_check_mark: | vrexp2 | vector-unary | vfpu-alu |
| :white_check_mark: | vrndf1 | vector-nullary | vfpu-alu |
| :white_check_mark: | vrndf2 | vector-nullary | vfpu-alu |
| :white_check_mark: | vrndi | vector-nullary | vfpu-alu |
| :white_check_mark: | vrnds | vector-inullary | vfpu-alu |
| :white_check_mark: | vrot | vector-unary-rot | vector-imm5 |
| :white_check_mark: | vrsq | vector-unary | vfpu-alu |
| :white_check_mark: | vs2i | vector-unary-expand2 | vfpu-alu |
| :white_check_mark: | vsat0 | vector-unary | vfpu-alu |
| :white_check_mark: | vsat1 | vector-unary | vfpu-alu |
| :white_check_mark: | vsbn | vector-binary | vfpu-alu |
| :white_check_mark: | vsbz | vector-unary | vfpu-alu |
| :white_check_mark: | vscl | vector-binary-scale | vfpu-alu |
| :white_check_mark: | vscmp | vector-binary | vfpu-alu |
| :white_check_mark: | vsge | vector-binary | vfpu-alu |
| :white_check_mark: | vsgn | vector-unary | vfpu-alu |
| :white_check_mark: | vsin | vector-unary | vfpu-alu |
| :white_check_mark: | vslt | vector-binary | vfpu-alu |
| :white_check_mark: | vsocp | vector-unary-expand2 | vfpu-alu |
| :white_check_mark: | vsqrt | vector-unary | vfpu-alu |
| :white_check_mark: | vsrt1 | vector-unary | vfpu-alu |
| :white_check_mark: | vsrt2 | vector-unary | vfpu-alu |
| :white_check_mark: | vsrt3 | vector-unary | vfpu-alu |
| :white_check_mark: | vsrt4 | vector-unary | vfpu-alu |
| :white_check_mark: | vsub | vector-binary | vfpu-alu |
| :white_check_mark: | vsync | vfpu-static | vfpu-fixedop |
| :white_check_mark: | vt4444 | vector-unary-reduce2 | vfpu-alu |
| :white_check_mark: | vt5551 | vector-unary-reduce2 | vfpu-alu |
| :white_check_mark: | vt5650 | vector-unary-reduce2 | vfpu-alu |
| :white_check_mark: | vtfm2 | vector-matrix-transform | vfpu-alu |
| :white_check_mark: | vtfm3 | vector-matrix-transform | vfpu-alu |
| :white_check_mark: | vtfm4 | vector-matrix-transform | vfpu-alu |
| :white_check_mark: | vuc2ifs | vector-unary-expand4 | vfpu-alu |
| :white_check_mark: | vus2i | vector-unary-expand2 | vfpu-alu |
| :white_large_square: | vwbn | vector-unary-mod | vector-imm8 |
| :white_check_mark: | vzero | vector-nullary | vfpu-alu |
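
For anyone hand-rolling an instruction, here is a minimal sketch of how a three-operand ALU word can be assembled. It is not the code from this PR: `encode_alu3`, `Size`, and `VADD` are hypothetical names, and the bit layout (7-bit `vd`/`vs`/`vt` fields at bits 0, 8, and 16, size bits at 7 and 15, `0x60000000` as the `vadd` base opcode) is my understanding of the format and should be checked against the vfpu-docs before relying on it.

```rust
/// Vector width suffix (.s, .p, .t, .q).
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Size {
    Single,
    Pair,
    Triple,
    Quad,
}

/// Assumed base opcode for `vadd` (verify against the vfpu-docs).
const VADD: u32 = 0x6000_0000;

/// Assemble a three-operand ALU word from a base opcode and
/// 7-bit register indices. The field positions are assumptions.
const fn encode_alu3(base: u32, size: Size, vd: u32, vs: u32, vt: u32) -> u32 {
    // The width is split across two non-adjacent bits.
    let size_bits = match size {
        Size::Single => 0,
        Size::Pair => 1 << 7,
        Size::Triple => 1 << 15,
        Size::Quad => (1 << 7) | (1 << 15),
    };
    base | size_bits | ((vt & 0x7F) << 16) | ((vs & 0x7F) << 8) | (vd & 0x7F)
}

fn main() {
    // vadd.q with register index 0 for all three operands.
    let word = encode_alu3(VADD, Size::Quad, 0, 0, 0);
    println!(".word {:#010x}", word);
}
```

The resulting word could then be emitted with a `.word` directive from inline assembly, which is roughly what hand-rolling an unsupported instruction would look like.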
@sajattack this would close #63, and we should create a new issue tracking the missing instructions :)
~~I would also like to add that there's a possibility of implementing swizzling to all supported instructions, but it would require an additional TT muncher, or an additional macro that supports all combinations of: X, |X|, -|X|~~
~~I wasn't clever enough to do the muncher, and the latter would be a combinatorial explosion. I figured Saj or Potato might be better at macros though, so if either of you can come up with a clever way to do it I can go ahead and do another round of codegen.~~
EDIT: You already had a muncher for this, so I added prefix support to all arguments that support it :)
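
To illustrate the TT-muncher idea for `X`, `|X|`, `-X`, and `-|X|`, here is a rough standalone sketch. It is not the muncher that already exists in rust-psp: `prefix_bits!`, `prefix_munch!`, `lane_src!`, and `lane_bits` are hypothetical names, and the bit layout (2-bit source select per lane in bits 0-7, abs flags in bits 8-11, negate flags in bits 16-19) is an assumption based on my understanding of the source prefix format.

```rust
/// Bits contributed by one lane of a source prefix (layout is an assumption).
const fn lane_bits(lane: u32, src: u32, abs: bool, neg: bool) -> u32 {
    (src << (lane * 2)) | ((abs as u32) << (8 + lane)) | ((neg as u32) << (16 + lane))
}

/// Map a component name to its 2-bit source index.
macro_rules! lane_src {
    (X) => { 0u32 };
    (Y) => { 1u32 };
    (Z) => { 2u32 };
    (W) => { 3u32 };
}

/// Internal muncher: consumes one lane spec per step, carrying the
/// current lane index and the accumulated bits.
macro_rules! prefix_munch {
    // No tokens left: yield the accumulator.
    ($lane:expr, $acc:expr;) => { $acc };
    // -|X|
    ($lane:expr, $acc:expr; - | $c:ident | $(, $($rest:tt)*)?) => {
        prefix_munch!($lane + 1, $acc | lane_bits($lane, lane_src!($c), true, true); $($($rest)*)?)
    };
    // |X|
    ($lane:expr, $acc:expr; | $c:ident | $(, $($rest:tt)*)?) => {
        prefix_munch!($lane + 1, $acc | lane_bits($lane, lane_src!($c), true, false); $($($rest)*)?)
    };
    // -X
    ($lane:expr, $acc:expr; - $c:ident $(, $($rest:tt)*)?) => {
        prefix_munch!($lane + 1, $acc | lane_bits($lane, lane_src!($c), false, true); $($($rest)*)?)
    };
    // X
    ($lane:expr, $acc:expr; $c:ident $(, $($rest:tt)*)?) => {
        prefix_munch!($lane + 1, $acc | lane_bits($lane, lane_src!($c), false, false); $($($rest)*)?)
    };
}

/// Entry point: start at lane 0 with an empty accumulator.
macro_rules! prefix_bits {
    ($($spec:tt)+) => { prefix_munch!(0u32, 0u32; $($spec)+) };
}

fn main() {
    // Identity swizzle, then a mix of all four flavors.
    println!("{:#x}", prefix_bits!(X, Y, Z, W));
    println!("{:#x}", prefix_bits!(X, |Y|, -Z, -|W|));
}
```

Because each step matches exactly one flavor and recurses on the rest, the macro needs four rules rather than one rule per combination of lanes, which is what avoids the combinatorial explosion.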
@overdrivenpotato I think this PR is ready for review, if you'd do me the honors ❤️
@SK83RJOSH The VFPU math tests (using vsin and vcos) are failing: https://ci.mijalkovic.ca/teams/rust-psp/pipelines/rust-psp/jobs/run-tests-for-pr/builds/409.2

Code under test: https://github.com/overdrivenpotato/rust-psp/blob/693423a93db19cf63a1eba24b253516cb90e191f/psp/src/math/mod.rs#L61-L99

Test source: https://github.com/overdrivenpotato/rust-psp/blob/master/ci/tests/src/math_test.rs
BTW you could use the automated test generator at https://github.com/pspdev/vfpu-docs/ to generate your tests. It would need to be modified to output Rust code, but that seems doable to me :) Just my 2c; I really hate writing tests, so I only wrote a small subset of them by hand (the ones that take more time to automate than to write manually).
My plan was to write tests for a few instructions of each encoding, plus each of their flavors, since I think that should be rigorous enough to cover all of the codegen. I'll most definitely take a look at that, though. 🙏
Ironically enough, this entire thing has led me down another rabbit hole atm, which is getting assertions/stack traces working correctly so I can make the user story for testing a bit less cumbersome... though after a week of plugging away at that I may have to accept what we have 😄
Alrighty, I added tests that should hit most if not all of the codegen, with the exception of fixedop + imm16, since those are pretty straightforward and near identical to other instructions. The only thing I haven't tested here, and would like to, is matrix operations.
~~Hold off on merging this until I can validate those, since it would be very unfortunate if gum breaks.~~
Okey-dokey, I tested the samples + my own project for regressions in gum and only found one, so this should be ready to merge. 🙂