ara icon indicating copy to clipboard operation
ara copied to clipboard

[HW] Add support for vector fixed-point instructions

Open M-Ijaz-10x opened this issue 2 years ago • 2 comments

Add support for vssra, vssrl, vnclip, vnclipu and vsmul vector fixed-point instructions (NOTE: This PR depends on PR#106 and needs to be merged after that)

Changelog

Fixed

  • N/A

Added

  • Support for vector single-width fractional multiply with rounding and saturation instruction: vsmul
  • Support for vector single-width scaling shift instructions: vssra, vssrl
  • Support for vector narrowing fixed-point clip instructions: vnclip, vnclipu
  • Test for vnclip and vnclipu vector fixed-point instructions

Changed

  • Updated tests for vssra, vssrl and vsmul vector fixed-point instructions

Checklist

  • [x] Automated tests pass
  • [x] Changelog updated
  • [x] Code style guideline is observed

Please check our contributing guidelines before opening a Pull Request.

M-Ijaz-10x avatar Sep 28 '22 15:09 M-Ijaz-10x

Thanks a lot for the great work, I will review it tomorrow! In the meantime, I am checking that we have no frequency degradation.

Best, Matteo

mp-17 avatar Oct 05 '22 20:10 mp-17

I found some issues in integrating everything with the current lane size (over-utilization during placement), so I am exploring new lane sizes to host the changes! Ara is getting big! :-)

mp-17 avatar Oct 11 '22 08:10 mp-17

Feel free to ping me when the PR is ready for another round of reviews @M-Ijaz-10x :-)

mp-17 avatar Oct 31 '22 11:10 mp-17

Feel free to ping me when the PR is ready for another round of reviews @M-Ijaz-10x :-)

I have made rest of the requested changes except for the rounding functions (I am verifying them). You can review rest of the changes. Thanks!

M-Ijaz-10x avatar Oct 31 '22 12:10 M-Ijaz-10x

LGTM, thanks a lot for the changes! I am running a back-end flow now to verify that everything is okay.

mp-17 avatar Oct 31 '22 12:10 mp-17

@mp-17 is there any update on the back-end flow side?

M-Ijaz-10x avatar Nov 01 '22 05:11 M-Ijaz-10x

@M-Ijaz-10x the lane was implemented with no problems. I am waiting for the system: I think that max 6h and it will be over!

mp-17 avatar Nov 01 '22 09:11 mp-17

Still ongoing, busy servers today - But I will ping you as soon as it's over.

mp-17 avatar Nov 01 '22 15:11 mp-17

The frequency was slightly lowered, but it's just because of the larger lane. I will try to solve this, but the implementation should be okay from a back-end side!

mp-17 avatar Nov 03 '22 21:11 mp-17

Hello @M-Ijaz-10x, you are verifying the rounding functions, right? Please ping me when everything is ready so that I can run the CI and the backend flow on the most updated system. Up to now, the backend results are okay.

mp-17 avatar Nov 16 '22 12:11 mp-17

@mp-17 I have updated the rounding mode functions as well. You can now run the backend flow

M-Ijaz-10x avatar Nov 25 '22 12:11 M-Ijaz-10x

Thanks a lot, the flow is running now! We thought about it, and since the fixed-point support has a non-negligible impact on the area, we can have a parameter to enable or disable the instantiation of the related logic in the alu or multiplier. So that if you need it, you can instantiate it, otherwise you go without it. I can parameterize it and issue a PR on your branch, would that be okay with you?

mp-17 avatar Nov 30 '22 10:11 mp-17

Thanks a lot, the flow is running now! We thought about it, and since the fixed-point support has a non-negligible impact on the area, we can have a parameter to enable or disable the instantiation of the related logic in the alu or multiplier. So that if you need it, you can instantiate it, otherwise you go without it. I can parameterize it and issue a PR on your branch, would that be okay with you?

Sure. Just let me know when you create a PR on our branch

M-Ijaz-10x avatar Nov 30 '22 12:11 M-Ijaz-10x

PR: https://github.com/10x-Engineers/ara/pull/224

mp-17 avatar Nov 30 '22 17:11 mp-17

@mp-17 CI checks are over :)

M-Ijaz-10x avatar Dec 01 '22 14:12 M-Ijaz-10x

The backend is almost over as well! If there are no surprises because of the rounding functions, I will merge immediately ;-)

mp-17 avatar Dec 01 '22 21:12 mp-17

@mp-17 any updates?

M-Ijaz-10x avatar Dec 03 '22 14:12 M-Ijaz-10x

We will need to tune the implementation to reach 1 GHz without exploding in area; since this is now parameterized, we can merge it and then think about how to tune it.

mp-17 avatar Dec 04 '22 16:12 mp-17

If you solve the conflicts (for the last time, I promise :-)) I will merge immediately.

mp-17 avatar Dec 04 '22 16:12 mp-17

If you solve the conflicts (for the last time, I promise :-)) I will merge immediately.

Done :)

M-Ijaz-10x avatar Dec 04 '22 16:12 M-Ijaz-10x