ara
ara copied to clipboard
[HW] Add support for vector fixed-point instructions
Add support for vssra
, vssrl
, vnclip
, vnclipu
and vsmul
vector fixed-point instructions (NOTE: This PR depends on PR#106 and needs to be merged after that)
Changelog
Fixed
- N/A
Added
- Support for vector single-width fractional multiply with rounding and saturation instruction:
vsmul
- Support for vector single-width scaling shift instructions:
vssra
,vssrl
- Support for vector narrowing fixed-point clip instructions:
vnclip
,vnclipu
- Test for
vnclip
andvnclipu
vector fixed-point instructions
Changed
- Updated tests for
vssra
,vssrl
andvsmul
vector fixed-point instructions
Checklist
- [x] Automated tests pass
- [x] Changelog updated
- [x] Code style guideline is observed
Please check our contributing guidelines before opening a Pull Request.
Thanks a lot for the great work, I will review it tomorrow! In the meantime, I am checking that we have no frequency degradation.
Best, Matteo
I found some issues in integrating everything with the current lane size (over-utilization during placement), so I am exploring new lane sizes to host the changes! Ara is getting big! :-)
Feel free to ping me when the PR is ready for another round of reviews @M-Ijaz-10x :-)
Feel free to ping me when the PR is ready for another round of reviews @M-Ijaz-10x :-)
I have made rest of the requested changes except for the rounding functions (I am verifying them). You can review rest of the changes. Thanks!
LGTM, thanks a lot for the changes! I am running a back-end flow now to verify that everything is okay.
@mp-17 is there any update on the back-end flow side?
@M-Ijaz-10x the lane was implemented with no problems. I am waiting for the system: I think that max 6h and it will be over!
Still ongoing, busy servers today - But I will ping you as soon as it's over.
The frequency was slightly lowered, but it's just because of the larger lane. I will try to solve this, but the implementation should be okay from a back-end side!
Hello @M-Ijaz-10x, you are verifying the rounding functions, right? Please ping me when everything is ready so that I can run the CI and the backend flow on the most updated system. Up to now, the backend results are okay.
@mp-17 I have updated the rounding mode functions as well. You can now run the backend flow
Thanks a lot, the flow is running now! We thought about it, and since the fixed-point support has a non-negligible impact on the area, we can have a parameter to enable or disable the instantiation of the related logic in the alu
or multiplier
.
So that if you need it, you can instantiate it, otherwise you go without it.
I can parameterize it and issue a PR on your branch, would that be okay with you?
Thanks a lot, the flow is running now! We thought about it, and since the fixed-point support has a non-negligible impact on the area, we can have a parameter to enable or disable the instantiation of the related logic in the
alu
ormultiplier
. So that if you need it, you can instantiate it, otherwise you go without it. I can parameterize it and issue a PR on your branch, would that be okay with you?
Sure. Just let me know when you create a PR on our branch
PR: https://github.com/10x-Engineers/ara/pull/224
@mp-17 CI checks are over :)
The backend is almost over as well! If there are no surprises because of the rounding functions, I will merge immediately ;-)
@mp-17 any updates?
We will need to tune the implementation to reach 1 GHz without exploding in area; since this is now parameterized, we can merge it and then think about how to tune it.
If you solve the conflicts (for the last time, I promise :-)) I will merge immediately.
If you solve the conflicts (for the last time, I promise :-)) I will merge immediately.
Done :)