corundum
cheaper fpgas
I was wondering if much lower-spec (and much cheaper) FPGAs would work.
The Lattice ECP5 has 5 Gb/s serdes, so I guess that's not a good option. How about Artix? The datasheet is a little confusing to me as someone who has never used Xilinx.
" 211 Gb/s Serial Bandwidth" but only "6.6 Gb/s Transceiver Speed"
I guess 6 Gb/s is still OK, assuming PCIe is separate.
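(For reference, the two datasheet numbers are consistent if the headline figure aggregates both directions over all transceivers. A rough sanity check, assuming the largest Artix-7 part with 16 GTP transceivers - the lane count is an assumption, not something stated in the quote above:)

```python
# Back-of-the-envelope check of the two Artix-7 datasheet figures quoted above.
# Assumption: the headline "serial bandwidth" counts TX and RX separately
# across all 16 GTP transceivers of the largest Artix-7 part.

lanes = 16              # GTP transceivers on the biggest Artix-7 (assumed)
per_lane_gbps = 6.6     # max GTP line rate from the datasheet
directions = 2          # full duplex: TX + RX both counted

total_gbps = lanes * per_lane_gbps * directions
print(total_gbps)       # 211.2 -> matches the "211 Gb/s Serial Bandwidth" headline
```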
ECP5 is too small, too slow, and doesn't have PCIe hard IP. Artix 7 could possibly be an option, though it would be limited to either 1 Gbps or require an external XAUI PHY, and I would have to port the PCIe DMA components to 7-series, which is not high on my priority list. Cyclone 10 GX is possibly a decent option as the serdes on that part support 10G natively. However, the PCIe hard IP is limited to gen 2 x4, which is unfortunate.

TBH, Kintex UltraScale or UltraScale+ are probably the best "bang for the buck" parts (PCIe gen 3 x8 + native 10G and even 25G serdes), and they are already supported. Also, there are some used FPGA boards available that are capable of running Corundum (specifically VCU1525/BCU1525 - yes, they are about a grand or so, but that is an insane deal for a VU9P FPGA + PCIe gen 3 x16 + dual QSFP28 + ~700 Gbps BW into DDR4).

TBH, I might be more open to porting to Stratix V to support the DE5-Net or some of the surplus Catapult boards from Azure vs. something like ECP5 or Artix. Arria 10 and Stratix 10 DX are on the roadmap; once that's done, Stratix V would probably be more an issue of timing closure than anything else, as it's an older and slower part.
At any rate, Corundum is intended for datacenter networking research where additional functionality is built on top of the core host interface and evaluated in a datacenter environment. So it's really only interesting for line rates 10G and higher and on FPGAs large enough to include additional functionality. If someone wanted to fund development for some of these lower-end parts, then that might be an option, but until then, if it's not something that I personally need for research purposes, it's probably not going to be supported.
I should also mention that I have no interest in producing hardware at this time. So even if there is a screaming deal on a particular part, if you can't buy it on a board in a PCIe form factor that also provides SFP+ interfaces or similar, it's not an interesting option. There was a kickstarter a while back for a PCIe form-factor board with a KU3P FPGA for like $500 which would have been insane, but then they changed gears and went mining-only with no IO to speak of. Such a waste of perfectly good FPGAs.
Thanks for the long response.
Corundum is intended for datacenter networking research where additional functionality is built on top of the core host interface
Yeah, that's why I'm interested, but more from a production perspective, since I actually run a datacenter company. A grand for a network card isn't competitive when you can get an entire AMD EPYC machine doing the same thing. Someone like Azure has extremely high margins, so they probably care more about scale than price efficiency, but we're a tiny shop with very different financials.
If someone wanted to fund development for some of these lower-end parts, then that might be an option
Yes, very open to that.
I should also mention that I have no interest in producing hardware at this time.
I do make hardware, but the big Xilinx parts are out of range for any commercial viability. You need to be a big corporation to get them at reasonable prices.
We do ARM & RISC-V servers, easily pushing 300 Gb/s through a $1000 cluster. The great challenge is that there's no network fabric matching that price-to-performance, which is why I'm researching whether FPGAs might be the solution here. Maybe this is just not a match, but I'm happy to discuss further.
research
Perhaps some of the functionality developed on top of Corundum will make it into the next generation of commercial, ASIC-based NICs. Or higher-level networking research will impact the design of switches, network stacks, etc. It's possibly not stable enough for production use as is, but this could be addressed if the project grows.
Anyway, if you're seriously interested in possibly using Corundum for something in production, that's certainly something that can be looked into. Depending on what you have in mind, maybe PCIe isn't even the right choice - I have been working on a Zynq version of Corundum with an AXI interface instead of PCIe, for example, and maybe something along those lines would make more sense.
Also, even if the big Xilinx parts are not economical, have you taken a more serious look at some of the lower-end parts, such as the Kintex UltraScale+ line? Like I said, the KU3P is already supported, and should be able to operate at around 50 Gbps. I'm not sure what sort of pricing you might be able to get from Xilinx for that part, but it's certainly going to be a lot better than anything Virtex, and you won't need any extra PHY chips to make it work.
Like I said, the KU3P is already supported,
Yeah, the KU3P is about a grand for the chip alone, plus you need a ton of expensive power delivery. That's more expensive than a Cisco Nexus, which can do 200 Gbps.
Corundum with an AXI interface
Neat!
Have you thought about using something like LiteX (with LiteDRAM) + LitePCIe (and maybe LiteICLink for connecting multiple smaller FPGAs together)? These cores have wide support for everything from low-end iCE40 parts up to high-end VU19P parts. @enjoy-digital has been doing some excellent work bringing up cheap high-end FPGA hardware from ex-bitcoin mining rigs (see https://twitter.com/enjoy_digital/status/1329744466907979778 for example). Totally understand if you want to keep all your own implementations of this stuff.
(LiteX designs are also a core early target for the SymbiFlow project.)
Thanks, but LiteX isn't really interesting for business strategy reasons. Unless I'm mistaken, LiteEth doesn't support 10G anyway.
One of the things I'm interested in (once I get non-negative free time) is a version of corundum without the network ports, just the host PCIe logic (e.g. showing as two NICs connected back-to-back) -- for learning and experimenting on the PCIe interface and driver. I'm hoping such a contraption would run on lower-end boards, like the SQRL Acorns.
IIRC, the kickstarter with the KU3P managed to get pricing from Xilinx permitting a board price of around $500. Not terribly cheap, but much more reasonable than many of the alternatives. Have you looked at the Cyclone 10 at all? If all you need is a 10G port accessible over PCIe, the Cyclone 10 GX supports PCIe gen 2 x4 and has 10G transceivers, so you could build a single-port 10G NIC with one of those. Anyway, if you just need a 10G NIC and no custom features, then commercial ASIC-based NICs are probably going to be more economical.
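(As a rough check on the PCIe side - these are generic PCIe link-rate numbers, nothing Corundum-specific, and they ignore TLP/flow-control overhead:)

```python
# Per-direction raw PCIe bandwidth after line coding, before protocol overhead.

def pcie_gbps(gt_per_s, coding_efficiency, lanes):
    """Usable line rate in Gb/s: transfer rate * coding efficiency * lane count."""
    return gt_per_s * coding_efficiency * lanes

gen2_x4 = pcie_gbps(5.0, 8 / 10, 4)      # e.g. Cyclone 10 GX hard IP: ~16 Gb/s
gen3_x8 = pcie_gbps(8.0, 128 / 130, 8)   # Kintex US/US+ class:        ~63 Gb/s

print(gen2_x4, gen3_x8)
# ~16 Gb/s is plenty for a single 10G port but tight for anything more;
# ~63 Gb/s is what makes gen 3 x8 a better match for multiple 10G/25G ports.
```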
If you want two NICs back to back, that's trivial - the "core" corundum logic exposes AXI stream interfaces, so you can easily cross-connect those internally and ignore the external ports. However, I believe the SQRL Acorn is Artix 7, so I would need to port the PCIe DMA components to 7-series, which is currently not on the roadmap.
Minor update to this: the DMA engine has been "split" into a core module + device-specific shim, so you would only need to write a shim for Artix 7 instead of a whole DMA engine. However, resource consumption of the overall design + timing closure would likely be a serious issue; timing closure is already a problem on Virtex 7, and the failures tend to be in the control logic and not the datapath.
Minor update to this: the DMA engine has been "split" into a core module + device-specific shim, so you would only need to write a shim for Artix 7 instead of a whole DMA engine. However, resource consumption of the overall design + timing closure would likely be a serious issue; timing closure is already a problem on Virtex 7, and the failures tend to be in the control logic and not the datapath.
With this, could support for Artix 7 be included in the Corundum roadmap?
One of the things I'm interested in (once I get non-negative free time) is a version of corundum without the network ports, just the host PCIe logic (e.g. showing as two NICs connected back-to-back) -- for learning and experimenting on the PCIe interface and driver. I'm hoping such a contraption would run on lower-end boards, like the SQRL Acorns.
I created https://github.com/corundum/corundum/issues/114 with this in mind too.
For Artix 7 support on the roadmap, that's probably not going to happen, unless maybe someone wants to fund it somehow. It's already hard enough to close timing on Virtex 7, and Artix 7 is significantly smaller and slower. Artix US+ might be a different story. I have been mulling over making a stripped down "corundum lite" that might work better for this sort of thing, but again the main problem is lack of time.
As far as the "no network ports" thing goes, that might be something I'll need to look into in more detail. With the switch project coming along, it looks like one relatively common configuration is to have a PCIe link to control the switch, with additional Ethernet links to ports on the control SoC for network traffic. So, it might make sense to have a way to configure the design for that sort of use case.
Any news about support for other FPGAs and lower-speed Ethernet flavors?
Lower-end parts: no updates, aside from the fact that we're working on a design around a K26 SoM as part of OCP TAP (2x 10G + 2 lanes of PCIe). Lower-speed Ethernet: supporting running 10G-capable serdes at 1G is on the to-do list, and this will likely also enable running at 1G directly, but it's complicated by the fact that we want to support White Rabbit, and all of this has to work even when the reference clock isn't 156.25 MHz (which rules out CPLLs).
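(For anyone following along, the clocking headache is mostly standard Ethernet line-rate arithmetic; a small illustration, independent of the Corundum implementation:)

```python
# Why running a 10G-capable serdes at 1G isn't just a clock divide
# (standard Ethernet line rates and common reference clocks).

rate_10g = 10.0 * 66 / 64    # 10GBASE-R: 64b/66b coding -> 10.3125 Gb/s on the wire
rate_1g  = 1.0 * 10 / 8      # 1000BASE-X: 8b/10b coding -> 1.25 Gb/s on the wire

refclk_10g = 0.15625         # 156.25 MHz, expressed in GHz
refclk_1g  = 0.125           # 125 MHz, expressed in GHz

print(rate_10g / refclk_10g) # 66.0 -> 10.3125 Gb/s is 66 x 156.25 MHz
print(rate_1g / refclk_1g)   # 10.0 -> 1.25 Gb/s is 10 x 125 MHz
print(rate_10g / rate_1g)    # 8.25 -> non-integer ratio: a simple integer divider
                             #         on the 10G line rate can't hit 1.25 Gb/s
```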