Test Google Coral TPU M.2 Accelerator A+E key

Open geerlingguy opened this issue 3 years ago • 53 comments

I just bought a Coral M.2 Accelerator A+E key after seeing a lot of buzz about this little 'IoT' TensorFlow-compatible AI accelerator.

coral-tpu

I also just received an M.2 A key to PCIe 1x slot adapter card (see https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/38), and I will pop the Coral board into there.

I haven't done much with AI/ML, but apparently one of the big holdups for using this board with the Pi may be overcome soon—see: https://github.com/google-coral/edgetpu/issues/280

(And I don't want to step on @timonsku's toes here either—he was the original inspiration for me getting this particular card, after seeing his Piunora; just figured now is as good a time as ever to take a quick stab at TensorFlow.)

See related, in the Pi Forums: https://www.raspberrypi.org/forums/viewtopic.php?p=1772610&sid=4833ac3f714618282207affca2bcd846#p1772610

And the patch advertising MSI-X support in the Pi Kernel (currently only on 5.10.y branch): https://github.com/raspberrypi/linux/commit/6bf63f7711b550de8c803a4c4ad792ecfbe721df

(Note that it may be incorporated into Ubuntu for Pi too... https://twitter.com/m_wimpress/status/1345077692568367105)

geerlingguy avatar Dec 24 '20 04:12 geerlingguy

No worries, glad to see more people having an interest in seeing this working :)

timonsku avatar Dec 24 '20 21:12 timonsku

Great, looking forward to solving this issue in my project!

Valdiolus avatar Dec 29 '20 07:12 Valdiolus

It has arrived! (Picture uploaded).

I also got a couple other goodies today, though, so I'm going to have to wait to start testing it at least a couple days :( (ah, if only time were infinite!).

geerlingguy avatar Jan 24 '21 01:01 geerlingguy

Surely time is infinite in Catholic canon. ;)

Earnest-Williams avatar Jan 24 '21 15:01 Earnest-Williams

@Geofferic - Haha, well yes... but I'm imagining if I can make it to the point where it is indeed infinite—I don't think testing a Coral TPU is going to be my highest priority 😆

geerlingguy avatar Jan 24 '21 21:01 geerlingguy

$ sudo lspci -vvvv -d 1ac1:089a
05:00.0 System peripheral: Device 1ac1:089a (prog-if ff)
	Subsystem: Device 1ac1:089a
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at 600900000 (64-bit, prefetchable) [disabled] [size=16K]
	Region 2: Memory at 600800000 (64-bit, prefetchable) [disabled] [size=1M]
	Capabilities: [80] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 25.000W
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [d0] MSI-X: Enable- Count=128 Masked-
		Vector table: BAR=2 offset=00046800
		PBA: BAR=2 offset=00046068
	Capabilities: [e0] MSI: Enable- Count=1/32 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [f8] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Vendor Specific Information: ID=1556 Rev=1 Len=008 <?>
	Capabilities: [108 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [110 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [200 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr+ BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
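
For anyone skimming the dump above, the parts that stand out are that both memory regions are 64-bit, sit above the 4 GiB mark, and are marked `[disabled]`, and that MSI-X reports `Enable-`. Here is a minimal Python sketch that pulls those fields out of `lspci -vv` text. The `summarize` helper and its field names are made up for illustration, and the regexes only assume the line layout shown in this dump, so treat it as a rough reading aid rather than a general lspci parser.

```python
import re

# Excerpt of the lspci output above (only the lines this sketch inspects).
LSPCI_EXCERPT = """\
05:00.0 System peripheral: Device 1ac1:089a (prog-if ff)
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Region 0: Memory at 600900000 (64-bit, prefetchable) [disabled] [size=16K]
    Region 2: Memory at 600800000 (64-bit, prefetchable) [disabled] [size=1M]
    Capabilities: [d0] MSI-X: Enable- Count=128 Masked-
    Capabilities: [e0] MSI: Enable- Count=1/32 Maskable- 64bit+
"""

FOUR_GIB = 1 << 32


def summarize(text):
    """Hypothetical helper: extract BAR and interrupt state from lspci text."""
    regions = []
    for m in re.finditer(
        r"Region (\d+): Memory at ([0-9a-f]+) \((\d+)-bit.*?\[size=(\w+)\]", text
    ):
        address = int(m.group(2), 16)
        regions.append(
            {
                "index": int(m.group(1)),
                "address": address,
                "width_bits": int(m.group(3)),
                "size": m.group(4),
                # lspci prints [disabled] when memory decoding is off.
                "disabled": "[disabled]" in m.group(0),
                "above_4gib": address >= FOUR_GIB,
            }
        )

    mem = re.search(r"Mem([+-])", text)
    bus_master = re.search(r"BusMaster([+-])", text)
    msix = re.search(r"MSI-X: Enable([+-]) Count=(\d+)", text)
    msi = re.search(r"MSI: Enable([+-])", text)

    return {
        "regions": regions,
        "mem_decoding": mem is not None and mem.group(1) == "+",
        "bus_master": bus_master is not None and bus_master.group(1) == "+",
        "msix_enabled": msix is not None and msix.group(1) == "+",
        "msix_vectors": int(msix.group(2)) if msix else 0,
        "msi_enabled": msi is not None and msi.group(1) == "+",
    }


summary = summarize(LSPCI_EXCERPT)
for region in summary["regions"]:
    print(region)
print("MSI-X enabled:", summary["msix_enabled"], "vectors:", summary["msix_vectors"])
```

No driver is bound at this point, so the disabled BARs and interrupts are expected; the dump mainly confirms the card enumerates and advertises 128 MSI-X vectors.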

geerlingguy avatar Feb 17 '21 22:02 geerlingguy

Following the default instructions for setting up the Coral M.2 PCIe card, I get the following kernel panic when the device manager starts:

IMG_3633

(No problem prior to installing the Coral packages gasket-dkms and libedgetpu1-std.)

geerlingguy avatar Feb 17 '21 23:02 geerlingguy

(Also posted this over to https://github.com/google-coral/edgetpu/issues/280#issuecomment-780913752).

geerlingguy avatar Feb 17 '21 23:02 geerlingguy

Latest update from someone over in the Coral issue queue:

For now, the plan is to wait until the office is open so we can use a PCIe analyzer and confirm this hypothesis. But there doesn't appear to be any additional changes that we can do in SW - the device expecting a host to be able to perform 64-bit read/write is built into the hardware.

USB is still the recommendation for the CM4. USB2.0 is possible out of box, and USB3.0 may be possible although extra design considerations are required (more info here: https://coral.ai/products/accelerator-module/).

It looks doubtful (though still in the realm of possibility) the PCI Express version of the Coral TPU will work on the current generation of Raspberry Pi. Though I still wonder if it's a similar issue to the 64/32-bit discrepancy that Broadcom had to work around for the MegaRAID card.

If so, there's a possibility the driver could add a one-off to work around the PCIe limitation on the Pi 4, but it would be much nicer for Pi OS or the firmware to somehow make it work more as expected :P

geerlingguy avatar Apr 13 '21 02:04 geerlingguy

@geerlingguy the Waveshare carriers have an M.2 slot which should fit a Coral B+M.

I'm confused by the '64/32-bit discrepancy': why use 32-bit Raspberry Pi OS at all? And what is the PCIe limitation of the Pi 4?

StuartIanNaylor avatar Jul 09 '21 12:07 StuartIanNaylor

@StuartIanNaylor - Right now the Coral drivers don't seem to work on either 32-bit or 64-bit Pi OS. But note that the OS type is not always related to the weird issues you get on the PCI Express bus (though for some things, the 32-bit Pi OS behaves more consistently).

geerlingguy avatar Jul 09 '21 22:07 geerlingguy

@geerlingguy has this been resolved or is this still an issue?

wbreiler avatar Oct 11 '21 17:10 wbreiler

AFAIK it's a hardware issue which cannot be solved.

darkbasic avatar Oct 11 '21 17:10 darkbasic

Ah, alright

wbreiler avatar Oct 11 '21 18:10 wbreiler

@darkbasic - It's a hardware issue with the BCM2711 PCIe implementation that can't be changed; however, it could be possible to work around the problem in software.

As with the MegaRAID driver, the problem stems from the fact that 64-bit PCIe accesses expect certain things to work certain ways—and they do, on other ARM64 devices, and on Intel/AMD64—but they don't work at all and crash the Pi. So in software you kind of have to hack around things just for the Pi if you want them to work on the Pi.

So far this seems to affect GPUs, Coral TPUs, and storage controllers the most—some of the newer or more advanced/complex cards.

geerlingguy avatar Oct 11 '21 19:10 geerlingguy

That would be awesome, I'd love to be able to use an M.2 Coral TPU.

darkbasic avatar Oct 11 '21 19:10 darkbasic

Which is a recommended SoC or mobo compatible with Google Coral TPU M.2 Accelerator M.2 E-key?

grigio avatar Oct 13 '21 17:10 grigio

Which is a recommended SoC or mobo compatible with Google Coral TPU M.2 Accelerator M.2 E-key?

Try an NXP SoC, like the one on the Coral Dev Board. But the cost is quite high.

Valdiolus avatar Oct 30 '21 03:10 Valdiolus

Hello. Do you use heatsinks and fans for cooling "power IC (PMIC) and Edge TPU"? Since the datasheet says that it is necessary to use cooling: https://coral.ai/docs/m2/datasheet/ https://coral.ai/docs/m2-dual-edgetpu/datasheet/

If I plan to use "Coral Edge TPU m.2" for 24/7 operation, will I need cooling?

vukitoso avatar Dec 02 '21 19:12 vukitoso

I'd recommend following section 5.3 of the first datasheet you linked, and possibly adhering to both lower and upper cooling of the TPU for maximum performance, especially since you want to run 24/7. Also take special note of possible transient power spikes of up to 3 A.

aiden1989acw avatar Dec 08 '21 09:12 aiden1989acw

@aiden1989acw, thx.

spikes in power up to 3A

for m.2 boards at 3.3V

https://coral.ai/docs/m2/datasheet/ - 3.2 Power consumption

Although the average current drawn from the 3.3V supply is typically less than 500 mA, brief current transients that occur during inferencing can reach roughly 3 A. These spikes also occur suddenly: even a simple model can generate current transients in excess of 1 A/μs. However, these numbers are representative of only the models tested at Google, and your numbers will vary. To determine the actual peak supply current, you should observe the current when running the models you will deploy in production.
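
To put the datasheet figures quoted above into watts, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers from the quote (3.3 V supply, about 500 mA typical, roughly 3 A peak, 1 A/μs slew); the constant and function names are made up for illustration, and your real figures will depend on the model you run.

```python
# Figures taken from the datasheet quote above; names are illustrative only.
SUPPLY_VOLTAGE_V = 3.3
TYPICAL_CURRENT_A = 0.5   # "typically less than 500 mA"
PEAK_CURRENT_A = 3.0      # "can reach roughly 3 A"
SLEW_A_PER_US = 1.0       # "in excess of 1 A/us"


def power_w(current_a, voltage_v=SUPPLY_VOLTAGE_V):
    """Electrical power P = V * I, in watts."""
    return voltage_v * current_a


typical_w = power_w(TYPICAL_CURRENT_A)
peak_w = power_w(PEAK_CURRENT_A)

# Time to swing from typical to peak current at the quoted slew rate.
# The datasheet says the slew can exceed 1 A/us, so this is an upper bound.
ramp_us = (PEAK_CURRENT_A - TYPICAL_CURRENT_A) / SLEW_A_PER_US

print(f"typical draw: about {typical_w:.2f} W")
print(f"peak transient: about {peak_w:.1f} W")
print(f"ramp to peak: at most about {ramp_us:.1f} us")
```

So the average is under 2 W, but the supply and its decoupling need to tolerate brief spikes of close to 10 W, which is the part that matters when picking a carrier board or regulator.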

vukitoso avatar Dec 09 '21 17:12 vukitoso

There is still hope, however small: https://github.com/google-coral/edgetpu/issues/280#issuecomment-1065115347

geerlingguy avatar Mar 11 '22 16:03 geerlingguy

Which is a recommended SoC or mobo compatible with Google Coral TPU M.2 Accelerator M.2 E-key?

@grigio You can use the Jetson Nano with the M.2 module, see setup notes here: https://github.com/jveitchmichaelis/edgetpu-yolo/blob/main/hardware.md It works pretty well!

jveitchmichaelis avatar Mar 21 '22 17:03 jveitchmichaelis

What about this? https://pipci.jeffgeerling.com/boards_cm/tinycar-cm4-markus-kasten.html

UcefMountacer avatar Sep 16 '22 16:09 UcefMountacer

@jveitchmichaelis it's out of stock in Europe

grigio avatar Sep 19 '22 07:09 grigio

Yes unfortunately almost all the popular dev boards are out of stock due to the chip shortage/supply chain issues. If you just want to play with the module then the Coral Dev Board Minis are more available.

jveitchmichaelis avatar Sep 19 '22 13:09 jveitchmichaelis

Just wanted to update here that I have been able to get much further on a Raspberry Pi 5; see https://www.jeffgeerling.com/blog/2023/testing-pcie-on-raspberry-pi-5

Right now it seems like a clock sync/retiming issue with the PCIe bus. I keep getting these messages:

[  362.772240] pcieport 0000:00:00.0: AER: Corrected error received: 0000:00:00.0
[  362.772252] pcieport 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  362.772255] pcieport 0000:00:00.0:   device [14e4:2712] error status/mask=00001000/00002000
[  362.772258] pcieport 0000:00:00.0:    [12] Timeout               
[  372.628183] pcieport 0000:00:00.0: AER: Corrected error received: 0000:00:00.0
[  372.628199] pcieport 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  372.628204] pcieport 0000:00:00.0:   device [14e4:2712] error status/mask=00001000/00002000
[  372.628209] pcieport 0000:00:00.0:    [12] Timeout               
[  373.268131] apex 0000:01:00.0: RAM did not enable within timeout (12000 ms)
[  373.268141] apex 0000:01:00.0: Error in device open cb: -110
[  373.268160] apex 0000:01:00.0: Apex performance not throttled due to temperature
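
As a reading aid for the log above: the `status/mask=00001000/00002000` value has bit 12 set, which in the PCIe AER Correctable Error Status register is the Replay Timer Timeout bit (the kernel prints it as `[12] Timeout`), and the `-110` from the apex driver is `-ETIMEDOUT` on Linux. Below is a minimal Python sketch that counts the corrected errors and decodes those bits from dmesg text. The `summarize_log` and `decode_correctable` helpers are hypothetical names, and the regexes assume only the message layout shown here.

```python
import re

# Excerpt of the kernel log above (only the lines this sketch inspects).
DMESG_EXCERPT = """\
[  362.772240] pcieport 0000:00:00.0: AER: Corrected error received: 0000:00:00.0
[  362.772255] pcieport 0000:00:00.0:   device [14e4:2712] error status/mask=00001000/00002000
[  372.628183] pcieport 0000:00:00.0: AER: Corrected error received: 0000:00:00.0
[  372.628204] pcieport 0000:00:00.0:   device [14e4:2712] error status/mask=00001000/00002000
[  373.268131] apex 0000:01:00.0: RAM did not enable within timeout (12000 ms)
[  373.268141] apex 0000:01:00.0: Error in device open cb: -110
"""

# Bit positions in the PCIe AER Correctable Error Status register.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}


def decode_correctable(value):
    """Return the names of the correctable-error bits set in `value`."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items()) if (value >> bit) & 1]


def summarize_log(log):
    """Hypothetical helper: count AER corrected errors and collect apex error codes."""
    corrected = len(re.findall(r"AER: Corrected error received", log))
    pairs = re.findall(r"error status/mask=([0-9a-f]{8})/([0-9a-f]{8})", log)
    statuses = sorted({int(status, 16) for status, _mask in pairs})
    masks = sorted({int(mask, 16) for _status, mask in pairs})
    # -110 is -ETIMEDOUT on Linux.
    apex_codes = [int(c) for c in re.findall(r"apex \S+ Error in device open cb: (-?\d+)", log)]
    return {
        "corrected": corrected,
        "status_bits": [decode_correctable(s) for s in statuses],
        "masked_bits": [decode_correctable(m) for m in masks],
        "apex_codes": apex_codes,
    }


report = summarize_log(DMESG_EXCERPT)
print(report)
```

Replay timer timeouts mean the root port resent packets because acknowledgements did not arrive in time, which is at least consistent with a marginal link, though it does not prove the cable is the cause.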

One Pi engineer said a prototype Pi 5 board with a different PCIe connection worked with his Coral TPU, but he hasn't gotten it working with the standard FPC connection (there are timeouts when you try using the hardware).

Perhaps a better FFC will fix the issue; we'll see!

geerlingguy avatar Oct 11 '23 14:10 geerlingguy

@geerlingguy PCIe 3.0 or 2.0? Don't you need to set it back to 2.0? Radxa tried an FPC on the Rock Pi 4 at first, and even though that was PCIe 2.0 x4, it had the same problem until they made a revision.

StuartIanNaylor avatar Oct 11 '23 17:10 StuartIanNaylor

@StuartIanNaylor - I have tried at PCIe Gen 3.0, 2.0, and 1.0, and have encountered the exact same issue on all three speeds.

geerlingguy avatar Oct 11 '23 19:10 geerlingguy

Just as a note, I have heard the final version of the FFC (the little flat cable that goes from Pi 5 to a HAT or expansion board) will be impedance-controlled—the one I'm using in my testing is not.

It seems like that may solve a lot of the little issues I'm seeing, especially with the Coral and SATA storage controllers.

geerlingguy avatar Oct 26 '23 02:10 geerlingguy