Only one tpu showing up with PCIe adapter

Open foxy82 opened this issue 1 year ago • 16 comments

Hi.

I have just installed the PCIe adapter with the coral dual tpu board. However I'm still only seeing a single Coral TPU. I followed the instructions on the coral page to install the Linux drivers. Is there anything else I need to do to see both TPUs?

foxy82 avatar Jul 18 '22 07:07 foxy82

Hi @foxy82. On the host system you should see two instances of the Coral TPU with 'lspci', whether or not the drivers are installed.

When the drivers are properly installed, no additional steps are needed to see both the apex0 and apex1 devices.

To use the TPUs in a virtual machine, PCIe passthrough must be configured for both TPUs.

When using Frigate, both TPUs should be configured in the 'detectors' section.

Are you using the TPUs inside a VM or on the bare host? Also, could you show the output of lspci?
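
The lspci check described above can be scripted. What follows is a minimal sketch, not part of the original reply: the function names count_edge_tpus and read_lspci are hypothetical, and the match string assumes lspci labels the device "Coral Edge TPU", as it does in the outputs posted in this thread.

```python
import subprocess


def count_edge_tpus(lspci_output: str) -> int:
    """Count the lspci lines that mention a Coral Edge TPU."""
    return sum(
        1 for line in lspci_output.splitlines() if "Coral Edge TPU" in line
    )


def read_lspci() -> str:
    """Run lspci on the host and return its text output."""
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling count_edge_tpus(read_lspci()) on the host should give 2 for a working dual card; a result of 1 matches the symptom reported in this issue.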

magic-blue-smoke avatar Jul 18 '22 12:07 magic-blue-smoke

I'm running on proxmox - my final plan is to pass this through to an lxc container. However to keep things simple for now I'm running on the bare host.

Here is my lspci:

root@pve:~/coral/pycoral# lspci
00:00.0 Host bridge: Intel Corporation Comet Lake-S 6c Host Bridge/DRAM Controller (rev 03)
00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation Comet Lake PCH-V USB Controller
00:14.2 Signal processing controller: Intel Corporation Comet Lake PCH-V Thermal Subsystem
00:16.0 Communication controller: Intel Corporation Device a3ba
00:17.0 SATA controller: Intel Corporation 400 Series Chipset Family SATA AHCI Controller
00:1b.0 PCI bridge: Intel Corporation Device a3eb (rev f0)
00:1c.0 PCI bridge: Intel Corporation Device a394 (rev f0)
00:1c.5 PCI bridge: Intel Corporation Device a395 (rev f0)
00:1c.6 PCI bridge: Intel Corporation Device a396 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Device a3c8
00:1f.2 Memory controller: Intel Corporation Memory controller
00:1f.3 Audio device: Intel Corporation Device a3f0
00:1f.4 SMBus: Intel Corporation Comet Lake PCH-V SMBus Host Controller
01:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
03:00.0 PCI bridge: ASMedia Technology Inc. Device 1182
04:03.0 PCI bridge: ASMedia Technology Inc. Device 1182
04:07.0 PCI bridge: ASMedia Technology Inc. Device 1182
06:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU
07:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
07:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe

I've run through the instructions here: https://coral.ai/docs/m2/get-started and that works:

root@pve:~/coral/pycoral# python3 examples/classify_image.py --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels test_data/inat_bird_labels.txt --input test_data/parrot.jpg
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
12.0ms
2.7ms
2.8ms
2.8ms
2.8ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781

foxy82 avatar Jul 18 '22 13:07 foxy82

From what I see here, the issue is with the hardware. lspci should show two "System peripheral: Global Unichip Corp. Coral Edge TPU" lines, because the TPUs are independent.

Could you please try the following:

  • Remove the TPU card from the adapter
  • Clean the contacts on the M.2 card
  • Inspect the TPU card for mechanical damage: the TPUs and PMICs are bare flip chips and easy to break
  • Reinsert the card and try again with lspci
  • If you have another TPU card (or adapter), try different combinations of those

magic-blue-smoke avatar Jul 18 '22 20:07 magic-blue-smoke

@magic-blue-smoke I believe I should be getting my PCIe adapter tomorrow (very fast shipping 👌 ) so I haven't tested yet.

One question I had that could potentially be affecting OP as well: do the BIOS settings for the PCIe lanes matter? For example, my mobo defaults to Gen 3 but I read that this device uses Gen 2. Could that cause an issue if it isn't set to Gen 2?

I know PCIe is supposed to be backwards compatible but after hearing issues of gen 4 GPUs not showing up when using incorrect BIOS settings, figured I would ask.

NickM-27 avatar Jul 18 '22 22:07 NickM-27

Personally, I received the adapter a few days ago and had no issues at all. I plugged it into my Proxmox server and set up PCI passthrough into a HomeAssistant OS VM. I installed the Frigate add-on for HomeAssistant, configured it, and it works well :)

baskwo avatar Jul 18 '22 23:07 baskwo

@foxy82 Something to try is to make sure you have IOMMU isolation. With my AMD 1700X I had to tweak a few things in the kernel so it could pass through correctly. Maybe that's why the adapter gave me no issues.
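
One way to inspect IOMMU isolation is to list which PCI devices share a group. This is a minimal sketch, assuming the kernel exposes groups under /sys/kernel/iommu_groups (which happens only when the IOMMU is enabled); the function name iommu_groups is hypothetical and not something from this thread.

```python
from pathlib import Path
from typing import Dict, List


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> Dict[str, List[str]]:
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups: Dict[str, List[str]] = {}
    base = Path(root)
    if not base.is_dir():
        # No directory usually means the IOMMU is disabled or unsupported.
        return groups
    for group in sorted(base.iterdir(), key=lambda p: p.name):
        devices = group / "devices"
        if devices.is_dir():
            groups[group.name] = sorted(d.name for d in devices.iterdir())
    return groups
```

For passthrough, each TPU should ideally sit in a group without unrelated devices. An empty result suggests the IOMMU is not enabled on the host.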

baskwo avatar Jul 19 '22 00:07 baskwo

@NickM-27 I've tested the adapter on a number of motherboards with a Gen3 x4 PCIe slot and it falls back to Gen2 x1 without any issues, so I don't expect problems here.
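
The negotiated link can be checked on the host from the LnkSta line that `lspci -vv -s <address>` prints when run as root. This is a minimal sketch, assuming that line has the usual "Speed ...GT/s, Width x..." form; negotiated_link and GEN_BY_SPEED are hypothetical names for illustration.

```python
import re
from typing import Optional, Tuple

# PCIe generation for each per-lane transfer rate in GT/s.
GEN_BY_SPEED = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4}

_LNKSTA = re.compile(r"LnkSta:\s*Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def negotiated_link(lspci_vv: str) -> Optional[Tuple[float, int]]:
    """Return (speed in GT/s, lane width) from the first LnkSta line, if any."""
    match = _LNKSTA.search(lspci_vv)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))
```

A result of (5.0, 1) corresponds to the Gen2 x1 fallback described above, since GEN_BY_SPEED[5.0] is 2.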

@baskwo Glad to hear boards are arriving!

magic-blue-smoke avatar Jul 19 '22 00:07 magic-blue-smoke

Hi - so I removed the adapter, looked it over, and it looks ok. I checked the Coral board and it also looks ok. I cleaned both and re-inserted them and got the same outcome. I don't have another dual Coral or another dual TPU adapter board. I do still have the original single lane adapter board I was using (https://www.amazon.co.uk/gp/product/B09F64TJ1W/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1)

I tried the dual adapter board in another computer and get the same result - only a single TPU.

However I now think the problem is the adapter board and I will explain why.... alongside trying to get the board working I've also been migrating my Frigate NVR installation to a new proxmox LXC container and I was constantly getting errors in the container like:

Traceback (most recent call last):
  File "examples/classify_image.py", line 121, in <module>
    main()
  File "examples/classify_image.py", line 71, in main
    interpreter = make_interpreter(*args.model.split('@'))
  File "/usr/lib/python3/dist-packages/pycoral/utils/edgetpu.py", line 87, in make_interpreter
    delegates = [load_edgetpu_delegate({'device': device} if device else {})]
  File "/usr/lib/python3/dist-packages/pycoral/utils/edgetpu.py", line 52, in load_edgetpu_delegate
    return tflite.load_delegate(_EDGETPU_SHARED_LIB, options or {})
  File "/usr/lib/python3/dist-packages/tflite_runtime/interpreter.py", line 163, in load_delegate
    library, str(e)))
ValueError: Failed to load delegate from libedgetpu.so.1

From Googling, a lot of people report this for USB adapters where they don't have it plugged in or have a dodgy USB cable (there isn't much info on this happening for PCIe devices).

While doing the above diagnostics I put the Coral back into the old WiFi adapter I was using originally, and suddenly the LXC container is no longer throwing the errors above and Frigate is working properly, which suggests the dual adapter board is causing the issues.
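
Because "Failed to load delegate" is a generic error, one thing worth ruling out inside a container is whether the TPU device nodes are visible at all. This is a minimal sketch, assuming the PCIe driver creates nodes named /dev/apex_0, /dev/apex_1 and so on (the apex0 and apex1 devices mentioned earlier in the thread); apex_nodes is a hypothetical helper name.

```python
import glob
import os
from typing import List


def apex_nodes(dev_dir: str = "/dev") -> List[str]:
    """List the Edge TPU device nodes found in dev_dir, sorted by name."""
    return sorted(glob.glob(os.path.join(dev_dir, "apex_*")))
```

Running apex_nodes() on the host and again inside the LXC container shows whether the nodes were passed through. An empty list inside the container would point at the container configuration rather than the adapter.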

foxy82 avatar Jul 19 '22 09:07 foxy82

@foxy82 Could you please contact me using the form at the bottom of the page here to arrange a board replacement?

magic-blue-smoke avatar Jul 19 '22 15:07 magic-blue-smoke

Just to update, got mine in and both are showing up, thank you!! @magic-blue-smoke

NickM-27 avatar Jul 19 '22 23:07 NickM-27

Mine is doing a cross country USA tour. JFK to LAX then it will have to come back to Virginia..

Hopefully it will get here soon so I can test..

ropeguru avatar Jul 27 '22 11:07 ropeguru

@ropeguru Global logistics is painful sometimes and unfortunately this is something we can't control.

I'd recommend double-checking the delivery address and contacting DHL if there's a mistake or if your package appears in FL or WA next.

magic-blue-smoke avatar Jul 27 '22 23:07 magic-blue-smoke

Oh, this was certainly not directed at you or any of the work you have done. This is my first time dealing with SF International shipping and their tracking is kind of useless on details.

Edit - SF International gave me a USPS tracking number. Putting that in showed it departed NJ on the 26th via partner ACI Logistics. Has not updated since and we are now on the 29th. Trying to contact ACI to find out what is going on. Shouldn't take 3+ days to go from NJ to VA.

ropeguru avatar Jul 28 '22 10:07 ropeguru

Hi. I'm facing the same problem. I bought an adapter from https://www.makerfabs.com/dual-edge-tpu-adapter.html, but since getting it I have not been able to see both TPUs.

EticWeb avatar Oct 22 '22 19:10 EticWeb

Hi. Could you confirm whether the https://www.makerfabs.com/dual-edge-tpu-adapter.html web store is the right place to buy your PCIe adapter for the Coral dual Edge TPU? I have bought a new motherboard but I still have the same issue: only one TPU is listed. Here is my lspci result:

~# lspci
00:00.0 Host bridge: Intel Corporation Device 9b73 (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Device 9ba8 (rev 03)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation Comet Lake PCH-V USB Controller
00:14.2 Signal processing controller: Intel Corporation Comet Lake PCH-V Thermal Subsystem
00:16.0 Communication controller: Intel Corporation Device a3ba
00:17.0 SATA controller: Intel Corporation 400 Series Chipset Family SATA AHCI Controller
00:1c.0 PCI bridge: Intel Corporation Device a394 (rev f0)
00:1c.6 PCI bridge: Intel Corporation Device a396 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Device a3c8
00:1f.2 Memory controller: Intel Corporation Memory controller
00:1f.3 Audio device: Intel Corporation Device a3f0
00:1f.4 SMBus: Intel Corporation Comet Lake PCH-V SMBus Host Controller
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
02:00.0 PCI bridge: ASMedia Technology Inc. Device 1182
03:03.0 PCI bridge: ASMedia Technology Inc. Device 1182
03:07.0 PCI bridge: ASMedia Technology Inc. Device 1182
04:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU

Do you think the adapter is faulty? I have applied the suggestions made to @foxy82, without result. Help please.

EticWeb avatar Oct 25 '22 10:10 EticWeb

Hi @EticWeb. Sorry for the late reply, I missed your first message. Makerfabs is the manufacturer and seller of my TPU adapter boards and the Raspberry Pi CM4 TV Stick. Regarding the issue you're having, please contact me using the form at the bottom of the home page.

magic-blue-smoke avatar Oct 26 '22 00:10 magic-blue-smoke