[Request] Add Xilinx Virtual Cable Support (XVC)

Open zhuangzard opened this issue 2 years ago • 37 comments

Could this project add Xilinx Virtual Cable support? JTAG and programming are very solid in this project. I found some other XVC projects, such as https://github.com/kholia/xvcpi and https://github.com/BerkeleyLab/XVC-FTDI-JTAG; both are great projects, but they lack good FPGA board support and good cable support.

Bringing XVC to openFPGALoader would make this project even better for remote debugging, using a Raspberry Pi or a desktop to program and debug.

XVC is well documented here: https://github.com/Xilinx/XilinxVirtualCable

zhuangzard avatar Apr 06 '22 15:04 zhuangzard

This protocol is on my TODO list, yes. In fact, both sides must be implemented:

  • server: at the same level as the jtag class. It receives commands and converts them to drive the cable
  • client: at the cable level. It sends instructions to the server.

Thanks

trabucayre avatar Apr 07 '22 04:04 trabucayre

That sounds great! I can't wait for a release with that. You are right, the server is also very handy for remote programming and debugging. Something small like a Raspberry Pi or an ESP32-Pico could become a remote Wi-Fi debugger, which is very exciting. BTW, speed still looks like a big issue; I don't know whether that is because sending commands over a socket adds delay, or whether we could improve on it in our version.

zhuangzard avatar Apr 07 '22 13:04 zhuangzard

I have installed @kholia's implementation for esp32 on a device for testing, so I'm able to start with the client side.

The XVC protocol is really similar to the JLINK cable protocol, so I assume the implementation will be fast.

About speed, it's true: in my mind there are 2 bottlenecks:

  • network: usually slower than a direct USB connection
  • server implementation: arduino code is generally slower than real bare metal code. A simple pin toggle shows a slower waveform with arduino. And on an RPi with a time-sharing OS, driving pins implies a lot of switching between userspace and kernel space.

trabucayre avatar Apr 09 '22 05:04 trabucayre

@trabucayre Hi! I recommend looking at https://github.com/kholia/xvc-pico and https://github.com/tom01h/xvc-pico. The tom01h version is pretty fast in comparison to the esp32 version.

kholia avatar Apr 09 '22 06:04 kholia

Thanks for pointing out this implementation. It's easier to develop with it (no need for a Wi-Fi or network connection). In the end I have to test with as many implementations as possible (okay, all are theoretically exactly the same).

trabucayre avatar Apr 09 '22 18:04 trabucayre

@trabucayre, I just implemented the code today by modifying the ftdiJtagMPSSE.cpp file so that it can send TMS and TDI together, using the data from the XVC server. The new function is: int FtdiJtagMPSSE::writeTMSTDI(uint8_t *tms, uint8_t *tdi, uint8_t *tdo, uint32_t len)

1. added a socket
2. the server can be started with the --xvc option
3. the port is set with --port 3721 (3721 is the default)
4. added a Python test script, readJTAG.py, to read the IDCODE
5. TMS, TDI and TDO can be exchanged with the xvc command

This code can read the FPGA IDCODE using readJTAG.py, but it does not make Vivado happy yet, although it is very close. It is probably a timing setup issue; I will look into it.

Check out the code here, in case it helps you speed up the implementation: https://github.com/zhuangzard/openFPGALoader/tree/xvc

zhuangzard avatar Apr 09 '22 19:04 zhuangzard

I have to check / analyze your code, thanks! The client side is now publicly available: tested with xvc-pico and a cycloneV FPGA (I know, using an intel/altera device to test Xilinx's protocol is a bit funny). The real question about the server side implementation is how to avoid a bitbanging solution as much as possible, but this implies analyzing the tms and tdi vectors to see whether it's a toggleClk, tms only or tdi only. But it's, maybe, the key to having a server at the same level as jtag and dfu, and to avoid slowing down transactions between server and device. This idea may allow direct compatibility at the cable level (which is not to say jtagInterface is perfect; I'm not sure myself).

trabucayre avatar Apr 10 '22 18:04 trabucayre

Update: I just pushed my work, and the code works with Vivado now. https://github.com/zhuangzard/openFPGALoader/tree/xvc It can communicate with Vivado and show the chip(s).

My implementation uses MPSSE. I had a quick look at the https://github.com/kholia/xvc-pico code, and I don't think it covers the client-to-JTAG MPSSE case: it writes IO pins directly from the socket server, without using FTDI's MPSSE protocol.

With shift, XVC sends three parts of data: "length", "TMS data" and "TDI data". With the current Write_TMS or Write_TDI functions, TMS and TDI cannot be sent at the same time, which I think is the most challenging part. Thanks to the work in https://github.com/BerkeleyLab/XVC-FTDI-JTAG, I created the int FtdiJtagMPSSE::writeTMSTDI(uint8_t *tms, uint8_t *tdi, uint8_t *tdo, uint32_t len) function, which can send TMS and TDI on the same clock.

For the server, I think the reverse process could work: turn write_TMS and write_TDI calls into the shift:0xXX...0xTMS...0xTDI data format, which should be applicable.

I will try to work on that, and keep you posted.

zhuangzard avatar Apr 11 '22 04:04 zhuangzard

I have to read your modifications carefully! But I'm not sure it is required to modify jtag.cpp, since xvc (server side) is different and stateless.

MPSSE is specific to FTDI devices, so there is no reason to see it in xvc-pico (and with a microcontroller it makes sense to use direct GPIO access).

The BerkeleyLab code is interesting because it is not limited to bitbanging pins. I think it may be interesting to see how to adapt it with a higher level of abstraction, to allow using any cable and not only FTDIs.

trabucayre avatar Apr 11 '22 06:04 trabucayre

I modified jtag.cpp at the beginning because I tried to use your write_TMS and write_TDI directly to build the send logic. With a deeper understanding of the code, I realized I needed a new function that can send TMS and TDI and read TDO. I tried to merge your write_TMS and write_TDI functions together, using BerkeleyLab's creative logic, which could become a lower level function. But I agree with you: if we don't want to modify the other jtag protocols, changing jtag.cpp has no point.

The BerkeleyLab code is truly interesting: the logic that sends TMS and TDI in parallel within the USB JTAG limitations is very creative. There is another bitbang implementation, which BerkeleyLab refers to, https://github.com/tmbinc/xvcd, also very nice code to read. It uses bitbanging, so it doesn't need the complex logic that BerkeleyLab has.

zhuangzard avatar Apr 12 '22 03:04 zhuangzard

Yep, modifying jtag.cpp is not required since xvc is a different approach: merging both in the same class will only increase complexity (but maybe a common super class is required to avoid duplicating some parts of the initialisation). The BerkeleyLab and tmbinc approaches are interesting: the tms and tdi vectors are analyzed instead of simply writing bits. Maybe it's not required to keep track of the current JTAG state and transitions, but at least checking whether the vectors contain state moves or shiftDR/shiftIR makes sense. In my mind there are 3 cases:

  • both vectors (or subparts with the same length) contain only 0xff or 0x00 -> this is something like toggleClk: no transition nor data shifting
  • tms contains only a series of 0xff or 0x00, but tdi does not -> something like writeTDI
  • tdi contains only a series of 0xff or 0x00, but tms does not -> something like writeTMS

I have to dump a full XVC sequence to test my idea with a replay.
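To make the three cases above concrete, here is a minimal sketch of such a classification. It is only an illustration, not code from openFPGALoader: the names VectorKind, isConstant and classify are hypothetical, the vectors are assumed to be LSB-first as in XVC, and a fourth "Mixed" result is added for vectors where both tms and tdi change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical labels for the three cases discussed above, plus a fallback.
enum class VectorKind { ToggleClk, WriteTDI, WriteTMS, Mixed };

// True when the first nbits bits of v (bit 0 of byte 0 first) share one value.
static bool isConstant(const std::vector<uint8_t> &v, size_t nbits)
{
    assert(v.size() * 8 >= nbits);  // caller must provide enough bytes
    if (nbits == 0)
        return true;
    const bool first = (v[0] & 0x01) != 0;
    for (size_t i = 1; i < nbits; i++) {
        const bool bit = ((v[i >> 3] >> (i & 7)) & 0x01) != 0;
        if (bit != first)
            return false;
    }
    return true;
}

VectorKind classify(const std::vector<uint8_t> &tms,
                    const std::vector<uint8_t> &tdi, size_t nbits)
{
    const bool tmsConst = isConstant(tms, nbits);
    const bool tdiConst = isConstant(tdi, nbits);
    if (tmsConst && tdiConst)
        return VectorKind::ToggleClk;  // no transition, no data shifting
    if (tmsConst)
        return VectorKind::WriteTDI;   // only tdi carries information
    if (tdiConst)
        return VectorKind::WriteTMS;   // only tms carries information
    return VectorKind::Mixed;          // both change: needs splitting
}
```

A real server would probably apply this to subparts of the vectors rather than to a whole shift request, since a single request can mix state moves and data shifting.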

trabucayre avatar Apr 12 '22 05:04 trabucayre

At the beginning I thought it worked like the logic you described, and that I could analyze the vectors and send write_TMS, write_TDI and toggleClk, but it does not work that way. Here is the Xilinx XVC protocol info: https://github.com/Xilinx/XilinxVirtualCable The data is sent with the shift command: "shift:<num bits><tms vector><tdi vector>"

<num bits> : is a integer in little-endian mode. This represents the number of TCK clk toggles needed to shift the vectors out

<tms vector> : is a byte sized vector with all the TMS shift in bits Bit 0 in Byte 0 of this vector is shifted out first. The vector is num_bits and rounds up to the nearest byte.

<tdi vector> : is a byte sized vector with all the TDI shift in bits Bit 0 in Byte 0 of this vector is shifted out first. The vector is num_bits and rounds up to the nearest byte.

<tdo vector> : is a byte sized vector with all the TDO shift out bits Bit 0 in Byte 0 of this vector is shifted out first. The vector is num_bits and rounds up to the nearest byte.
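As an illustration of the message layout quoted above, here is a minimal parsing sketch. It assumes <num bits> is a 4-byte little-endian integer and that the whole message is already in one buffer; the names ShiftRequest and parseShift are hypothetical and not part of openFPGALoader.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical container for one decoded "shift:" request.
struct ShiftRequest {
    uint32_t nbits = 0;
    std::vector<uint8_t> tms;
    std::vector<uint8_t> tdi;
};

// Parses one complete "shift:" message held in buf.
// Returns false if the prefix is wrong or the buffer is too short.
bool parseShift(const std::vector<uint8_t> &buf, ShiftRequest &req)
{
    static const char prefix[] = "shift:";
    const size_t plen = sizeof(prefix) - 1;
    if (buf.size() < plen + 4 || std::memcmp(buf.data(), prefix, plen) != 0)
        return false;

    // <num bits>: assumed 32-bit little-endian integer
    req.nbits = static_cast<uint32_t>(buf[plen])
        | (static_cast<uint32_t>(buf[plen + 1]) << 8)
        | (static_cast<uint32_t>(buf[plen + 2]) << 16)
        | (static_cast<uint32_t>(buf[plen + 3]) << 24);

    // each vector is num_bits rounded up to the nearest byte
    const size_t nbytes = (static_cast<size_t>(req.nbits) + 7) / 8;
    if (buf.size() < plen + 4 + 2 * nbytes)
        return false;

    const uint8_t *payload = buf.data() + plen + 4;
    req.tms.assign(payload, payload + nbytes);
    req.tdi.assign(payload + nbytes, payload + 2 * nbytes);
    return true;
}
```

The reply to a shift is the <tdo vector>, with the same byte length as the tms and tdi vectors.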

The logic is: after receiving the socket data, send up to 6 bits of TMS as long as TDI stays the same (this is equivalent to sending TMS and TDI at the same time, since TDI does not change and its value is attached in the 7th bit). If TDI changes during those 6 TMS bits, stop the TMS write and start writing TDI into the buffer, but only while TMS stays the same as the last value sent (because a TDI write cannot carry TMS; if TMS changes, stop sending TDI in this loop and go back to the TMS loop with the new TDI value attached in the 7th bit). This way the system doesn't have to know what is being sent and doesn't have to track the JTAG state; it just sends the data as if TMS and TDI were driven together on the same clock toggle.
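A simplified sketch of the first half of this logic (the TMS loop only) could look like the code below. It is an assumption-laden illustration, not the BerkeleyLab or openFPGALoader code: it takes "the 7th bit" to mean bit 7 (mask 0x80) of the data byte, it never switches to plain TDI writes when TMS is constant, and it ignores TDO capture. The names TmsChunk and splitForMpsse are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one TMS write: up to 6 clocks with a fixed TDI.
struct TmsChunk {
    uint8_t nbits;  // number of TCK cycles in this chunk (1..6)
    uint8_t data;   // bits 0..5: TMS values, LSB first; bit 7: TDI level
};

// Reads bit i of an XVC vector (bit 0 of byte 0 is shifted first).
static bool getBit(const std::vector<uint8_t> &v, size_t i)
{
    return ((v[i >> 3] >> (i & 7)) & 0x01) != 0;
}

// Splits a tms/tdi pair into chunks during which TDI does not change.
std::vector<TmsChunk> splitForMpsse(const std::vector<uint8_t> &tms,
                                    const std::vector<uint8_t> &tdi,
                                    size_t nbits)
{
    assert(tms.size() * 8 >= nbits && tdi.size() * 8 >= nbits);
    std::vector<TmsChunk> out;
    size_t pos = 0;
    while (pos < nbits) {
        const bool tdiLevel = getBit(tdi, pos);
        TmsChunk c{0, static_cast<uint8_t>(tdiLevel ? 0x80 : 0x00)};
        // extend the chunk while TDI is unchanged and fewer than 6 bits are used
        while (pos < nbits && c.nbits < 6 && getBit(tdi, pos) == tdiLevel) {
            if (getBit(tms, pos))
                c.data = static_cast<uint8_t>(c.data | (1u << c.nbits));
            c.nbits++;
            pos++;
        }
        out.push_back(c);
    }
    return out;
}
```

For example, a 10-bit request with all TMS bits set and TDI held low becomes two chunks, one of 6 bits and one of 4 bits. The 6-bit limit is hard-coded here because that is the value quoted for FTDI MPSSE in this thread.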

zhuangzard avatar Apr 12 '22 20:04 zhuangzard

Looks good. But instead of using 6 bits for TMS (true for FTDI MPSSE but not for others), it's the longest sequence that should be searched for. Anyway, it's useful to have a first implementation as base material for improvement. There is no need to have the most efficient approach right away; it may be reviewed later.

Thanks

trabucayre avatar Apr 13 '22 06:04 trabucayre

In fact I was wrong. Your proposal, with a generic method to pass the tms & tdi vectors directly to the low-level driver, seems better. Having a generic way to analyze the stream at the xvc level is not efficient: for jlink, with a really similar protocol, these buffers may be sent directly; with ftdi in bitbang mode (ft232r, ft231x) and the anlogic cable you have to send both tms & tdi more or less bit by bit, so using the buffer content directly is, again, the right way. MPSSE is more or less a specific case where it's possible to distinguish between tms & tdi, so it makes sense to do this analysis at the probe level to adapt the stream correctly to the situation.

trabucayre avatar Apr 16 '22 07:04 trabucayre

Thanks for the update. Implementing this at the lower level of the code is definitely a major change; careful planning will surely benefit this project a lot.

For the XVC protocol, Vivado first sends a getinfo: request, and the XVC server sends back the XVC version and buffer size information. I found that your ftdi buffer has a 512-byte limit. I did not change your lower level FTDI buffer code; just a quick note for you: when you implement this, make sure the correct buffer size is sent back to Vivado, or make your lower level FTDI buffer size adjustable for the XVC class.
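For reference, a getinfo reply has the form "xvcServer_v1.0:2048" followed by a newline (this is what appears in the logs further down this thread). A minimal sketch of decoding it on the client side follows; the names XvcInfo and parseGetinfo are hypothetical, and reading the number as the maximum vector length in bytes is my understanding of the XVC documentation rather than something confirmed here.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical result of decoding a getinfo reply.
struct XvcInfo {
    std::string version;             // e.g. "v1.0"
    unsigned long maxVectorLen = 0;  // size announced by the server
};

// Parses "xvcServer_<version>:<size>\n". Returns false on malformed input.
bool parseGetinfo(const std::string &reply, XvcInfo &info)
{
    static const std::string prefix = "xvcServer_";
    if (reply.compare(0, prefix.size(), prefix) != 0)
        return false;
    const size_t colon = reply.find(':', prefix.size());
    if (colon == std::string::npos)
        return false;
    info.version = reply.substr(prefix.size(), colon - prefix.size());
    try {
        // std::stoul stops at the trailing newline
        info.maxVectorLen = std::stoul(reply.substr(colon + 1));
    } catch (const std::exception &) {
        return false;  // no digits, or value out of range
    }
    return true;
}
```

Whatever size the server announces here has to be one it can actually handle per shift request, which is the point of the note about the 512-byte FTDI buffer.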

BTW, another crazy idea about speed that I have been thinking about for days: I tested XVC to flash an SPI flash. The bin file is about 2.9MB, and it takes about 15-25 minutes with the digilent_hs2 programmer. If you just program into the FPGA ROM, it takes only about 1-2 minutes.

I also tested VirtualHere to virtualize the USB port to my computer via a Raspberry Pi 4B board; flashing the SPI flash takes about 12 minutes.

I think the XVC implementation is suited to applications like remote debugging or remote testing; if you want to flash the SPI flash, direct flashing is best.

In the future, if the remote system's USB port could be virtualized, we could use an open source project like USB/IP to make this work more generically for other chips like Altera or GoWin for remote debugging and programming. That is also a very interesting topic to discuss.

zhuangzard avatar Apr 16 '22 18:04 zhuangzard

I usually try to find the best way to avoid having to rewrite/modify/rethink a class in the future (I have already done that for jtag.cpp, to make it less FTDI-specific and more generic). I have updated jlink with a method that receives the buffers and adapts them to send data according to the protocol. XVC and getinfo are known (since 7bfce0fb2be45af587645dca09be87c6806bee6b openFPGALoader is able to communicate with an xvc server). I prefer to keep a generic size: 2048 seems to be the usual size (it's the page size); building USB transactions must be done by the low-level driver (this is already done to adapt jtag packets to the device).

I have to read a bit more about USB/IP but it's already possible to use netcat to send configuration data through network.

trabucayre avatar Apr 17 '22 10:04 trabucayre

Hi. I have pushed a first draft of the XVC server side protocol (currently limited to FTDI devices). I'm interested in feedback/remarks/complaints. Thanks

trabucayre avatar Jul 06 '22 19:07 trabucayre

Any news? Can I close this issue? Thanks

trabucayre avatar Sep 04 '22 09:09 trabucayre

I am trying to set up a remote XVC server and connect it to Vivado. Unfortunately, not entirely successfully. I am using an FT2232 (Digilent Zybo board rev. 1). The cable, board and devices in the JTAG chain are detected as expected.

I've tried the following configurations:

sudo openFPGALoader -c digilent --verbose-level 2 --xvc --port 3121
sudo openFPGALoader -b zybo_z7_10 --verbose-level 2 --xvc --port 3121

All of them fail with an error on a connection attempt from Vivado:

Jtag frequency : requested 6.00MHz   -> real 6.00MHz
INFO: To connect to this xvcServer instance, use: TCP:small-rpi:3121


Press to quit
connection accepted - fd 9
setting TCP_NODELAY to 1

invalid cmd 'E'
connection closed - fd 9

Am I missing something?

nick-petrovsky avatar Dec 25 '22 19:12 nick-petrovsky

I realized my mistake: I tried to connect using hw_server, but a Virtual Cable is required. Then I tried to get the xvc client/server pair working on localhost, but still no luck.

This works fine:

$ openFPGALoader -c digilent --board zybo_z7_10 --verbose-level 2 --bitstream ~/system_top.bit --freq 15000000
Jtag frequency : requested 15.00MHz  -> real 15.00MHz
Raw IDCODE:
- 0 -> 0x13722093
- 1 -> 0x4ba00477
- 2 -> 0xffffffff
- 3 -> 0xffffffff
- 4 -> 0xffffffff
found 2 devices
index 0:
        idcode   0x4ba00477
        type     ARM cortex A9
        irlength 4
index 1:
        idcode 0x3722093
        manufacturer xilinx
        family zynq
        model  xc7z010
        irlength 6
File type : bit
Open file DONE
Parse file DONE
bitstream header infos
date: 2022/12/09
design_name: system_top
hour: 20:53:18
part_name: 7z010clg400
toolVersion: 0XFFFFFFFF;Version=2020.1
userID: TRUE
load program
Flash SRAM: [==================================================] 100.00%
Done

The same with openFPGALoader as XVC server:

 $ openFPGALoader --cable  digilent  --board zybo_z7_10 --verbose-level 2 --xvc --port 2542 --freq 15000000               
Jtag frequency : requested 15.00MHz  -> real 15.00MHz
INFO: To connect to this xvcServer instance, use: TCP:small-rpi:2542


Press to quit
connection accepted - fd 9
setting TCP_NODELAY to 1

1672046873 : Received command: 'getinfo'
         Replied with xvcServer_v1.0:2048

connection closed - fd 9

Client side:

$ openFPGALoader --verbose-level 2 -c xvc-client --port 2542 ~/system_top.bit  --freq 15000000
received 20 Bytes (160)
        78 76 63 53 65 72 76 65 72 5f 76 31 2e 30 3a 32 30 34 38 0a
detected xvcServer version v1.0 packet size 1024

For some reason it stops in the getinfo state. With Vivado the situation is the same.

nick-petrovsky avatar Dec 26 '22 09:12 nick-petrovsky

Okay, my apologies for spamming you so much; the issue is partially demystified. There are some bugs in the socket handling in xvc_server.cpp. Look at this piece of code:

} else {
	int ret = handle_data(fd);
	printInfo("connection closed - fd " + std::to_string(fd));
	close(fd);
	FD_CLR(fd, &conn);
	if (ret == 1)
		throw std::runtime_error("communication failure");
}

In the original XAPP the same behaviour is achieved with the following code:

else if (handle_data(fd,ptr)) {
	if (verbose)
		printf("connection closed - fd %d\n", fd);
	close(fd);
	FD_CLR(fd, &conn);
}

I.e. the connection is closed only if something goes wrong, not every time. With the following patch I am able to detect the devices on the JTAG chain and flash the FPGA.

diff --git a/src/xvc_server.cpp b/src/xvc_server.cpp
index 93d5d96..52e5900 100644
--- a/src/xvc_server.cpp
+++ b/src/xvc_server.cpp
@@ -186,8 +186,10 @@ void XVC_server::thread_listen()
                 } else {
                                        int ret = handle_data(fd);
                                        printInfo("connection closed - fd " + std::to_string(fd));
+          if (ret) {
                     close(fd);
                     FD_CLR(fd, &conn);
+          }
                                        if (ret == 1)
                                                throw std::runtime_error("communication failure");
                                }

Commands and their output for verification:

$ ./openFPGALoader --verbose-level 2 --board zybo_z7_10 --port 2542  --xvc                                                                                                 

Jtag frequency : requested 6.00MHz   -> real 6.00MHz  
INFO: To connect to this xvcServer instance, use: TCP:helios:2542


Press to quit
connection accepted - fd 11
setting TCP_NODELAY to 1

1672068508 : Received command: 'getinfo'
	 Replied with xvcServer_v1.0:2048

connection closed - fd 11
Jtag frequency : requested 15.15MHz  -> real 15.00MHz 
1672068508 : Received command: 'settck'
	 Replied with 'B'

connection closed - fd 11
1672068508 : Received command: 'shift'
	Number of Bits  : 10
	Number of Bytes : 2


1672068508 : Received command: 'shift'
	Number of Bits  : 32
	Number of Bytes : 4


1672068508 : Received command: 'shift'
	Number of Bits  : 32
	Number of Bytes : 4


1672068508 : Received command: 'shift'
	Number of Bits  : 32
	Number of Bytes : 4


1672068508 : Received command: 'shift'
	Number of Bits  : 32
	Number of Bytes : 4


1672068508 : Received command: 'shift'
	Number of Bits  : 32
	Number of Bytes : 4


1672068508 : Received command: 'shift'
	Number of Bits  : 6
	Number of Bytes : 1


connection closed - fd 11
terminate called after throwing an instance of 'std::runtime_error'
  what():  communication failure
[1]    14218 abort (core dumped)  ./openFPGALoader --verbose-level 2 --board zybo_z7_10 --port 2542 --xvc

$ ./openFPGALoader --cable xvc-client --board zybo_z7_10 --fpga-part xc7z010clg400 --freq 15000000 --port 2542 --detect                               

Board default cable overridden with xvc-client
Board default fpga part overridden with xc7z010clg400
detected xvcServer version v1.0 packet size 1024
freq 15000000 66.666667 66 0
42 0 0 0
index 0:
	idcode   0x4ba00477
	type     ARM cortex A9
	irlength 4
index 1:
	idcode 0x3722093
	manufacturer xilinx
	family zynq
	model  xc7z010
	irlength 6

At this point deeper debugging is needed. You can remove the exception, but its source is not obvious to me. Vivado still does not like this virtual cable. I will try to investigate why Vivado does not work with this implementation. I would like to have full remote debugging.
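A side note on the numbers in the logs above: the client line "freq 15000000 66.666667 66 0", the bytes "42 0 0 0", the server reply 'B' (0x42 = 66) and the server line "requested 15.15MHz" are all consistent with settck carrying the TCK period in nanoseconds as a 32-bit little-endian integer, as the XVC documentation describes. The sketch below reproduces that arithmetic; the function names are hypothetical and the truncation to a whole number of nanoseconds is inferred from the log, not taken from the openFPGALoader source.

```cpp
#include <cstdint>
#include <vector>

// TCK period in ns for a requested frequency in Hz, truncated (66.67 -> 66).
uint32_t periodNsFromHz(uint32_t hz)
{
    return hz == 0 ? 0 : 1000000000u / hz;
}

// Frequency in Hz that a server derives back from a period in ns.
double hzFromPeriodNs(uint32_t ns)
{
    return ns == 0 ? 0.0 : 1e9 / static_cast<double>(ns);
}

// Builds a "settck:" request: ASCII prefix + period as 4 bytes, little-endian.
std::vector<uint8_t> buildSettck(uint32_t periodNs)
{
    const char prefix[] = "settck:";
    std::vector<uint8_t> msg(prefix, prefix + sizeof(prefix) - 1);
    for (int shift = 0; shift < 32; shift += 8)
        msg.push_back(static_cast<uint8_t>((periodNs >> shift) & 0xff));
    return msg;
}
```

With a 15 MHz request the period is 66 ns, and 1e9 / 66 is about 15.15 MHz, which the FTDI side then rounds to a real 15.00 MHz.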

nick-petrovsky avatar Dec 26 '22 15:12 nick-petrovsky

Hi, and sorry for the delay. It's true that the connection must be closed only if an error occurs or when the client closes the connection, not every time (it's weird, I tested the code before pushing it; maybe a typo slipped in before the commit).

I have to retest this code entirely.

trabucayre avatar Dec 28 '22 08:12 trabucayre

I have updated xvc_server: now

  • the exception is caught
  • no exception is raised when the socket is closed by the client

I have only tested using openFPGALoader as both xvc server and client. Could you try with Vivado?

Thanks!

trabucayre avatar Jan 02 '23 17:01 trabucayre

Thank you for patching the xvc-server; the code now looks very close to the reference Xilinx implementation! I can confirm that in my setup openFPGALoader works perfectly locally, but still no luck with Vivado.

$ ./openFPGALoader --verbose-level 0 --board zybo_z7_10 --port 2542  --xvc --freq 15000000
Jtag frequency : requested 15.00MHz  -> real 15.00MHz
INFO: To connect to this xvcServer instance, use: TCP:small-rpi:2542


Press to quit
connection accepted - fd 9
setting TCP_NODELAY to 1

Jtag frequency : requested 10.00MHz  -> real 10.00MHz

Vivado requests a different JTAG frequency and then reports an error saying that the hardware isn't powered up.

connect_hw_server: Time (s): cpu = 00:00:02 ; elapsed = 00:00:08 . Memory (MB): peak = 1081.625 ; gain = 0.000
open_hw_target -xvc_url 192.168.104.21:2542
INFO: [Labtools 27-2285] Connecting to hw_server url TCP:localhost:3121
INFO: [Labtools 27-3415] Connecting to cs_server url TCP:localhost:3042
INFO: [Labtools 27-3414] Connected to existing cs_server.
INFO: [Labtoolstcl 44-466] Opening hw_target localhost:3121/xilinx_tcf/Xilinx/192.168.104.21:2542
ERROR: [Labtools 27-2269] No devices detected on target localhost:3121/xilinx_tcf/Xilinx/192.168.104.21:2542.
Check cable connectivity and that the target board is powered up then
use the disconnect_hw_server and connect_hw_server to re-register this hardware target.
ERROR: [Common 17-39] 'open_hw_target' failed due to earlier errors.
ERROR: [Labtoolstcl 44-513] HW Target shutdown. Closing target: localhost:3121/xilinx_tcf/Xilinx/192.168.104.21:2542
disconnect_hw_server localhost:3121

I have no good ideas on how to debug this issue. How can I help to pinpoint the problem? From my point of view there are two options:

  1. Record the trace of commands from Vivado
  2. Compare it with the reference implementation (time consuming, and pin soldering is required)

Previously @zhuangzard mentioned that he succeeded with Vivado. I have also tried his implementation, with a few compiler-dependent fixes, but the same errors occur.

nick-petrovsky avatar Jan 03 '23 07:01 nick-petrovsky

Could you provide the command line you used? I have tried with Vivado 2019: indeed it's not working. I see that openFPGALoader receives the getinfo message and answers it, but nothing more... This piece of code is similar to other implementations: I have no idea why it's not working.

trabucayre avatar Jan 05 '23 20:01 trabucayre

I use Vivado in batch mode (vivado -mode tcl). After connect_hw_server, a command like open_hw_target -xvc_url 192.168.104.21:2542 is required, and the same errors occur as in the message above. Almost the same behaviour happens when you add the target device in the GUI.

In my case, Vivado performs transactions all the time, even when it reports that the device is not powered on; you can observe this in a more verbose mode.

I tried several other implementations and found one that works without any issues: xvcd. The Berkeley implementation, XVC-FTDI-JTAG, doesn't work out of the box. Without strong evidence, my preliminary conclusion is that some reset logic is missing. XVC-FTDI-JTAG adds an option for changing pin direction & state with 100 ms resolution. Unfortunately, the board documentation lacks proper JTAG schematics, so I cannot change the values in any informed way.

nick-petrovsky avatar Jan 13 '23 15:01 nick-petrovsky

Hi and sorry for the delay. After reading the xvcd code, I discovered this part. After a small adaptation of my code, Vivado as client is able to communicate with openFPGALoader as server. I have not tried to program the FPGA yet, but at least both are able to communicate. Once the code is cleaned up I will push the fix.

Thanks!

trabucayre avatar Feb 17 '23 18:02 trabucayre

@nick-petrovsky I have pushed a commit with the fix to use Vivado. Thanks again

trabucayre avatar Feb 18 '23 08:02 trabucayre

@nick-petrovsky I have pushed a commit with the fix to use Vivado. Thanks again

I have checked the last commit with a remote openFPGALoader as XVC server: I flashed the PL of the Zybo_1, and everything related to the FPGA looks fine (even using an RPi1 SSH-tunnelled to the Linux host running Vivado). I haven't tried ILA or other JTAG related cores, but I hope they will work (I will report back later).

But one issue still exists: I am not able to debug the ARM core. I will double check my setup and compare with a direct connection. I do not see any limitation that would prevent communication with the CPU. Vitis gives the following error:

Error while launching program: no targets found with "name =~"APU*"". available targets: 1 DAP (Cannot open JTAG port: Invalid DAP ACK value: 3) 2* xc7z010

nick-petrovsky avatar Feb 24 '23 10:02 nick-petrovsky

It's great that you're able to reproduce this for the PL part. The PS part seems weird: openFPGALoader does nothing specific, it just converts client requests into ftdi transactions, so I don't see why it's not possible to debug the PS core. Is your board configured in JTAG mode instead of QSPI or SDRAM? I have to try with my arty z7 board.

trabucayre avatar Feb 24 '23 20:02 trabucayre