Sigrok driver

Open gusmanb opened this issue 2 years ago • 27 comments

It would be desirable to create a Sigrok driver, as Sigrok is widely used software for handling logic analyzers.

gusmanb avatar Jul 07 '22 11:07 gusmanb

There has been some work done, maybe you can take a look? https://github.com/perexg/picoprobe-sump

Also, it might be worth looking into the protocol used by DSLogic. DSView is a nice (and open source) GUI and has a ton of analysers.

DatanoiseTV avatar Jul 07 '22 11:07 DatanoiseTV

For sigrok users, there's this implementation for RP2040. In the latest revisions (starting from rev2), triggering is done in software on the PC, and it's more limited, since you can only capture at 120 Msps for 400 ksamples (so typically less than 1/3 of a second at full resolution); otherwise it's limited to 500 ksps by the USB streaming bandwidth.

However, this repository contains everything in one place and it's very well explained, so if the software were running on Linux, it'd be perfect to use. Sigrok is more complex to set up and use, not sure about the linked driver.

X-Ryl669 avatar Jul 07 '22 12:07 X-Ryl669

> There has been some work done, maybe you can take a look? https://github.com/perexg/picoprobe-sump
>
> Also, it might be worth looking into the protocol used by DSLogic. DSView is a nice (and open source) GUI and has a ton of analysers.

From what I see, that implementation seems to use the SUMP protocol instead of implementing a custom driver for Sigrok. I will take a look, but for now I prefer to go the driver route: the analyzer exposes some non-standard functions (for the trigger type), and I think a custom driver will be more flexible because it exposes its own config options (that's what I understand from the Sigrok driver programming documentation).

And about DSView, it seems that it uses a modified libsigrok version, so in theory the same driver for Sigrok should be enough to use it.

gusmanb avatar Jul 07 '22 12:07 gusmanb

Comments from someone who's built a sigrok driver:

  1. Don't expect your changes to get pulled into the main pulseview repo any time soon, maybe ever. There are currently 60 pending pull requests at https://github.com/sigrokproject/libsigrok , including #181 (mine posted Mar 1,2022) and #137 (tiny logic friend) among some others that use some kind of MCU to implement a logic analyzer. So far we've seen no responses from a sigrok moderator as to which if any would ever be pulled. See this thread on the sigrok email reflector: https://sourceforge.net/p/sigrok/mailman/message/37678040/

  2. One option might be to try to merge your PIO code with the Pico code that I have and my sigrok driver. I already support reading data from a DMA'd stream and sending it to the host, and I have support for trigger settings and run-length encoding. If you're interested in that path, let me know. I currently release Windows builds to a Dropbox location (because PulseView installers are greater than the max GitHub file size), and many folks have built PulseView for Mac and Linux. I could probably generate Mac and Linux installers as well, but nobody has pushed me to do it...

  3. I'm not sure how you might try to communicate the three triggering modes you have. While PV has trigger settings for each channel (rising/falling/level) there is no way to send miscellaneous information to the device that I'm aware of. There is some kind of "firmware" mode for devices that need to load a firmware and it might be possible to leverage that to just read a user defined text file to send configuration. Or you might just build 3 different images, or select the mode via strapping pins or .... If you find a way to communicate misc settings from the user to the device I'd be interested in it.

Thanks, Shawn (pico-coder)

pico-coder avatar Jul 09 '22 02:07 pico-coder

Any open-source logic analyzer (and perhaps even oscilloscope) project should focus on sigrok support. I don't understand why you would handle development of your own protocol decoders when sigrok (PulseView) already has all you need.

Harvie avatar Jul 09 '22 07:07 Harvie

Thanks a lot for all the info Shawn, it's really useful.

I have re-read the sigrok wiki and it seems I misunderstood how the driver works. I thought the driver told sigrok which kind of settings it wanted to present to the user, but now that I have read it more carefully, it asks for a predefined set, not custom-defined settings...

I will take another look in the future, but for now I will put this idea on hold and concentrate on the multiplatform app, which is going to be a lot faster to develop and will allow me to implement the settings explicitly for my device.

Also, now that I'm reading the docs, it seems that the sigrok triggering specification is very limited: only edge triggers for a channel, which completely defeats the purpose of the device. I do a lot of retrocomputing stuff (programming, hardware development and repairs), and having 24 channels is not a coincidence. My target is a device that is capable of capturing data when a specific address is accessed, as that helps a lot with diagnostics (and even debugging). So if sigrok is not able to configure complex patterns, like a 16-bit number for the address bus, it makes no sense for me to implement it.
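To make the address-match idea concrete, here is a minimal Python sketch of a pattern trigger evaluated over already-captured samples. The function name, the assumed wiring (a 16-bit address bus on channels 0 to 15, data on channels 16 to 23) and the sample values are all hypothetical; the analyzer itself evaluates its trigger in firmware while capturing, not in host software like this.

```python
def find_pattern_trigger(samples, mask, value):
    """Return the index of the first sample whose masked bits equal value, or None."""
    for index, sample in enumerate(samples):
        if (sample & mask) == value:
            return index
    return None


# Assumed wiring: channels 0-15 carry the address bus, channels 16-23 carry data.
ADDRESS_MASK = 0xFFFF

# Four made-up 24-bit samples; the third one has address 0xD020 on the low 16 bits.
samples = [0x12C000, 0x34C001, 0x56D020, 0x78D021]

trigger_index = find_pattern_trigger(samples, ADDRESS_MASK, 0xD020)
print(trigger_index)
```

An edge-only trigger on one channel cannot express this condition, since the match depends on the level of all 16 masked channels at the same instant.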

Maybe what would be a lot easier, and flexible enough, is to add support to my app for exporting captured data in a compatible format, so you can capture with it and then analyze the data with sigrok. An even simpler program, a console app, could be created to configure and trigger the captures, and then sigrok could be used to analyze them.
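As a rough illustration of the export idea, the sketch below turns integer samples into CSV text with one 0/1 column per channel. The column layout, the `CH0`-style header names and the function name are assumptions made for this example, not the format the app actually writes; the options needed to import such a file should be checked against the sigrok CSV input documentation.

```python
import csv
import io


def samples_to_csv(samples, channel_count):
    """Render integer samples as CSV text, one 0/1 column per channel (bit 0 first)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row with hypothetical channel names.
    writer.writerow([f"CH{ch}" for ch in range(channel_count)])
    for sample in samples:
        # Extract each channel's bit from the packed sample word.
        writer.writerow([(sample >> ch) & 1 for ch in range(channel_count)])
    return buffer.getvalue()


csv_text = samples_to_csv([0b0101, 0b0011], channel_count=4)
print(csv_text)
```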

Cheers.

gusmanb avatar Jul 09 '22 08:07 gusmanb

> Any open-source logic analyzer (and perhaps even oscilloscope) project should focus on sigrok support.

Sorry, but I completely disagree. That's a nice way to create a monopoly. Having a single option, even if it's open source, is always a bad idea: who knows what can happen to a project in the future? It may disappear, it can be bought, or whatever.

> I don't understand why you would handle development of your own protocol decoders

The answer is simple: primarily for fun. If I do this kind of project, it's not to gain anything. I do these in my spare time because I enjoy creating hardware and software, and then I publish them as open source as a way of giving back. I have learnt a lot from open source projects, I use open source tools, and this is my way to contribute to the community. My projects may not have the goal of becoming a "gold standard" or commercial products, but if someone can learn anything from what I publish, finds it useful or has fun using it, then I'm more than happy.

Cheers.

gusmanb avatar Jul 09 '22 08:07 gusmanb

If you want to use the LogicAnalyzer under Linux or MacOSX before the multiplatform app is completed or want to use it with Sigrok/PulseView check Release 1.0.0.3

gusmanb avatar Jul 13 '22 15:07 gusmanb

Being able to import CSV data into PulseView is certainly a good start. But providing a real sigrok driver, to be able to configure the device and capture directly through sigrok or PulseView, would be even better :innocent:

Harvie avatar Aug 02 '22 08:08 Harvie

I finally got some more time to have a look around your implementation. At first I wanted to somehow merge the best attributes of each of ours into a common code base, but we make a lot of different fundamental assumptions about how DMA engines etc. are managed: yours is very well focused on accurate pre/post-trigger capture, while mine focuses on overlapping data transmit with capture to maximize USB link bandwidth and on optimizing trace storage (at 4 channels I can store 800k samples). And while such a merge is possible, it would likely get full of bugs...

I'm now thinking the way to do this is to leave both of our code bases separate but combined into the same build, and figure out which to use based on the user configs. Both should easily fit in the 1MB of flash; then all I have to do is tweak your DMA dumps to be sent over the host protocol I set up. (I pick my protocol because of the RLE and because it supports analog channels.) But I want to make sure I understand your three PIO modes, as I think I originally thought they were more complicated than they are. Specifically, I thought the "complex" mode would support triggering on a sequence of values, but I think it is instead a single level value. So, assuming the user sets up a bunch of triggers in PulseView, which could be level/rising, I think the algorithm is something like this flow chart, starting at the top:

  1. If more than 32k samples are selected, or if any analog channels are selected, or if no trigger is specified, then use my flows.
  2. If a single channel is picked as rising or falling, use your simple mode.
  3. If up to 5 channels of level value triggers are picked, use your fast mode.
  4. If 5-16 channels of level value triggers are picked, use your complex mode.
  5. If none of 2-4 apply, then use my flows, with the understanding that high-speed captures likely won't be successful.

Does that sound about right? Also, I've had lots of success running the Pico sys_clk overclocked to 240 MHz while running PIO, ADC, DMA, and the CPU cores. So I would tend to just assume that we'd run sys_clk at 2x the sample rate in high-speed cases. That still might miss a trigger in complex mode, but users usually run the sample rate a bit higher than the signal rates anyway. And perhaps we might run sys_clk at 3x the sample rate if it remained under 240 MHz.

Thanks, Shawn https://github.com/pico-coder/sigrok-pico
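The flow chart proposed in this comment can be sketched as a small decision function. This is only an illustration in Python: the `CaptureRequest` type, the path names, and the reading of "32k" as 32768 samples are assumptions, and a request with exactly 5 level triggers is sent to the fast mode because the comment lists 5 in both the fast and complex ranges.

```python
from dataclasses import dataclass, field

MAX_BUFFERED_SAMPLES = 32 * 1024  # assumption: "32k" read as 32768 samples


@dataclass
class CaptureRequest:
    sample_count: int
    analog_channels: int = 0
    edge_triggers: dict = field(default_factory=dict)   # channel -> "rising" or "falling"
    level_triggers: dict = field(default_factory=dict)  # channel -> 0 or 1


def select_capture_path(req):
    """Pick a capture path by walking the flow chart from the top."""
    edges = len(req.edge_triggers)
    levels = len(req.level_triggers)
    # Step 1: long captures, analog channels or no trigger go to the streaming flows.
    if (req.sample_count > MAX_BUFFERED_SAMPLES
            or req.analog_channels > 0
            or edges + levels == 0):
        return "streaming"
    # Step 2: a single rising/falling channel uses the simple trigger.
    if edges == 1 and levels == 0:
        return "simple"
    # Step 3: up to 5 level-triggered channels use the fast trigger.
    if edges == 0 and 1 <= levels <= 5:
        return "fast"
    # Step 4: more than 5 and up to 16 level-triggered channels use the complex trigger.
    if edges == 0 and levels <= 16:
        return "complex"
    # Step 5: anything else falls back to the streaming flows.
    return "streaming"


print(select_capture_path(CaptureRequest(1000, edge_triggers={3: "rising"})))
```

Mixed requests, such as one edge trigger combined with level triggers, match none of steps 2 to 4 and therefore fall through to the fallback in step 5.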

pico-coder avatar Sep 25 '22 23:09 pico-coder

Hi.

It may be more complex than it looks. The biggest problem you will face is transferring the data while the PIO is capturing. I don't think you will be able to stream the data through USB directly, because the bus fabric is running at its limits. When I tried to receive data over USB while capturing with DMA, the device crashed most of the time because the bus got clogged: it's writing to memory every 2 cycles and needs to read instructions at the same time, so there is no time left for the bus fabric to serve more access requests. If you check my code, the capture is stored directly to memory and processed afterwards, and meanwhile I keep the CPU activity as low as possible. That guarantees that a 32-bit word can be written every two cycles and that nothing will delay those writes; otherwise the bus may stall and you will lose samples, making the device unreliable (in the best case; the worst would be a crash). The only safe way that comes to mind to make it work is to leave the capture as is, without messing with its DMAs, and send the data after processing, as it does right now. You can then apply the RLE encoding to maintain protocol compatibility.
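A quick back-of-the-envelope calculation shows the scale of the numbers involved. The sketch below uses the 200 MHz clock and the one-32-bit-word-every-two-cycles figure from this thread; the comparison against the raw 12 Mbit/s signalling rate of the RP2040's full-speed USB port is an addition for illustration and ignores protocol overhead. The function name is hypothetical.

```python
def capture_write_rate_bytes_per_s(sys_clk_hz, cycles_per_word=2, word_bytes=4):
    """Bytes per second written to RAM when one word is stored every cycles_per_word cycles."""
    return sys_clk_hz // cycles_per_word * word_bytes


# RP2040 USB is a full-speed port: 12 Mbit/s raw, before any protocol overhead.
USB_FULL_SPEED_BITS_PER_S = 12_000_000

capture_rate = capture_write_rate_bytes_per_s(200_000_000)  # bytes per second into RAM
usb_rate = USB_FULL_SPEED_BITS_PER_S // 8                   # bytes per second out over USB

print(capture_rate, usb_rate, round(capture_rate / usb_rate))
```

Under these assumptions the capture writes 400 MB/s into RAM while the USB link can drain at most 1.5 MB/s, which is consistent with storing the capture first and sending it afterwards.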

Next, I use all the available memory in the device, so there is very little room to add anything else. If I recall correctly, about 16 KB are left, so adding both implementations may get very hard (I'm not sure how much RAM your implementation uses). Maybe you can reuse the buffers I create, but beware: they are aligned for the DMA ring, so they can't be moved.

About overclocking: the device is already overclocked to 200 MHz. I chose that speed because it's the one at which I found all my boards to be stable. I have some boards that reach up to 320 MHz, but others top out at 200 MHz, so I used that speed to make it safe no matter which board you have. If more speed is needed, a custom PCB could be created instead of using the Pico as-is; increasing the core voltage and using a faster flash could make it reach 400 MHz, but that is out of the scope of this project (to create a powerful enough device that anyone can assemble without spending much money).

Cheers.

gusmanb avatar Oct 10 '22 12:10 gusmanb

I have been studying for a long time the possibility of developing a fast and cheap logic analyzer. My first tests were with Arduino. And then I've been studying with ESP32. I always thought about the possibility of using an external high speed static memory, like these:

  • 23LC1024 (128K x 8, 20 MHz)
  • old CY7C199 (32K x 8, 100 MHz)
  • old IS61C6416AL (64K x 16, 80 MHz)

Another crazy possibility would be to use old SDRAM DIMM cards, like 128 MB ones (3.3 V, 133 MHz). But refreshing the memory is very complex. I still haven't discarded this idea.

When I first met Raspberry Pico I was amazed by the state machines. I'm still studying and learning. But from what I'm seeing there are still some limitations.

Gustavomurta avatar Nov 21 '22 13:11 Gustavomurta

Hi.

The problem with an external static memory is that there are not enough IOs for it. For example, a 64K x 8-bit memory will require at least 26 IOs (address + data + control, with R/W and CE as the minimum), and the Pico exposes only 30 IOs. Worse, the analyzer is transferring a full 32-bit word every 2 cycles at 200 MHz, so to be fast enough the memory would need to be 32 bits wide with a 10 ns access time, which would require 50 IOs to control. That's why I only used the internal memory: it is able to transfer 32 bits of data per cycle without using any IO.
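The pin counts quoted here follow from simple arithmetic, sketched below in Python. The helper name is hypothetical, and the model assumes a plain parallel asynchronous SRAM with exactly two control lines (R/W and CE), as in the comment.

```python
def parallel_sram_pins(words, data_bits, control_pins=2):
    """IO pins needed for a parallel SRAM: address lines + data lines + control lines."""
    address_bits = (words - 1).bit_length()  # lines needed to address `words` locations
    return address_bits + data_bits + control_pins


pins_8bit = parallel_sram_pins(64 * 1024, data_bits=8)    # 16 + 8 + 2
pins_32bit = parallel_sram_pins(64 * 1024, data_bits=32)  # 16 + 32 + 2

# One 32-bit word every 2 cycles at 200 MHz is 100 million writes per second.
write_period_ns = 1e9 / (200_000_000 / 2)

print(pins_8bit, pins_32bit, write_period_ns)
```

Both results exceed or nearly exhaust the 30 IOs mentioned in the comment once the capture inputs themselves are counted, and the write period matches the 10 ns access time it asks for.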

Usually logic analyzers this fast are based on hybrid technologies, they use an FPGA to capture the data and an ARM cpu to interface with it, but precisely the idea of this analyzer is to create something simple that does not require such advanced knowledge.

BTW, If you want to use a more modern static asynchronous memory I'm using an IS61WV20488FBLL (2Mb x 8, 10ns) in another project that I still have not published and it works like a charm, it's for a very fast 8 bit computer that I'm designing based on the ez80190F CPU.

Cheers!

gusmanb avatar Nov 21 '22 16:11 gusmanb

Great job! Congratulations!

Gustavomurta avatar Nov 21 '22 18:11 Gustavomurta

> It may be more complex than it looks. The biggest problem you will face is transferring the data while the PIO is capturing. I don't think you will be able to stream the data through USB directly, because the bus fabric is running at its limits. When I tried to receive data over USB while capturing with DMA, the device crashed most of the time because the bus got clogged: it's writing to memory every 2 cycles and needs to read instructions at the same time, so there is no time left for the bus fabric to serve more access requests. If you check my code, the capture is stored directly to memory and processed afterwards, and meanwhile I keep the CPU activity as low as possible. That guarantees that a 32-bit word can be written every two cycles and that nothing will delay those writes; otherwise the bus may stall and you will lose samples, making the device unreliable (in the best case; the worst would be a crash).

Dr Gusman, have you thought about using the second processor of Rasp Pico to distribute the tasks? learnembeddedsystems.co.uk => basic-multicore-pico-example

Gustavomurta avatar Nov 22 '22 01:11 Gustavomurta

> The problem with an external static memory is that there are not enough IOs for it. For example, a 64K x 8-bit memory will require at least 26 IOs (address + data + control, with R/W and CE as the minimum), and the Pico exposes only 30 IOs. Worse, the analyzer is transferring a full 32-bit word every 2 cycles at 200 MHz, so to be fast enough the memory would need to be 32 bits wide with a 10 ns access time, which would require 50 IOs to control. That's why I only used the internal memory: it is able to transfer 32 bits of data per cycle without using any IO.

Dr Gusman, this year I bought this great thing - ESP32S3. I have done some tests.

ESP32-S3 is a dual-core Xtensa LX7 MCU, capable of running at 240 MHz. Apart from its 512 KB of internal SRAM, it also comes with integrated 2.4 GHz, 802.11 b/g/n Wi-Fi and Bluetooth 5 (LE) connectivity that provides long-range support. It has 45 programmable GPIOs and supports a rich set of peripherals. ESP32-S3 supports larger, high-speed octal SPI flash, and PSRAM with configurable data and instruction cache. My ESP32-S3-DevKitC-1 has 32 MB of flash (it can use "only" 16 MB at this time) and 8 MB of PSRAM. But it has only 36 accessible GPIOs, and I still don't know if I can use them all.

Regards.

Gustavomurta avatar Nov 22 '22 02:11 Gustavomurta

Yeah, I know how to use the two cores. In fact, I created a VGA graphics card for the 8-bit computer: it uses one core to receive commands and process graphics functions, the second core renders the framebuffer, and four PIO programs generate the VGA signal :) (here is an example if you're curious about it: https://twitter.com/gusmanb/status/1362948159333359617)

While capturing data, the cores of the Pico are basically sleeping; they do nothing. The problem is not a lack of CPU resources but a saturation of the bus fabric. The PIO processors are running at maximum speed all the time, writing 32-bit data to the FIFOs, and a ring of chained DMAs continuously transfers this data to main memory. That saturates the bus fabric.

As you can see in this diagram, everything is connected to the bus fabric, so if it gets saturated it affects everything.

[image: bus fabric diagram]

gusmanb avatar Nov 22 '22 08:11 gusmanb

Yeah, I have some WROOM and WROVER modules and personally I don't like them, not because of the hardware but because of the documentation, which is very poor, and I like to know the guts of the things I use. I usually use STMs, which are very well documented, the RP2040, PICs, ATmega, etc. In any case, any serial memory like the external ones used in the ESP modules is too slow for this. A 100 MHz memory is too slow even over quad SPI: you can transfer at most 4 bits per clock, and the logic analyzer needs to write 32 bits per cycle (and that's without even taking the SPI overhead into account), so it's not suitable. Also, ESP32 assembly timing is not deterministic: it's a pipelined processor, so you cannot be 100% sure how long an instruction will take, which may lead to incorrect sampling, and interrupts affect the code flow. That's why the PIO units are so great: they're 100% deterministic and isolated, so you can be confident that, unless you saturate the FIFOs and do a PUSH or PULL in wait mode, an instruction will always take a single cycle. Also, at 240 MHz you only have approximately two cycles to sample the GPIO and write to memory, and I'm not sure that is even possible in assembly (forget doing this in anything other than assembly; it will be too slow in comparison).

I have some STM32H750 modules which run at 480 MHz and have 1 MB of RAM. With those it is possible to implement everything, but they're a lot more expensive: a 750 module is around 30€ while the Pico is only 3€, so it is 10 times more expensive, and the idea behind this project is to keep it as cheap as possible so anyone can build it.
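The two bottlenecks described here can be put into numbers with a short Python sketch. It assumes a 100 Msps capture of 32-bit words (the rate implied elsewhere in the thread by one word every 2 cycles at 200 MHz) and ignores all SPI command and addressing overhead, so the real gap would be larger. The function name is hypothetical.

```python
def bus_bits_per_second(clock_hz, bits_per_clock):
    """Raw throughput of a bus that moves bits_per_clock bits on every clock."""
    return clock_hz * bits_per_clock


SAMPLE_RATE_HZ = 100_000_000  # assumed capture rate: one 32-bit word per sample

qspi_rate = bus_bits_per_second(100_000_000, 4)       # quad SPI memory at 100 MHz
needed_rate = bus_bits_per_second(SAMPLE_RATE_HZ, 32)  # what the capture produces
shortfall = needed_rate / qspi_rate

# CPU cycles available per sample on a 240 MHz core doing the sampling in software.
cycles_per_sample = 240_000_000 / SAMPLE_RATE_HZ

print(qspi_rate, needed_rate, shortfall, cycles_per_sample)
```

With these figures the quad SPI link is 8 times too slow, and a 240 MHz core would have 2.4 cycles per sample, in line with the "approximately two cycles" mentioned in the comment.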

Cheers!

gusmanb avatar Nov 22 '22 09:11 gusmanb

Really the Espressif documentation is flawed. Some information is missing and I have already encountered numerous errors. I thought about even helping with the correction, but I saw that there is a lot of work to be done. But I think microcontrollers have a lot of potential.

Cheers!

Gustavomurta avatar Nov 22 '22 12:11 Gustavomurta

> The problem with an external static memory is that there are not enough IOs for it. For example, a 64K x 8-bit memory will require at least 26 IOs (address + data + control, with R/W and CE as the minimum), and the Pico exposes only 30 IOs,

Rather than reading the data into the pico and then writing to external ram, what about having the level shifters connect to the ram's data bus? You really only need to access the ram sequentially, so a counter chip could be used to drive the ram's address bus. That gets rid of the need for address I/O on the pico.

I/O pins would still be needed for

  • ram data bus (to read ram content) and control pins
  • clear and increment on the counter
  • enable going to the level shifters (or to tri-state buffers if the level shifters lack tri-state output).

That should both free up pins and reduce bus contention.

mpictor avatar Sep 11 '23 03:09 mpictor

Still not feasible. If you take into account the delays between the clock for the counters and the propagation delays, you will end up with at least 30 ns of delay (and I'm being extremely optimistic; something more realistic would be around 50-80 ns), and to capture at 100 MHz you only have 10 ns. Also, you would need to reset the counters in the middle of the capture (the memory is used as a circular buffer to be able to preserve pre-trigger samples), and that would also take instructions to execute, adding even more delay when the limit of the memory is reached...

It is a lot easier to create a new analyzer based on an FPGA that uses DDR for storage. There are really cheap modules like this: it has 20k LUTs and 128Mb of DDR3 for just 25€, and with that it is feasible to create a true beast of an analyzer. In fact, I have one of these and I'm in the process of understanding the provided IPs for DDR and so on, but my health has restricted how much time I can spend on projects and I also have more projects on the go, so I haven't had enough time to learn how to use it properly...
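The circular-buffer behaviour mentioned here, where memory keeps wrapping around so that samples from before the trigger survive, can be sketched in a few lines of Python. This is a host-side illustration only: the class and function names are hypothetical, and the actual firmware does this with a DMA ring rather than a software loop.

```python
class PreTriggerBuffer:
    """Fixed-size ring that always holds the most recent `capacity` samples."""

    def __init__(self, capacity):
        self.data = [0] * capacity
        self.capacity = capacity
        self.write_index = 0
        self.count = 0

    def write(self, sample):
        self.data[self.write_index] = sample
        # Wrap around at the end of memory, overwriting the oldest sample.
        self.write_index = (self.write_index + 1) % self.capacity
        self.count = min(self.count + 1, self.capacity)

    def unroll(self):
        """Return the stored samples in capture order, oldest first."""
        start = (self.write_index - self.count) % self.capacity
        return [self.data[(start + i) % self.capacity] for i in range(self.count)]


def capture(stream, capacity, post_trigger, is_trigger):
    """Record until post_trigger samples have been stored after the trigger sample."""
    ring = PreTriggerBuffer(capacity)
    remaining = None  # None means the trigger has not fired yet
    for sample in stream:
        ring.write(sample)
        if remaining is None:
            if is_trigger(sample):
                remaining = post_trigger
        else:
            remaining -= 1
        if remaining == 0:
            break
    return ring.unroll()


result = capture(range(20), capacity=8, post_trigger=3, is_trigger=lambda s: s == 10)
print(result)
```

In this run the 8-entry buffer ends up holding samples 6 to 13: four pre-trigger samples, the trigger sample 10, and three post-trigger samples. An external counter driving the RAM address bus would have to reproduce the same wrap-around, which is the mid-capture reset the comment refers to.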

gusmanb avatar Sep 11 '23 03:09 gusmanb

I enjoyed this discussion. I have implemented pico-coder's continuous sample LA on my board. Today I'm testing gusmanb's capture and dump LA on my board. I came to the same conclusions mentioned - that the two types really can't play well in the same firmware.

Thank you both for all your work! It's been fun!

DangerousPrototypes avatar Nov 07 '23 13:11 DangerousPrototypes