rosflight_firmware icon indicating copy to clipboard operation
rosflight_firmware copied to clipboard

F4 External I2C Bug

Open superjax opened this issue 5 years ago • 16 comments

I just found a bug over the weekend.

TL;DR if the I2C tries to contact a device that isn't there, it throws a fit and won't talk any more. That's why the onboard I2C works (those devices are always there) and why the external I2C seems buggy.

Long Version: The STM32F4 I2C peripheral does not report an ACK. It throws an interrupt on an NACK, but (probably for performance reasons), will let you continue on writing bytes to an absent device without any error indication even if the device isn't there.

I think this is the reason I had never seen a pure "polling" driver from STM. All the blocking implementations are still interrupt-based, and I think it's because you need to be able to handle the ACKNOWLEDGE_FAILURE interrupt, which can come after you've already started sending bytes to the device.

Anyway, upon close inspection of the datasheet, it tells me that i have to make sure to send a STOP condition if I get one of these interrupts and reset the peripheral. I'm not looking for the interrupt in my blocking I2C::write function so I end up leaving the peripheral in a bad state, and the SDA and SCL lines just remain low all the time, even though from a software perspective, things look like they are working.

This is a priority 1 bug, because a lot of people have been having poor performance from the airspeed. #140 #305. I'll work on it as soon as I can, probably sometime this week.

superjax avatar Jun 24 '19 16:06 superjax

Funny that this came up this weekend.

I also spent some time this weekend with the revo's I2C. I noticed the same thing about AF not being set until after you send some bytes on the bus.

https://github.com/plusk01/airdamon_f3/issues/11

At least I got to use a goto...

plusk01 avatar Jun 24 '19 17:06 plusk01

Nice! I've never used goto. haha!

Yeah, I played with it a bit, and it looks like AF will get sent either by trying to send more bytes, or if you wait 25 milliseconds. Which is like 100 years in microcontroller land... I would really love a "got an ACK" status bit. (Maybe I could read the GPIO line?).

Is the airdamon driver working nicely? I might just start there in my re-write of the write function.

superjax avatar Jun 24 '19 17:06 superjax

Yeah, it seems to be working okay---I started bringing up I2C and testing with the baro ... but man, it is such a lame sensor to work with (the whole conversion and waiting for ADC before you read).

A couple of thoughts:

  • airdamon I2C works, but is currently blocking.

  • I would like to understand the need for / get rid of the write(0, 0, 0) call. I noticed that in a few places in airbourne as well.

  • It would be nice to make the I2C driver more flexible. I was actually setting up the I2C so I could use it as a slave to a Snapdragon flight board, translating from I2C to PWM for ESCs (if you have any strong opinions on doing that, I'd love to hear them). Anyways, I was thinking of an interface that did things like:

    • begin(uint8_t address)
    • write(uint8_t byte)
    • end(bool async, callback)
    • readFrom(uint8_t address, bool blocking, callback)
    • read() / available()

    which basically seems like Arduino's Wire interface. These calls (+/-) would cover the four I2C use cases: master tx/rx, slave tx/rx.

  • Also, I should probably compare notes with you more often... In some ways it would be nice to get a nice custom library with a cool HAL for STM32 chips; but in other ways I wonder why not just use an existing library like libopencm3?

plusk01 avatar Jun 24 '19 17:06 plusk01

I had no idea about libopencm3. That looks really cool! Seems like the sort of thing we could contribute to even! Maybe I'm wrong, but I just read through their i2c driver and I don't think it's asynchronous.

After I stop being so busy, maybe I can reach out to those guys and propose an asychronous API?

My biggest reason for maintaining my own driver layer is for asynchronous communication. At the time, I couldn't find anyone doing DMA at all, and there were only a few asynchronous applications which were written for a specific sensor. Perhaps that's changed, but that was the main reason.

I am not opposed to the Wire-way of writing doing the blocking I/O configuration. However, it seems to make things more difficult in an asynchronous system. I am pretty confident/happy with the bulk i2c read in airbourne, and I like that it's DMA, however maybe moving towards your API would be better?

I'll try it out in my re-write.

superjax avatar Jun 24 '19 18:06 superjax

I agree that async I2C (and everything else) is the way to go.

I don't know too much about libopencm3, but I think the idea is that it is a better std periph lib. So, if you want async comms, you have to wrangle them yourself. So, maybe what I'm saying is that there is still a need for an airbourne driver layer for flight controllers, but we should leverage libopencm3 instead of std periph to give us chip abstraction out-of-the-box. Something to look into.

As far as the Wire API, I think the goodness is in the interface, not necessary the implementation. I don't think blocking is the right thing to do here. Instead of airbourne's way of prescribing the types of I2C transmission that can be made (write(addr, reg, data)), the API is split into divisible I2C chunks (i.e., master START, send bytes, read bytes, master STOP).

plusk01 avatar Jun 24 '19 18:06 plusk01

I guess another consideration is LGPLv3 (libopencm3) compatibility with BSD-3. I think it's okay as long as your linking to it as a library ... but I am always confused with licensing.

plusk01 avatar Jun 24 '19 18:06 plusk01

Okay, so, I tried the Wire API last night, and there is one big thing that I don't love:

While the blocking methods make sense, there is a lot of logic that goes into an I2C transfer, which means tying together a bunch of callbacks if you want to do something truly asynchronously.

Here is the logic for the bulk read. Each -> is an interrupt.

addr+write -> register -> addr+read -> DMA -> Stop ⌍ 
                                                   ⌎>callback ->

However, if we do the Wire Api we have to instead do

add+write ⌍      ⌌> register ⌍      ⌌> add+read ⌍      ⌌>  DMA -> Stop ⌍
          ⌎> cb1 ⌏           ⌎> cb2 ⌏           ⌎> cb3 ⌏               ⌎> cb4 ->

So, you now have 4 callbacks instead of 1, and the first three are basically trampolines that will get repeated for the vast majority of sensors. I don't really love that, it's a lot of trampolines.

Do you have any good solution? I can see the advantage of using a familiar API, but that's a lot of misdirection in the code that is simplified a lot by I2C::read(addr, reg, data, len, callback)

superjax avatar Jun 25 '19 15:06 superjax

One solution:

In the F1 driver, I ended up building a circular buffer of I2C jobs (because the duty cycle was really close to 100%) and API actually stacked jobs onto this queue while the driver pulled them off. If this were our API in this case, the jobs would be a lot simpler, but the trampolines would be handled for us.

i2c.begin(addr, WRITE, nullptr);
i2c.write(reg, nullptr);
i2c.begin(addr, READ, nullptr);
i2c.read(reg, data, len, callback);

these would all whip through super fast, because all they do is add jobs to the queue, and the driver would simply call the next one as it becomes available, finishing on the last job with the callback

That might be a nice way to handle it. What do you think?

superjax avatar Jun 25 '19 15:06 superjax

ha i had the same realization about all the misdirection last night. I was wondering if something like an i2c job queue would be useful and I like the way you described it. I'll see if I can try it out, too.

plusk01 avatar Jun 25 '19 15:06 plusk01

Okay, so @dpkoch and I talked about this a little bit.

The impetus for this is trying to support the TFMini-I2c laser altimeter. It has a very non-standard API, which is annoying.

what do you think of this? In the driver, we have a circular buffer of tasks.

enum {
  START,
  WRITE_MODE,
  READ_MODE,
  WRITE,
  READ,
  STOP,
} task_type_t;

struct {
  task_type_t task;
  uint8_t addr;
  uint8_t* data;
  size_t len;
  void (*cb)(uint8_t);
} task_t

task_t tasks_[BUFFER_SIZE];

Then, you'd queue up the jobs like so they show up like this (here is an example of what we would have to do for the TFMini)

| task       | addr | data* | len | cb |
|------------|------|-------|-----|----|
| START      |      |       |     |    |
| WRITE_MODE | addr |       |     |    |
| WRITE      |      | data* | len |    |
| START      |      |       |     |    |
| READ_MODE  | addr |       |     |    |
| READ       |      | data* | len |    |
| STOP       |      |       |     | cb |

And the event_handler would now be responsible for making sure that the last interrupt matches the last task, then performing the next task.

If there is ever an error, then we have to zoom up to the next STOP, send the STOP and call the callback.

I think this makes the API very flexible, and still asynchronous! DMA is still cool because reading and writing are done in bulk.

superjax avatar Jun 25 '19 23:06 superjax

that seems pretty reasonable. I haven't had a real chance to look at this much yet, but two questions:

  1. What happens if two sensors use the same i2c object to perform some task one right after another. Could that mess up the job buffer?
  2. For an multi-byte write, how are you planning on managing data?
    void do_the_thing()
    {
       i2c_.add_task(START);
       i2c_.add_task(WRITE_MODE, slave_addr);
       uint8_t data[2] = { 0x01, 0x02 };
       i2c_.add_task(WRITE, data, 2);
       i2c_.add_task(STOP, nullptr, 0, cb_);
    } // data is now out of scope
    

unless each task has a { uint8_t tx_data[LEN] } it gets copied to? Or it gets copied to some write buffer that all tasks share? that could be tricky, but would prevent having the struct have separate tx and rx data fields / having fixed-length buffers allocated for every task (which seems wasteful).

plusk01 avatar Jun 26 '19 16:06 plusk01

So, I actually worked out some of those details while I implemented this last night. In my solution, the I2C class has an internal write_buffer_ which holds the data-to-be-written. There is some funny logic to make sure that stays circular when you're writing, but it seems to work.

As for two sensors at the same time. I think that it can be handled in two ways. The first, and easiest, is to block or return BUSY until the bus is free. The second is to make sure that everything is safe. We don't acutally have multiple threads, so as long as we never queue a job from an interrupt, we should never have any issues.

I have to say, I really, really like this new implementation.

The new I2C driver is now on the I2C2 branch of airbourne_f4 in the i2c2 namespace. So far, the barometer and laser altimeter are working, and I'm actually blocking if a sensor tries to write a job while another is in progress. I'll work on the queue thing later.

superjax avatar Jun 26 '19 22:06 superjax

What is the status on this fix? The AUVSI-SUAS team is having issues with this now. Is there a good workaround, or a mostly working branch?

BillThePlatypus avatar Oct 21 '19 20:10 BillThePlatypus

@dpkoch and @mmmfarrell probably know best.

I will have some time this week to push on this too if I know what needs work.

superjax avatar Oct 21 '19 22:10 superjax

@BillThePlatypus There isn't a working branch or anything yet. I started to do some work off of the I2C2_dan branch of airbourne_f4, but it still needs a decent amount of work. If you have some time to work on this, I can walk you through what I understand about the new i2c driver

mmmfarrell avatar Oct 22 '19 14:10 mmmfarrell

Has anyone checked if the errata for the F405 i2c peripheral contribute to this issue? I have no idea what the status of this is, but I just ran into a errata-based issue the other day that made me think of this. ¯_(ツ)_/¯

len0rd avatar Feb 11 '20 16:02 len0rd

Going forward we'll be switching to H7 processors with new drivers, so I'm going to close this issue as I don't expect many people to use F4s at this point.

bsutherland333 avatar Oct 04 '23 20:10 bsutherland333