avr-hal icon indicating copy to clipboard operation
avr-hal copied to clipboard

i2c zero-sized read transfer hangs

Open sp4ghet opened this issue 1 year ago • 8 comments

I'm using an Arduino Mega 2560 R3 with a MH-Z16 CO2 sensor in I2C mode

I basically just took the sample for I2Cdetect verbatim (just swapped the order of Read/Write)

#![no_std]
#![no_main]

use arduino_hal::{prelude::*, I2c};
use panic_halt as _;

const NDIR_ADDR: u8 = 0x4D;

#[arduino_hal::entry]
fn main() -> ! {
    let dp = arduino_hal::Peripherals::take().unwrap();
    let pins = arduino_hal::pins!(dp);

    let mut serial = arduino_hal::default_serial!(dp, pins, 57600);

    let mut led = pins.d13.into_output();
    let mut i2c = arduino_hal::I2c::new(
        dp.TWI,
        pins.d20.into_pull_up_input(),
        pins.d21.into_pull_up_input(),
        100000,
    );

    ufmt::uwriteln!(&mut serial, "Read direction test:\r").unwrap_infallible();
    i2c.i2cdetect(&mut serial, arduino_hal::i2c::Direction::Read)
        .unwrap_infallible();

    ufmt::uwriteln!(&mut serial, "Write direction test:\r").unwrap_infallible();
    i2c.i2cdetect(&mut serial, arduino_hal::i2c::Direction::Write)
        .unwrap_infallible();

    loop{};
}
/// serial output
Read direction test:
-    0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:       -- -- -- -- -- -- -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- 4d

It seems to run until it finds the sensor, and then it freezes. the Write direction seems to run fine, but the read direction hangs. I've tried adding some code below the i2cdetect but it doesn't appear as if any of it actually runs.

I've tried using the normal Arduino/C++ code provided by the manufacturer and it worked (connected and sent measurements over serial) so I don't think I've setup the hardware incorrectly, but please let me know if I fumbled something obvious.

I also tried just calling ping and it seemed to behave the same way (unsurprisingly..) where the ping goes through, but all following calls to i2c hang/panic(?)

    let pinged = i2c
         .ping_device(NDIR_ADDR, arduino_hal::i2c::Direction::Read)
         .unwrap();
    ufmt::uwriteln!(&mut serial, "Pinged: {}\r", if pinged { 1 } else { 0 }).unwrap();

    ufmt::uwriteln!(&mut serial, "Read direction test:\r").unwrap_infallible();
// this code doesn't run (except the first row of labels)
    i2c.i2cdetect(&mut serial, arduino_hal::i2c::Direction::Read)
        .unwrap_infallible();

sp4ghet avatar Jan 01 '25 07:01 sp4ghet

If it is necessary this is the product page (links to datasheets and sample code)

https://github.com/SandboxElectronics/NDIR/tree/master/NDIR_I2C

https://sandboxelectronics.com/?product=mh-z16-ndir-co2-sensor-with-i2cuart-5v3-3v-interface-for-arduinoraspeberry-pi

sp4ghet avatar Jan 01 '25 08:01 sp4ghet

My advice to sp4ghet:

  • Close this issue
  • cargo generate --git https://github.com/Rahix/avr-hal-template.git --name ndir_mvp
  • Keep the mvp a minimum viable product, meaning: Stay away from swapping the order of Read/Write and other let me change this now, I will need it later away
  • Check, double check and triple check the hardware setup. That implies having a schematic of the electronics. And above all: Understand all the hardware involved
  • Document your prototype further. Do it for your future self and do it for your current self that is seeking help from others.
  • Enjoy the challenge of getting to work what you envision
  • Persist
  • Smile humbly upon reaching your goal
  • Maybe publish the achievement (It is 2025, there is something called Internet, every one is a publisher.)

My advice to everyone seeking help with their project:

  • Understand the gist of http://www.catb.org/~esr/faqs/smart-questions.html
  • Aim for win win

stappersg avatar Jan 01 '25 11:01 stappersg

I've ran into this too, it looks like something in the i2c implementation breaks if you request a read but don't actually read anything.

Adding a read stops it from getting stuck.

bdew avatar May 26 '25 22:05 bdew

Adding a read stops it from getting stuck.

The intention is to not do this because doing actual reads/writes can have side-effects for the remote devices. Ideally, we just want to adress the device, check whether it responds and then abort the transaction. If this is broken right now, we have to investigate why...

Rahix avatar Jun 05 '25 05:06 Rahix

Spec "ATmega48A/PA/88A/PA/168A/PA/328/P megaAVR® Data Sheet" Rev. B – 09/2020, paragraph 22.3.5 "Combining Address and Data Packets into a Transmission":

A transmission basically consists of a START condition, a SLA+R/W, one or more data packets and a STOP condition. An empty message, consisting of a START followed by a STOP condition, is illegal.

So the current strategy cannot work by design. However, what can be done is to forcefully release the bus by clearing twcr::twen when no data is transmitted. Because the pins are pulled-up anyway, it should behave the same way as a STOP.

booti386 avatar Sep 22 '25 19:09 booti386

@booti386, thanks for reviving the discussion here. Investigating your PR lead me to search for more clues in the official standardization. Unfortunately, not everything is entirely clear.

Actually, I am not so sure if that excerpt from the AVR datasheet you are quoting is actually applicable to this situation, as per my comment here: https://github.com/Rahix/avr-hal/pull/681#issuecomment-3330026657

However, a comment over in embedded-hal lead me to question my earlier reply to @bdew: https://github.com/rust-embedded/embedded-hal/issues/570#issuecomment-2014012905 I tried looking for an official statement on this. I think the assertion in that comment is most likely correct, but it isn't really spelled out in the official specification... Here is a relevant excerpt:

Master reads slave immediately after first byte (see Figure 12). At the moment of the first acknowledge, the master-transmitter becomes a master-receiver and the slave-receiver becomes a slave-transmitter. This first acknowledge is still generated by the slave. The master generates subsequent acknowledges. The STOP condition is generated by the master, which sends a not-acknowledge (A) just before the STOP condition.

Image

Importantly, in Figure 12, we can see that the initial acknowledge and the first byte of data immediately following it are sent by the slave after initiating a read transmission. In my understanding, there is no way for the master to prevent this first byte of data. All it can do is fully clocking it out and then sending a NACK afterwards.

So... maybe doing the read as suggested by @bdew is the only way to handle this? I guess we cannot fully prevent the side-effects of reading in any case then, so the next best thing is to stick to the bus standard...

However, this has additional implications beyond just i2cdetect()/ping_device() use-cases, as #681 shows. We can also run into the same illegal situation when a driver invokes a zero-length read transaction. So we need to fix these cases at the same time...

Rahix avatar Sep 24 '25 18:09 Rahix

Ok, so i double-checked everything in details, and i think that you're right: if we want to be compatible with every i2c device that comply with the spec, then we shouldn't be able to do read/write transactions without data, even after a successful SLA+R/W.

About the statement i quoted :

An empty message, consisting of a START followed by a STOP condition, is illegal. I believe that it should be considered as an error condition (compliant with i2c spec) that is handled explicitly by the MCU.

However, there are other error conditions that are not handled by the MCU and that leave it in an invalid state. This issue is probably one of them.

I carried out a few tests on the hardware to assert this hypothesis. Below is my (ugly) test code:

    let mut twi = dp.TWI;

    println!("\nNormal transaction:");

    println!("\nSTART");
    twi.raw_start(0x24, arduino_hal::i2c::Direction::Read).unwrap();
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    println!("\nREAD");
    twi.raw_read(&mut [0]).unwrap();
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    println!("\nSTOP");
    twi.raw_stop().unwrap();
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    println!("\nInvalid transaction:");

    println!("\nSTART");
    twi.raw_start(0x24, arduino_hal::i2c::Direction::Read).unwrap();
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    println!("\nSTOP");
    twi.raw_stop().unwrap();
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    println!("\nState check:");

    println!("\nSTART (minimal)");
    // Write start condition
    twi.twcr
        .write(|w| w.twen().set_bit().twint().set_bit().twsta().set_bit());
    // The following line will loop forever if uncommented.
    //while twi.twcr.read().twint().bit_is_clear() {}
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    println!("\nSTOP");
    twi.raw_stop().unwrap();
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

    arduino_hal::delay_ms(1);
    println!(" after 1ms:");
    println!("twcr = {:02x}", twi.twcr.read().bits());
    println!("twsr = {:02x}", twi.twsr.read().tws().bits() << 3);

Here is the output:

Normal transaction:

START
twcr = 84
twsr = 40
 after 1ms:
twcr = 84
twsr = 40

READ
twcr = 84
twsr = 58
 after 1ms:
twcr = 84
twsr = 58

STOP
twcr = 14
twsr = f8
 after 1ms:
twcr = 04
twsr = f8

Invalid transaction:

START
twcr = 84
twsr = 40
 after 1ms:
twcr = 84
twsr = 40

STOP
twcr = 14
twsr = f8
 after 1ms:
twcr = 14
twsr = f8

State check:

START (minimal)
twcr = 24
twsr = f8
 after 1ms:
twcr = 24
twsr = f8

STOP
twcr = 14
twsr = f8
 after 1ms:
twcr = 14
twsr = f8

We can compare the STOP status between the normal and invalid transaction. For the normal transaction, the twcr.twsto bit (40) is cleared after a delay (as it should: "When the STOP condition is executed on the bus, the TWSTO bit is cleared automatically."). However in the invalid transaction case, this bit never gets reset. Then the i2c engine becomes unresponsive, as the following START never sets the twcr.twint bit (80), and the final STOP never clears the twcr.twsto bit. And the only way to unstuck it seems to be a reset by clearing twcr.twen (04).

My conclusion is the same as yours,that we should prevent this broken behavior from happening, by rejecting transactions without data. I'll also close my PR.

But maybe it could also make sense to detect a broken state, e.g. by checking that twcr.twsto is not set before sending a new START.

booti386 avatar Sep 25 '25 08:09 booti386

But I wonder about the impl of the new I2c::transaction. The contract defined upstream does not forbid empty data buffers. So we could potentially hit the same illegal situation here (L452). I do not think the forced stop is really appropriate, unless this is really the behavior embedded-hal wants here. But silently not doing the whole transaction because it is empty does not look better either. What are your thoughts?

Even if the contract does not require full non-empty data buffers, it still requires that the transaction is performed or to return an error. As there is no proper/meaningful way for us to handle the transaction in this case, we should return an error. We should also check upfront that all the buffers passed to I2c::transaction() are not empty, and return an error before transmitting anything (avoid cancelling in the middle of the transaction), i guess?

I think there is more nuance here: When you chain multiple reads, it is quite fine when there is one in there with a zero-sized buffer. Allowing that can make downstream-code more robust and easy to write for dynamic transfer lengths.

I think a good starting point is to also check what other HAL crates are doing with this. I poked the upstream issue in e-h about the topic as well so let's see what they have to say.


In any case, I'll merge #665 now to resolve the immediate issue in ping_device() for starters.

Rahix avatar Sep 25 '25 09:09 Rahix