Issues with I2C deadlocks

Open addshore opened this issue 11 months ago • 8 comments

Case 1 & 2: I2C device has been communicated with before, but stops part way through operations for a period of time

Potential dead-lock detected:
  Process: 3
  Program: 0f3f261e-541b-c138-fef7-199b44a7a1dc
  BCI: 0x3db
  Primitive: 6:2
fatal: Potential dead-lock

abort() was called at PC 0x4202e367 on core 0
Core  0 register dump:
MEPC    : 0x40806520  RA      : 0x408142ae  SP      : 0x4082fd90  GP      : 0x40822e84
TP      : 0x4082ffe0  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130
S0/FP   : 0x4082fdcc  S1      : 0x4082fdcc  A0      : 0x4082fdcc  A1      : 0x4082fdae
A2      : 0x00000000  A3      : 0x4082fdf9  A4      : 0x00000001  A5      : 0x4082b000
A6      : 0x00000000  A7      : 0x76757473  S2      : 0x4082fdb0  S3      : 0x00000001
S4      : 0x027e1356  S5      : 0x00000005  S6      : 0x40836140  S7      : 0x004c4b40
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000
MHARTID : 0x00000000
Potential dead-lock detected:
  Process: 3
  Program: 8ba5f2ca-8927-4edc-5056-5ba029f2eb90
  BCI: 0x3e5
  Primitive: 6:7
fatal: Potential dead-lock

abort() was called at PC 0x4202e367 on core 0
Core  0 register dump:
MEPC    : 0x40806520  RA      : 0x408142ae  SP      : 0x4082fd90  GP      : 0x40822e84
TP      : 0x4082ffe0  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130
S0/FP   : 0x4082fdcc  S1      : 0x4082fdcc  A0      : 0x4082fdcc  A1      : 0x4082fdae
A2      : 0x00000000  A3      : 0x4082fdf9  A4      : 0x00000001  A5      : 0x4082b000
A6      : 0x00000000  A7      : 0x76757473  S2      : 0x4082fdb0  S3      : 0x00000002
S4      : 0x02c49e66  S5      : 0x00000005  S6      : 0x40836208  S7      : 0x004c4b40
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000
MHARTID : 0x00000000

Case 3: I2C device I am talking to will not respond to anything.

[jaguar] INFO: program 9d8d7d0a-1109-640c-b897-53a2b2f25735 starPotential dead-lock detected:
  Process: 3
  Program: 9d8d7d0a-1109-640c-b897-53a2b2f25735
  BCI: 0x377a
  Primitive: 6:4
fatal: Potential dead-lock

abort() was called at PC 0x4202e367 on core 0
Core  0 register dump:
MEPC    : 0x40806520  RA      : 0x408142ae  SP      : 0x4082fd90  GP      : 0x40822e84
TP      : 0x4082ffe0  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130
S0/FP   : 0x4082fdcc  S1      : 0x4082fdcc  A0      : 0x4082fdcc  A1      : 0x4082fdae
A2      : 0x00000000  A3      : 0x4082fdf9  A4      : 0x00000001  A5      : 0x4082b000
A6      : 0x00000000  A7      : 0x76757473  S2      : 0x4082fdb0  S3      : 0x00000001
S4      : 0x021b3f67  S5      : 0x00000005  S6      : 0x40836140  S7      : 0x004c4b40
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000
MHARTID : 0x00000000

addshore avatar Jan 30 '25 12:01 addshore

Normally, primitive operations are not allowed to block for long periods. The I2C operation is currently blocking, which usually isn't a problem, since the operation shouldn't take long. However, if the transaction takes too long (for example because of a bad device or clock stretching), a watchdog triggers and reboots the device.

We need to make the i2c operation asynchronous to fix this. I already need to update the driver to use the newer ESP-IDF APIs. When doing so, I will try to change it so it isn't blocking anymore.
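
The difference between the two approaches can be sketched with a toy model. This is not the Toit VM's actual watchdog logic: the tick budget and the function names are made up for illustration, and the only assumption is that the watchdog measures how long a single primitive call holds the interpreter.

```python
WATCHDOG_LIMIT_TICKS = 5  # hypothetical budget for a single primitive call


def blocking_steps(transaction_ticks):
    """One primitive call that holds the interpreter for the whole transfer."""
    return [transaction_ticks]


def async_steps(transaction_ticks):
    """One short call to start the transfer, then one short poll per tick."""
    return [1] * (transaction_ticks + 1)


def watchdog_trips(step_ticks, limit=WATCHDOG_LIMIT_TICKS):
    """The watchdog fires if any single call exceeds the budget."""
    return any(ticks > limit for ticks in step_ticks)
```

In this model a stretched 20-tick transfer trips the watchdog when it runs as one blocking call, but not when it is split into a start call and short polls, because no individual call exceeds the budget.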

floitsch avatar Jan 30 '25 12:01 floitsch

Another one today, and I noticed it's Primitive: 6:5 instead, so I figured dumping it here might be useful?

Potential dead-lock detected:
  Process: 8
  Program: bae9bfbc-56ff-a982-a636-e0d62c461578
  BCI: 0x3c8d
  Primitive: 6:5
fatal: Potential dead-lock

abort() was called at PC 0x4202e001 on core 0
Core  0 register dump:
MEPC    : 0x408065de  RA      : 0x408145e8  SP      : 0x4082fd90  GP      : 0x40823204
TP      : 0x4082ffe0  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130
S0/FP   : 0x4082fdcc  S1      : 0x4082fdcc  A0      : 0x4082fdcc  A1      : 0x4082fdae
A2      : 0x00000000  A3      : 0x4082fdf9  A4      : 0x00000001  A5      : 0x4082c000
A6      : 0x00000000  A7      : 0x76757473  S2      : 0x4082fdb0  S3      : 0x00000002
S4      : 0x5ce7e162  S5      : 0x00000005  S6      : 0x408316f0  S7      : 0x004c4b40
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000
MHARTID : 0x00000000

addshore avatar Feb 19 '25 16:02 addshore

6:5 is I2C-read. Almost certainly the same cause: the device pulls the clock low and delays the I2C transaction, so we run into a timeout.

floitsch avatar Feb 19 '25 16:02 floitsch

So after the recent updates, we are now seeing 6:7, 6:8 and 6:9 causing deadlocks. We are still trying to track down the root issue on our side, and it's nice to see that some of the I2C error situations are handled better now.

What are 6:8 and 6:9? Not sure that they came up further up this thread. (Will be looking into this more tomorrow)

addshore avatar Apr 14 '25 15:04 addshore

6:7, 6:8 and 6:9 are now "write", "read" and "write_read".
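
For anyone triaging these reports, a small helper can pull the primitive out of a dead-lock report and name it. This is only a sketch: the table contains just the three names given here for the current numbering, it assumes module 6 is the I2C module, and `describe_primitive` is a hypothetical helper, not part of Toit or Jaguar.

```python
import re

# Names as stated in this thread for the current numbering.
# Assumption: module 6 is the I2C module.
I2C_MODULE = 6
I2C_PRIMITIVES = {7: "write", 8: "read", 9: "write_read"}

PRIMITIVE_RE = re.compile(r"^\s*Primitive:\s*(\d+):(\d+)\s*$", re.MULTILINE)


def describe_primitive(report):
    """Return a readable name for the primitive in a dead-lock report."""
    match = PRIMITIVE_RE.search(report)
    if match is None:
        return None  # no "Primitive: M:N" line in the text
    module, index = int(match.group(1)), int(match.group(2))
    if module != I2C_MODULE:
        return f"{module}:{index} (not in the I2C table)"
    name = I2C_PRIMITIVES.get(index, "unknown")
    return f"{module}:{index} i2c {name}"
```

Older reports in this thread use a different numbering (6:5 was read before the update), so the table only applies to builds that include the renumbering.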

I haven't been able to convert the I2C to asynchronous operation yet, so this is probably still related to the same issue as before. I did add a timeout to the operations, but maybe the esp-idf isn't respecting that timeout.

I still have plans to revisit this, but I would like to upgrade to the latest esp-idf version first, as they have improved i2c-slave support (which makes testing easier).
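
As an illustration of what a timeout around a transfer is meant to achieve, here is a minimal sketch using Python's `asyncio` rather than the ESP-IDF driver. `fake_i2c_read` is a hypothetical stand-in that models a stalled device as a long sleep; nothing here reflects the real driver code.

```python
import asyncio


async def fake_i2c_read(stall_s):
    """Hypothetical transfer; a stalled device is modelled as a long sleep."""
    await asyncio.sleep(stall_s)
    return b"\x2a"


async def read_with_timeout(stall_s, timeout_s):
    """Return the bytes read, or None if the transfer missed its deadline."""
    try:
        return await asyncio.wait_for(fake_i2c_read(stall_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None
```

The timeout only works here because `asyncio.sleep` can be cancelled. If the layer underneath never gives up control, a deadline passed from above has no effect, which is the concern about the esp-idf not respecting the timeout.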

floitsch avatar Apr 14 '25 15:04 floitsch

Ahh gotcha, I thought the last toit bumps came with the latest esp-idf, as I saw a large difference in behaviour, but it looks like I still have that to look forward to!

> I did add a timeout to the operations, but maybe the esp-idf isn't respecting that timeout.

Some of the timeouts certainly seem to be working, and I guess that is why I'm getting different errors now, often starting with i2c.master.

addshore avatar Apr 14 '25 15:04 addshore

It comes with an esp-idf update, but not yet to the latest esp-idf.

floitsch avatar Apr 14 '25 15:04 floitsch

I haven't had one of these in a while, but I'm assuming Primitive: 6:7 still belongs to this ticket.

esp32c6 v2.0.0-alpha.188

Potential deadlock detected:
  Process: 20
  Program: 7505b566-a5bc-4c9a-dc8d-35d69e510cc1
  BCI: 0x3cf7
  Primitive: 6:7
fatal: Potential dead-lock

abort() was called at PC 0x4202f62f on core 0
Core  0 register dump:
MEPC    : 0x40806b76  RA      : 0x4081546e  SP      : 0x40831da0  GP      : 0x40824824
TP      : 0x40831fe0  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130
S0/FP   : 0x40831ddc  S1      : 0x40831ddc  A0      : 0x40831ddc  A1      : 0x40831dbe
A2      : 0x00000000  A3      : 0x40831e09  A4      : 0x00000001  A5      : 0x4082d000
A6      : 0x00000000  A7      : 0x76757473  S2      : 0x40831dc0  S3      : 0x00000002
S4      : 0x004c4b40  S5      : 0x00000005  S6      : 0xff4bfeec  S7      : 0x408332d8
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000002  MTVAL   : 0x00000000
MHARTID : 0x00000000

Stack memory:
40831da0: 0xff4bfeec 0xffffffff 0x40831ddc 0x40819f5a 0x0000000a 0x00000000 0x4087c3bc 0x40030030
40831dc0: 0x32303234 0x66323666 0x004c4b00 0x408275f8 0x40831dc0 0x40827614 0x40831dbc 0x726f6261
40831de0: 0x20292874 0x20736177 0x6c6c6163 0x61206465 0x43502074 0x34783020 0x66323032 0x20663236
40831e00: 0x63206e6f 0x2065726f 0x00000030 0x42110000 0x40831e96 0x423ae000 0x42132654 0x4202f632
40831e20: 0x00000000 0x4082f908 0x4087c3bc 0x40831e44 0x00000000 0x423ae000 0x423b7757 0x4202e9ce
40831e40: 0x0000001d 0x40831a88 0x00000000 0x00000001 0x00000000 0x00000000 0x00000001 0x000035d6
40831e60: 0x9e510cc1 0x00400000 0xffffffff 0x00000000 0x40831ed0 0x4083d998 0x40831eb0 0x00000007
40831e80: 0x00000006 0x00004c9a 0x0000a5bc 0x7505b566 0x00000000 0x00070000 0x35303537 0x36363562
40831ea0: 0x6235612d 0x63342d63 0x642d6139 0x2d643863 0x36643533 0x31356539 0x31636330 0x00000000
40831ec0: 0x40831f44 0x00000001 0x40833168 0x00000000 0x00000000 0x4083d998 0x408331e8 0x40831f80
40831ee0: 0x40831f44 0x4083db10 0x40833168 0x4202eb4a 0xffffffff 0x00000000 0x00000000 0x00000000
40831f00: 0x00000000 0x4084b128 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x40832c48
40831f20: 0x00000000 0x40831f80 0x40833168 0x4202ec30 0x408334f8 0x00000000 0x00000000 0x42158a18
40831f40: 0x00000000 0x40833200 0x00000000 0x40832c00 0x00000000 0x00000000 0x00000001 0x4202f48c
40831f60: 0x4087f564 0x4087d000 0x00000002 0x42158a18 0x00000000 0x40833168 0x408334f8 0x40833508
40831f80: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
40831fa0: 0x00000000 0x00000000 0x00000000 0x42126c10 0x00000000 0x00000000 0x00000000 0x00000000
40831fc0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40831fe0: 0x4082ffd8 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832000: 0x40833000 0x4087c000 0x00080000 0x00000000 0x08080008 0x00000000 0xa5a5a5a5 0xa5a5a5a5
40832020: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832040: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832060: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832080: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
408320a0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
408320c0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
408320e0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832100: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832120: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832140: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832160: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
40832180: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5



ELF file SHA256: 03574d58de75be9f

Rebooting...
ESP-ROM:esp32c6-20220919
Build:Sep 19 2022

addshore avatar Sep 25 '25 17:09 addshore