WDT Resets in station mode

Open acoulson2000 opened this issue 5 years ago • 16 comments

After frying the power on my Wemos D1, which was running my modified build just fine, and also losing my build environment after a hard drive upgrade, I rebuilt the whole "base" build environment and flashed a new Wemos. It works fine in AP mode, but when I switch to Station mode it connects to the AP for a few seconds and then performs a WDT reset (serial output below). Any thoughts on what's going on here?

Opmode: 1
Station mode: "GL-MT300N" (bssid_set:0)
Loading Settings: af / 0 / 69 / 69
Settings Loaded: ESP_31E5FC / Default
RST REASON: 0
sleep enable,type: 2
mode : sta(84:f3:eb:31:e5:fc)
add if0
scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 1
cnt 

connected with GL-MT300N, channel 1
dhcp client start...
ip:192.168.8.124,mask:255.255.255.0,gw:192.168.8.1
IGMP Joining: 7c08a8c0 fb0000e0
STAT: 5
IP: 192.168.8.124
NM: 255.255.255.0
GW: 192.168.8.1
WCFG: /GL-MT300N/
IGMP Joining: 7c08a8c0 fb0000e0
Fatal exception 0(IllegalInstructionCause):
epc1=0x4023b20c, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000
⸮
 ets Jan  8 2013,rst cause:4, boot mode:(3,6)

wdt reset
load 0x40100000, len 29384, room 16 
tail 8
chksum 0x74
load 0x3ffe8000, len 1304, room 0 
tail 8
chksum 0xcb
load 0x3ffe8520, len 3524, room 0 
tail 4
chksum 0x84
csum 0x84
<gobledygoop>
Opmode: 1
Station mode: "GL-MT300N" (bssid_set:0)
...rinse..repeat...

acoulson2000 avatar Feb 02 '19 19:02 acoulson2000

@aefeinstein were you running into an issue like this when you were porting the ESP82xx stuff?

cnlohr avatar Feb 03 '19 21:02 cnlohr

No, but I never attempted using station mode.

AEFeinstein avatar Feb 03 '19 21:02 AEFeinstein

@acoulson2000 did you compile this yourself? If so can you examine the program.lst and see where 0x4023b20c falls?

cnlohr avatar Feb 04 '19 03:02 cnlohr

I am having the same issue. The exception occurs at almost the same address, 0x4023b208. I just built the .lst file, and here is what I have:

4023b205: 000000 ill

4023b208 <read_sar_dout>:
4023b208: aa6791 l32r a9, 40225ba4 <system_restart_hook+0x10>
4023b20b: a2ee81 l32r a8, 40223dc4 <strdup+0x44>
4023b20e: 0b0c movi.n a11, 0

lightsarefun avatar Feb 04 '19 03:02 lightsarefun
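
Mapping epc1 to a function by hand amounts to finding the last symbol whose start address is at or below the faulting address. A minimal sketch in C, assuming a tiny hand-built symbol table: the three start addresses are derived from the labels in the listing above (read_sar_dout directly, system_restart_hook and strdup from their +0x10 and +0x44 offsets), while the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record; a real table would come from the .lst or map file. */
struct symbol {
    uint32_t start;
    const char *name;
};

/* Start addresses derived from the labels in the listing, sorted ascending. */
static const struct symbol symbols[] = {
    { 0x40223d80u, "strdup" },              /* 0x40223dc4 - 0x44 */
    { 0x40225b94u, "system_restart_hook" }, /* 0x40225ba4 - 0x10 */
    { 0x4023b208u, "read_sar_dout" },
};

/* Return the symbol with the highest start address <= pc,
 * or NULL if pc lies below every entry. The table has no sizes,
 * so any address above the last entry also maps to the last entry. */
static const struct symbol *find_symbol(uint32_t pc)
{
    const struct symbol *best = NULL;
    for (size_t i = 0; i < sizeof symbols / sizeof symbols[0]; i++) {
        if (symbols[i].start <= pc)
            best = &symbols[i];
    }
    return best;
}

/* Byte offset of pc from the start of its containing symbol. */
static uint32_t symbol_offset(uint32_t pc)
{
    const struct symbol *s = find_symbol(pc);
    assert(s != NULL); /* caller must pass an address covered by the table */
    return pc - s->start;
}
```

With this table, find_symbol(0x4023b20c) resolves to read_sar_dout and symbol_offset(0x4023b20c) is 4, which matches reading the listing by eye.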

How do I build a .lst file? I can try it in the morning. I'll also look at whether there are any recent commits. My working version was forked back around October, I think.

acoulson2000 avatar Feb 04 '19 04:02 acoulson2000

@acoulson2000 Run make debug. I get a file called image.lst.

lightsarefun avatar Feb 04 '19 04:02 lightsarefun

Hmm.. This seems like the issue where EnterCritical and ExitCritical are incorrectly being called. Can you verify that if the device is connecting, it calls the EnterCritical function? Just put a printf( "EnterCritical\n" ); and printf( "ExitCritical\n" ); in there.

Sorry I'm not in a position where I can test this myself.

cnlohr avatar Feb 04 '19 04:02 cnlohr
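
The printf check suggested above can be extended to track nesting, so that an ExitCritical without a matching EnterCritical stands out in the serial log. A minimal host-side sketch, assuming the real EnterCritical and ExitCritical bodies are called where the comments indicate; the traced_* wrappers and both counters are hypothetical names, not part of esp82xx.

```c
#include <assert.h>
#include <stdio.h>

static int critical_depth;   /* current nesting level of enter calls */
static int unmatched_exits;  /* exit calls seen while the depth was already 0 */

/* Hypothetical wrapper: bump the nesting depth and log it. */
static void traced_enter_critical(void)
{
    assert(critical_depth >= 0); /* the depth is never allowed to go negative */
    critical_depth++;
    printf("EnterCritical (depth %d)\n", critical_depth);
    /* the real EnterCritical() would be called here */
}

/* Hypothetical wrapper: log the exit and count exits with no matching enter. */
static void traced_exit_critical(void)
{
    if (critical_depth == 0) {
        unmatched_exits++;
        printf("ExitCritical with no matching EnterCritical (%d so far)\n",
               unmatched_exits);
    } else {
        critical_depth--;
        printf("ExitCritical (depth %d)\n", critical_depth);
    }
    /* the real ExitCritical() would be called here */
}
```

One enter followed by three exits leaves critical_depth at 0 and unmatched_exits at 2.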

@cnlohr Looks like ExitCritical is being called, but shouldn't there be an EnterCritical prior to it?

Opmode: 1
Station mode: "race2" (bssid_set:0)
EnterCritical.
Loading Settings: af / 0 / 69 / 69
Settings Loaded: ESP_867E81 / Default
ExitCritical.
RST REASON: 0
sleep enable,type: 2
mode : sta(84:0d:8e:86:7e:81)
add if0
ExitCritical.
scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 1
cnt

connected with race2, channel 1
dhcp client start...
ip:192.168.0.15,mask:255.255.255.0,gw:192.168.0.1
IGMP Joining: 0f00a8c0 fb0000e0
STAT: 5
IP: 192.168.0.15
NM: 255.255.255.0
GW: 192.168.0.1
WCFG: /race2/
ExitCritical.
IGMP Joining: 0f00a8c0 fb0000e0
Fatal exception 0(IllegalInstructionCause):
epc1=0x4023b214, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000
⸮
ets Jan 8 2013,rst cause:4, boot mode:(3,6)

wdt reset
load 0x40100000, len 29292, room 16
tail 12
chksum 0xd7
ho 0 tail 12 room 4
load 0x3ffe8000, len 1304, room 12
tail 12
chksum 0x1a
ho 0 tail 12 room 4
load 0x3ffe8520, len 3556, room 12
tail 8
chksum 0x56
csum 0x56

lightsarefun avatar Feb 04 '19 06:02 lightsarefun

How long is it between the last ExitCritical and the crash? I think this looks right, so it may be a different problem.

Also, there is an EnterCritical first. It's called much earlier on.

cnlohr avatar Feb 04 '19 07:02 cnlohr

It's about 8 seconds until the crash.

I see that EnterCritical is called first, but then ExitCritical is called twice. I don't see anything wrong with that though.

lightsarefun avatar Feb 04 '19 07:02 lightsarefun

Interesting... It should be in EnterCritical when negotiating for WPA2, so this is probably the wrong behavior, but I'm still not sure why it's crashing. This is very strange.

cnlohr avatar Feb 04 '19 08:02 cnlohr

Charles, I'm curious - what were you referring to in the "stab at the esp8266 port" commit?

acoulson2000 avatar Feb 05 '19 00:02 acoulson2000

I'm referring to the commits surrounding this one: https://github.com/cnlohr/colorchord/commit/2a0c78d5261332f727e3c73dc590d401b97ad922

cnlohr avatar Feb 05 '19 02:02 cnlohr

I have my build compiled with three_samples and I am experiencing exactly the same behavior. I am also using the Wemos D1 mini version of the ESP8266. The first reset was manual, to ensure a clean start. The log follows:

;l<break>
[21:53:47:292] d<0x9c><0xdf>|<break>
[21:53:47:292] <0x8c>l<0xe0>|<0x03><0x04><0x0c><0x04><0x8c><0x04>l<0xec><0x04>c|<0x8f><0xc3><0x03><0xe4><0x13><0x9b>r<0x93>#<0x8c><0x04>c<0x84><0xfb>'o<0xdf>dg'<0x9c><0xe3><0xe4><0x04>c<0x1c>p<0x8c><0xc7>ds$sdp<0xfb>'<0xe0><0x10><0x03><0x04><0x0c><0x83><0x0c>l<0x04><0x0c><0x04><0x0c><0x04><0x04>c<0x04>g<0xe3>|<0x03><0x8c>$l<0x8e><0x0c><0x04>c<0x8c><0xfb>g'<0xe7><break>
[21:53:47:320] l<0xc4><0x87>d`<0x02><0x90><0x1b><0x13>ogd<0x8c>d`<0x02><0x07><0x03>gs<0x8e><0xdb><0x93>o<0x0c><0x04><0xc3>c$`<0x03>`<0xf2>'<0x0c><0x0c><0x04><0x9f><0xe0>c<0xc3>'$<0x8c><0x04><0x8c><0xf3>og<0xe7><break>
[21:53:47:360] <0x0c><0x8f><0x07>dp<0xfb>g<0xe0><0x10><0x02><0x04><0x0c>s<0xc4><0x9c><0x9c><0xe3><0xe0><0x04><0x04><0x0c><0x04>c<0x04>o<0xe3><<0x03>l<0xe4><0x0c><0x04><0x8f>c<0x84><0xf2>'o<0xef><break>
[21:53:47:364] $<0x8c><0x04>l <0x03><0x98><0x13><0x1b>gol<0x84>l <0x03><0x07><0x03>n;<0x87><0x93><0xdb>g<0x04><0x0c>c<0xdb>d`<0x02> <0xfb>g<0x04><0x0c><0x0c><0x9f><0xe0>#<0x83>od<0xc4><0x0c><0x84><0xf2>'o<0xef><break>
[21:53:47:374] <0x04><0x87><0x07>lx<0xf3>o<0xe0><0x18><0x03><0x0c><0x04>;<0x8c><0x9c><0x9c><0xe3><0xe0>l<0x8e><0x1c><0x80><0x0c>b<0x0c>'<0xe3>|<0x03><0xe4>l<0x8f><0x87><0x8e>c<0x8c><0xfb>g'<0xe7><break>
[21:53:47:383] l<0xc4><0x0c>d`<0x02><0xd8><0x1b><0x1b>'ol<0xc4>l <0x03><0x07><0x03>o;<0xc7><0x9b><0xdb>'<0x04><0x0c><0x9b><0x8c><0x93>`<0x02><0x07>{<0x92><0x9b>o<0x04><0x04><0x93><0xc4><0x92> <0x03>{<0x13>o<;<0x1b><0xc3>s<0x03>g$g<0xe0><0x80><0x03><0x84><0x04>c|<0x80><0x1b>'<0x92>|#c<0x92><0xfb><0x93>'<0xe0><0x80><0x02><0x04><0x87><0x0f>l<0x04><0xf3>ng<0x9e><0x8c>go<0x9f><0xe4><0xdb><0x83><0xdb>`<0x03><0x7f><0x82><0x1b><0x1b><0x0c><0xc4>gn<0x9f><0xe4><0xdb><0x83><0x9b>`<0x03><0x7f><0xc3><0x13>c<0x0c><0x84>o'<0x9f><0xec><0x9b><0x82><0x93>`<0x02><0xc7><0x1b>r<0x83><0xdb><0x93>o<0x1b><0x1b>b<0x83><0x1b>g<0x9f><0xec>?<0xe3>o<0x1b><0x9f><0xe0><0x04><0x83>g<0xe3><0xfe>#<0x93><0xfb><0x12>n|<0x98><0x03><0x04><0xc7><0xe4><0x92>s;<0x93>{r<0x1b>ld`<0x03><0xfc><0x84><0x0c><0x04><0x0c>s<0xc4><0x16>,⇥	<0xc2><0xcd>fc000<0x8c><0xe3><0x03><0xe4><0x1b><0x83>o<0xec><0x9b>;<0x83><0xfb>g|<0xec><0x0c>d<0x04>ldl <0x03><0x1c>c<0x9b><0x1b><0x03><0x04><0x9f>|<0x03>;<0x93><0x03>l<0x9c>o<0xe0><0x0c><0x83>g<0xe3><break>
[21:53:47:436] <0x0c>d`<0x03><0xc4><0xe3>;<0x9b>$<0x8c>d<0x13><0x84><0x04><0x0c><0x04><0xfe>C<0xa1><0xa8><0x8b><0xeb>K␍<0xa1><0xbd><0xc9><0x91>5␊
[21:53:47:440] Opmode: 2␍␊
[21:53:47:440] Default SoftAP mode: "ESP_00887D":""␍␊
[21:53:47:444] Loading Settings: af / 0 / 67 / 67␍␊
[21:53:47:447] Settings Loaded: ColorC2 / 2nd ColorChord␍␊
[21:53:47:451] RST REASON: 6␊
[21:53:47:451] sleep enable,type: 2␍␊
[21:53:47:454] mode : softAP(2e:3a:e8:00:88:7d)␍␊
[21:53:47:458] add if1␍␊
[21:53:47:458] dhcp server start:(ip:192.168.4.1,mask:255.255.255.0,gw:192.168.4.1)␍␊
[21:53:47:462] bcn 100␍␊
[21:53:47:465] IGMP Joining: 0104a8c0 fb0000e0␍␊
[21:53:47:835] add 1␍␊
[21:53:47:835] aid 1␍␊
[21:53:47:835] station: 58:00:e3:e6:9f:f5 join, AID = 1␍␊
[21:54:04:569] Switching to: "wlan_home2"/"Klaucovi2015" (10/12). BSSID_SET: 0 [1]␍␊
[21:54:04:574] station: 58:00:e3:e6:9f:f5 leave, AID = 1␍␊
[21:54:04:579] rm 1␍␊
[21:54:04:579] bcn 0␍␊
[21:54:04:579] del if1␍␊
[21:54:04:585] usl␍␊
[21:54:04:585] mode : sta(2c:3a:e8:00:88:7d)␍␊
[21:54:04:585] add if0␍␊
[21:54:04:731] Switching.␍␊
[21:54:07:523] scandone␍␊
[21:54:08:463] state: 0 -> 2 (b0)␍␊
[21:54:08:463] state: 2 -> 3 (0)␍␊
[21:54:08:471] state: 3 -> 5 (10)␍␊
[21:54:08:471] add 0␍␊
[21:54:08:471] aid 4␍␊
[21:54:08:471] cnt ␍␊
[21:54:08:480] ␍␊
[21:54:08:480] connected with wlan_home2, channel 6␍␊
[21:54:08:536] dhcp client start...␍␊
[21:54:09:287] ip:192.168.11.65,mask:255.255.255.0,gw:192.168.11.254␍␊
[21:54:09:290] IGMP Joining: 410ba8c0 fb0000e0␍␊
[21:54:09:362] STAT: 5␍␊
[21:54:09:362] IP: 192.168.11.65␍␊
[21:54:09:362] NM: 255.255.255.0␍␊
[21:54:09:369] GW: 192.168.11.254␍␊
[21:54:09:369] WCFG: /wlan_home2/␍␊
[21:54:09:369] IGMP Joining: 410ba8c0 fb0000e0␍␊
[21:54:18:468] Fatal exception 0(IllegalInstructionCause):␍␊
[21:54:18:471] epc1=0x4023b238, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000␍<0xff>␍␊
[21:54:18:480]  ets Jan  8 2013,rst cause:2, boot mode:(3,6)␍␊
[21:54:18:484] ␍␊
[21:54:18:502] load 0x40100000, len 29580, room 16 ␍␊
[21:54:18:523] tail 12␍␊
[21:54:18:523] chksum 0xd2␍␊
[21:54:18:523] ho 0 tail 12 room 4␍␊
[21:54:18:530] load 0x3ffe8000, len 1304, room 12 ␍␊
[21:54:18:530] tail 12␍␊
[21:54:18:530] chksum 0xb5␍␊
[21:54:18:538] ho 0 tail 12 room 4␍␊
[21:54:18:538] load 0x3ffe8520, len 3544, room 12 ␍␊
[21:54:18:546] tail 12␍␊
[21:54:18:546] chksum 0x5b␍␊
[21:54:18:546] csum 0x5b␍␊
[21:54:18:553] s<0x1b>'

sanchosk avatar Feb 18 '19 20:02 sanchosk

The read_sar_dout call in embedded8266/user/adc.c is definitely troublesome in station mode and tripping the wdt reset. It seems to be part of the (closed source?) libphy.a library. My guess would be that it is a problem upstream of esp-open-sdk in the ESP8266_NONOS_SDK-2.1.0 release.

I came across source code for the function here:

https://github.com/pvvx/esp8266web/blob/master/info/libs/phy/phy_get_vdd33.c

However, the SAR_BASE mentioned in the comment didn't seem to be the correct value. The value defined here worked for me:

https://github.com/PetteriAimonen/esp-walkie-talkie/blob/master/fast_adc.c

I was able to get it stable in station mode by adding this function and calling it in place of read_sar_dout:

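void hs_read_sar_dout(uint16 * buf)
{
   /* Word 32 past 0x60000D00, i.e. address 0x60000D80: first of the eight words read below */
   volatile uint32 * sar_regs = &((volatile uint32_t*)0x60000D00)[32];
   int i, x, z;
   for(i = 0; i < 8; i++) {
      x = ~(*sar_regs++);                   /* read the next word and invert it */
      z = (x & 0xFF) - 21;                  /* low byte minus a fixed offset of 21 */
      x &= 0x700;                           /* keep bits 8..10 */
      if(z > 0) x = ((z * 279) >> 8) + x;   /* add the low part scaled by 279/256 */
      buf[i] = x;                           /* buf must have room for 8 entries */
   }
}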

astateofblank avatar Apr 09 '19 00:04 astateofblank
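
The per-sample arithmetic in hs_read_sar_dout can be checked on a desktop by separating it from the register read. A minimal sketch, assuming the same constants as the function above; convert_sar_word is a hypothetical helper name, and the reading of the bit fields is inferred from the code rather than from Espressif documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the arithmetic of one loop iteration of
 * hs_read_sar_dout, applied to a raw word instead of a hardware read. */
static uint16_t convert_sar_word(uint32_t raw)
{
    uint32_t x = ~raw;               /* the register value is inverted first */
    int z = (int)(x & 0xFFu) - 21;   /* low byte minus a fixed offset of 21 */
    uint32_t result = x & 0x700u;    /* bits 8..10 are kept unchanged */

    if (z > 0)
        result += ((uint32_t)z * 279u) >> 8;  /* low part scaled by 279/256 */

    /* Upper bound: 0x700 + ((234 * 279) >> 8) = 1792 + 255 = 2047 */
    assert(result <= 2047u);
    return (uint16_t)result;
}
```

For example, a raw word of 0xFFFFF800 inverts to 0x7FF and converts to 2047, the largest value this arithmetic can produce, while a raw word of 0xFFFFFFFF converts to 0.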

I have been working with the latest master by @cnlohr (making sure the 2 submodules esp82xx and eps_nonos_sdk are at the same commits as he uses) and gradually bringing in my changes. I use station mode all the time. I do not get resets in station or soft AP mode, but there is still some strange behaviour that I am trying to sort out. It is present in master (as well as in my branch with the additions). Trying the new hs_read_sar_dout does not seem to make a difference. The problems I observe are:

    1. If my router is running, often when restarting it will NOT reconnect to the station (it keeps waiting to do so, stuck at stats = STATION_CONNECTING). It will only connect AFTER I restart my router.
    2. If the router is not running, it will try to connect to the station, give up, then come up as a soft AP.
    3. While running, if I set GPIO0 to 0 (either in the GUI or by grounding the pin), it goes into a loop trying to connect to soft AP. If I restart, it will then create the soft AP. (I have connected RST and GPIO16 on the nodemcu so deep sleep can run.)
    4. DHCP does not work. ESP_D0F9CA.local in soft AP or Station mode connects sometimes but constantly resets. I can only connect directly to the assigned IP.

@astateofblank I notice on https://github.com/pvvx/esp8266web pvvx mentions implementing an UDP Wave server (Integrated SAR ADC): Sending 14-bit samples at 1 Hz .. 48 kHz (max 192 kHz 12 bits). This is here https://github.com/pvvx/esp8266web/blob/master/app/driver/adc.c in addition to the code you referred to. Has anyone tried this?

bbkiwi avatar Jun 06 '19 04:06 bbkiwi