Random hangup on token refresh
I have been having a curious problem - I am experiencing random hangups during calls on my SPA504G phones. Some calls will work fine - over an hour of talking time with no issues. Others will hang up 13+ times in the same period. In each case, it is necessary to re-dial the call, with obvious frustration. This seemingly random issue has been difficult to log and reproduce.
Reproduction Steps:
- Commence call - either inbound or outbound
- Wait for random hangup to occur. When this happens, the line LED returns to the 'off' state, and a brief 'Asterisk connected' message appears on the display. The call is terminated within Asterisk
- Redial call and wait for it to happen again
chan-sccp version: Git revision fdcfaf1
asterisk version: 17.2.0
Expected behavior:
Calls should not be disconnected unless either end hangs up
Observed behavior:
Calls are randomly disconnected. I have managed to capture the event with some logging enabled - and it seems to have something to do with handling SPCP tokens. Although I am not familiar with the process, it looks like chan_sccp goes through a process of removing and reconfiguring lines every time a new token is received, even if a call is in progress. It seems to be this process that causes the call to be disconnected.
In the example below, an outbound call was placed by a user. This was answered at 10:28:04, and a token request appeared at 10:28:08. It even logged a warning to say that a hangup was taking place on an open channel, yet proceeded to hang up, and reconfigure the phone.
As the issue is somewhat difficult to catch - it doesn't always happen, and there is no way to force it to occur either, gathering more information is difficult. I'm very happy to do some additional experiments (code changes, logging etc.) but need some guidance on how I can best help... unless the issue is obvious from the logs below!
Logs
[May 4 10:28:04] > 3004 (hint_updateLineStateForSingleChannel) Set singleLineState to CONNECTED(30)
[May 4 10:28:04] -- SCCP: (sccp_hint_notifySubscribersViaPbx) Notify asterisk to set state to sccp channelstate 'CONNECTED' (30) => asterisk: 'Device is in use' (2) on channel SCCP/3004
[May 4 10:28:04] -- 3004 (getLinestate) state:CONNECTED, party:xxxxxxxxxxx/, calltype:Outbound
[May 4 10:28:04] > SCCP: (devicestate) sccp_hint returned state:CONNECTED for '3004'
[May 4 10:28:04] > SCCP: (devicestate) PBX requests state for '3004' - state In use
[May 4 10:28:04] -- SCCP: (sccp_hint_notifyLineStateUpdate) Notified asterisk to set state to sccp channelstate 'CONNECTED' (30) => asterisk: 'Device is in use' (2) on channel SCCP/3004
[May 4 10:28:08] NOTICE[20563][C-0000001e]: sccp_actions.c:716 handle_SPCPTokenReq: SEP580A2087793D: Cleaning previous session, come back later (tokenState:OK)
[May 4 10:28:08] SCCP: Clean Device SEP580A2087793D, remove from global:No, restart_device:No
[May 4 10:28:08] WARNING[20563][C-0000001e]: sccp_device.c:2437 _sccp_dev_clean: SCCP: Hangup open channel on line 3004 device SEP580A2087793D
[May 4 10:28:08] == SEP580A2087793D: Ending call SCCP/3004-00000088 (state:CONNECTED)
[May 4 10:28:08] -- SEP580A2087793D: Sending hangupRequest to Call SCCP/3004-00000088 (state: CONNECTED)
[May 4 10:28:08] == SCCP: Remove Line 3004 from device SEP580A2087793D
[May 4 10:28:08] == SEP580A2087793D (hint_eventListener) device SEP580A2087793D detached from line 3004
[May 4 10:28:08] > 3004 (hint_updateLineState) Update Line Channel State: CONNECTED(30)
[May 4 10:28:08] > SCCP: (sccp_hint_updateLineState) 0 devices register on linename: 3004
[May 4 10:28:08] -- SCCP: (sccp_hint_notifySubscribersViaPbx) Notify asterisk to set state to sccp channelstate 'CONGESTION' (42) => asterisk: 'Device is unavailable' (5) on channel SCCP/3004
[May 4 10:28:08] -- SCCP: (sccp_hint_notifyLineStateUpdate) Notified asterisk to set state to sccp channelstate 'CONGESTION' (42) => asterisk: 'Device is unavailable' (5) on channel SCCP/3004
[May 4 10:28:08] == SCCP: Remove Line resolve from device SEP580A2087793D
[May 4 10:28:08] == SEP580A2087793D (hint_eventListener) device SEP580A2087793D detached from line xxxxxxx
[May 4 10:28:08] > xxxxxxx (hint_updateLineState) Update Line Channel State: ONHOOK(1)
[May 4 10:28:08] > xxxxxxx (hint_updateLineStateForSingleChannel) NO CHANNEL
[May 4 10:28:08] > xxxxxxx (hint_updateLineStateForSingleChannel) Set singleLineState to ONHOOK(1)
[May 4 10:28:08] -- SCCP: (sccp_hint_notifySubscribersViaPbx) Notify asterisk to set state to sccp channelstate 'ONHOOK' (1) => asterisk: 'Device is not in use' (1) on channel SCCP/xxxxxxx
[May 4 10:28:08] -- SCCP: (sccp_hint_notifyLineStateUpdate) Notified asterisk to set state to sccp channelstate 'ONHOOK' (1) => asterisk: 'Device is not in use' (1) on channel SCCP/xxxxxxx
[May 4 10:28:08] == SCCP: Unregister Device SEP580A2087793D
[May 4 10:28:08] -- SEP580A2087793D: (devstate::deviceRegisterListener) device unregistered
[May 4 10:28:08] > SCCP: (devicestate) sccp_hint returned state:CONGESTION for '3004'
[May 4 10:28:08] > SCCP: (devicestate) PBX requests state for '3004' - state Unavailable
[May 4 10:28:08] > SCCP: (devicestate) sccp_hint returned state:CONGESTION for '3004'
[May 4 10:28:08] > SCCP: (devicestate) PBX requests state for '3004' - state Unavailable
[May 4 10:28:08] -- SCCP: Accepted Client Connection from 10.3.12.252:1027
[May 4 10:28:09] -- SCCP: Alarm Message: Severity: Informational (2), Name=SEP580A2087793D Load=7.6.2f Last=Hard+Unknown [0/-66321654]
[May 4 10:28:09] -- SEP580A2087793D: nat=Off set manually in config file
[May 4 10:28:09] -- SEP580A2087793D: Accessory 'Headset' is 'On Hook' (0)
[May 4 10:28:09] -- SEP580A2087793D: Device has 12 Capabilities
[May 4 10:28:09] -- SEP580A2087793D: Accessory 'Headset' is 'On Hook' (0)
[May 4 10:28:09] -- SEP580A2087793D: Sending ConfigStatMessage, lines 2, speeddials 0
[May 4 10:28:09] == SEP580A2087793D (hint_eventListener) device SEP580A2087793D attached on line 3004
[May 4 10:28:09] -- SEP580A2087793D (hint_attachLine) Create new hint_lineState for line: 3004
[May 4 10:28:09] > 3004 (hint_updateLineState) Update Line Channel State: DOWN(0)
[May 4 10:28:09] > 3004 (hint_updateLineStateForSingleChannel) NO CHANNEL
[May 4 10:28:09] > 3004 (hint_updateLineStateForSingleChannel) Set singleLineState to ONHOOK(1)
[May 4 10:28:09] -- SCCP: (sccp_hint_notifySubscribersViaPbx) Notify asterisk to set state to sccp channelstate 'ONHOOK' (1) => asterisk: 'Device is not in use' (1) on channel SCCP/3004
[May 4 10:28:09] -- SCCP: (sccp_hint_notifyLineStateUpdate) Notified asterisk to set state to sccp channelstate 'ONHOOK' (1) => asterisk: 'Device is not in use' (1) on channel SCCP/3004
[May 4 10:28:09] == SEP580A2087793D (hint_eventListener) device SEP580A2087793D attached on line xxxxxxx
[May 4 10:28:09] > resolve (hint_updateLineState) Update Line Channel State: ONHOOK(1)
[May 4 10:28:09] > resolve (hint_updateLineStateForSingleChannel) NO CHANNEL
[May 4 10:28:09] > resolve (hint_updateLineStateForSingleChannel) Set singleLineState to ONHOOK(1)
[May 4 10:28:09] -- SCCP: (sccp_hint_notifySubscribersViaPbx) Notify asterisk to set state to sccp channelstate 'ONHOOK' (1) => asterisk: 'Device is not in use' (1) on channel SCCP/xxxxxxx
[May 4 10:28:09] -- SCCP: (sccp_hint_notifyLineStateUpdate) Notified asterisk to set state to sccp channelstate 'ONHOOK' (1) => asterisk: 'Device is not in use' (1) on channel SCCP/xxxxxxx
[May 4 10:28:10] -- SEP580A2087793D: Got Forward Status Request. Line: 1
[May 4 10:28:10] -- SEP580A2087793D: Got Forward Status Request. Line: 2
[May 4 10:28:10] -- SEP580A2087793D: Device registered; performing post registration tasks...
[May 4 10:28:10] -- SEP580A2087793D: (devstate::deviceRegisterListener) device registered
[May 4 10:28:27] -- SEP580A2087793D: Taken Offhook
Addition / Optional Information:
- Problem started happening recently, didn't happen in an older version of Chan-SCCP: [Yes/No] (previously known to work in 69fdf95)
- Problem can be reliably reproduced, doesn't happen randomly: [Yes/No]
- Platform:
- Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz
- [32/64 Bit]
- [Big/Small Endian]
- OpenSUSE
- 13.1
- Direct Contact information: [You can contact me here]
- Log Files (attach .txt / pastbin / gist): See above
- Comments: See above
BTW: Consider making a donation via paypal:
- https://www.paypal.com/cgi-bin/webscr?&cmd=_donations&business=chan.sccp.b.pp%40gmail.com
@patrickkeys Hi Patrick, thanks for logging the issue. SPCP Phones (this is the specific protocol SPA phones use (similar to SCCP)), send a so called Token before they try to connect to a server. I think there are two issue here that need to be addressed:
- We should not be accepting or denying a token request when there is an ongoing call. We should back off and not respond to the request for a limited amount of time. However this might not be so easy, the phone might think it has lost the connection to the server, and this might or might not be true, this is hard to tell for sure.
- Phones should not be sending token requests to a server they are already connected to, and this might be cause by a configuration error on your side. Do you have multiple member servers or srst entries in your SEPxxxx.cnf.xml which point to (or could resolve to) the same server. If so make sure that each server is only entered into the cnf.xml file once.
Please set sccp debug core, device, session, line, channel while working on this issue.
@dkgroot Hi Diederik - many thanks for your quick reply!
In answer to (2) above, I only have a single member server configured in the configuration file, and it is specified by IP address. The IP address is specified twice - in the
In terms of (1) above - am I correct in my understanding that once registered, I shouldn't see any token requests from the phone? I do see them quite frequently on the console even when no calls are in progress - if that's true, then it's much easier for me to capture this happening. Previously I was looking for calls which were abnormally terminated, which obviously require people to be making calls at the time. Would it also be useful to capture network packets on the server?
You can attach/link/email the cnf.xml file so i can have a look, just to make sure. Please do also check the XMLDefault.cnf.xml, just to make sure.
If you only have a single member server in the cnf.xml then you should only see a single tokenrequest before it attempts to register and we should respond to that using an Ack. Once it is registered it should not send another. Except if it is loosing tcp connection or missing keepalive messages. When you set the debug level to "core, device, session" we should be able to figure out what is going on.
I've managed to do a bit more investigation. Firstly, there is no XMLDefault.cnf.xml in use at present - although all phones use the same configuration file on the TFTP server. I have simply simlinked each phone's SEPxxxxxxxxxxxx.cnf.xml to the same file. I've included it here (SEP-spa50x-30x.cnf.xml).
I managed to catch one phone doing this last night with the additional debug types enabled. I also happened to have tcpdump running on the server, listening to port 2000. The console log is asterisk.txt, and the capture file is spcp.pcap. There was no call in progress at the time, but this is exactly the same sequence of events that would result in a hangup.
I noticed a few things here:
- the instance ID (16777216) is the same for all phones. I'm not sure if it's used for anything though
- the phone was communicating without issue - regular TimeDateReq/Res and KeepAliveReq/Ack messages
- when the phone sends a SPCPRegisterTokenReq, the server responds with an SPCPRegisterTokenReject, with a wait time of 10 (not sure of the units). I guess this 'reject' is to give the channel driver on Asterisk time to clean up ready for a new token request
- the TCP session is not closed until after the reject message, which means that it must have been established before this (i.e. no timeout detected at the server end at least)
- in Asterisk, running 'sccp show devices' lists all devices, but each one has a token of 'None' at all times. Is this correct?
I also left tcpdump running on the server for a longer period (30 minutes). I don't have the console log for this time, but did notice some interesting things:
- of the 15 or so phones I have in my system, only 5 tried to re-register their token. The other phones were all sending KeepAlive and TimeDateReq messages, but no token requests
- Of these five phones, almost all tried to register a new token 3 or 4 times in the 30 minute period, then stopped
- the phones in this log are a mixture of remote (via IPsec or SSTP) and local connections, so the issue seems to occur across all types of connection
I guess the thing that I don't quite understand about this process (I don't have the protocol document) is why a token request can't be matched to an existing session and acknowledged by chan_sccp, rather than setting it to a failed state, and forcing the phone to be reconfigured. The fact that the phone is sending this request at all sounds an awful lot like a very tight timeout to me - but every request sent by the phone is acknowledged immediately by the server. So I really don't see what conditions it is using to determine whether a token request needs to be sent or not.
I've managed to do a bit more investigation. Firstly, there is no XMLDefault.cnf.xml in use at present - although all phones use the same configuration file on the TFTP server. I have simply simlinked each phone's SEPxxxxxxxxxxxx.cnf.xml to the same file. I've included it here (SEP-spa50x-30x.cnf.xml). I managed to catch one phone doing this last night with the additional debug types enabled. I also happened to have tcpdump running on the server, listening to port 2000. The console log is asterisk.txt, and the capture file is spcp.pcap. There was no call in progress at the time, but this is exactly the same sequence of events that would result in a hangup.
Thanks for the files. Always nice to have a pcap side by side to the debug log.
I noticed a few things here:
- the instance ID (16777216) is the same for all phones. I'm not sure if it's used for anything though
- the phone was communicating without issue - regular TimeDateReq/Res and KeepAliveReq/Ack messages
- when the phone sends a SPCPRegisterTokenReq, the server responds with an SPCPRegisterTokenReject, with a wait time of 10 (not sure of the units). I guess this 'reject' is to give the channel driver on Asterisk time to clean up ready for a new token request
- the TCP session is not closed until after the reject message, which means that it must have been established before this (i.e. no timeout detected at the server end at least)
- in Asterisk, running 'sccp show devices' lists all devices, but each one has a token of 'None' at all times. Is this correct?
One thing to know, the SPCPtoken request/ack/rej messages are used a little differently on the SPA phones (compared to the normal token request message in the SCCP protocol). Cisco decided to make the phone send a token request before registration so that the SPA phones cannot be used on CUCM servers and are only allowed to register to CME small business machines. Under normal circumstances (as far as we knew) a phone would send a tokenrequest (and chan-sccp acknowledge's this) before it had any registered connection. However, you are showing that this is not always the case. Our current strategy was to reject the request when they already had an established connection.
I guess the thing that I don't quite understand about this process (I don't have the protocol document) is why a token request can't be matched to an existing session and acknowledged by chan_sccp, rather than setting it to a failed state, and forcing the phone to be reconfigured. The fact that the phone is sending this request at all sounds an awful lot like a very tight timeout to me - but every request sent by the phone is acknowledged immediately by the server. So I really don't see what conditions it is using to determine whether a token request needs to be sent or not.
Regarding timeout: We use the keepalive value you set in sccp.conf and take 75% keepalive + random amount of 25% of that keepalive value. This is the keepalive value we give to the phone (we vary this amount so all the devices have a slightly different keepalive and they don't all come at once). The window we reserve for a keepalive request however to arrive is keepalive + 25%. So if you have keepalive=60. The phone would get "45+ random(15)" and the server would use "60+15" for the keepalive window. Meaning a phone could be up to 15 second late without being scrapped.
Regarding TokenRequest during established connection: I will have to go through the debug log and pcap and then make up my mind to see if we should just ack the request if the connection is already establish. We can make a small git branch to see if that works to remedy the issues you are seeing. I will try to make this happen later today.
Hi Patrick,
Created a temporary branch for you to try. It will return a SCPCTokenAck when we receive a TokenRequest during an established connection. I am not sure how the phone will react, it might still re-register which would defeat the purpose or it will just continue. If it does re-register, we might just skip the acknowledgement altogether and see how the device reacts then.
Git Checkout this temporary branch
cd /usr/src/chan-sccp
git fetch origin
git branch issues/XXX origin/issues/517
git checkout issues/517
git pull
./configure [....]
make -j4 && make install && make reload
move back to develop afterwards:
cd /usr/src/chan-sccp
git checkout develop
git pull
./configure [....]
make -j4 && make install && make reload
Can you send me the phones debug log from SEP580A20877756. I would like to know why this device is sending the TokenRequest. Maybe it does have SRST enabled and is looking for a fallback server (We can switch that off using the SEPxxx.cnf.xml file). Or there might be some other reason we don't know about yet.
I'll be sure to install and test the new version after business hours today - we have a long weekend in the UK, so should give me some time to look at it in more detail. For the phone debug log - how does one access this?
I'll be sure to install and test the new version after business hours today - we have a long weekend in the UK, so should give me some time to look at it in more detail.
Ok :-)
For the phone debug log - how does one access this?
You can setup syslog (via port 514) on the SPA phone. But it can be a little tricky to get up and running. My SPA504 used to run syslog just fine, but some time ago it stopped and have not been able to get the syslog up and running since. Maybe you have more luck.
-
Configure your phone by directing your browser to: http://<IP_address_of_Phone>/admin/advanced Example: http://192.168.0.100/admin/advanced if the phone's IP address is 192.168.0.100
-
Enable Debug Messages by editing the following fields:
- Phone's web-ui > System tab > Optional Network Configuration > Debug Server: <IP address of syslog server> Example: 192.168.0.200 [if you are capturing with Wireshark, you do not need to use a valid syslog server's IP address, any "fake" address can be used, example 10.0.0.1]
- Phone's web-ui > System tab > Optional Network Configuration > Debug Level: 3
- Phone's web-ui > System tab > Settings > Debug Option: full
-
Enable Syslog Messages by editing the following: [step 3b is not needed if you have enabled Debug] Phone's web-ui > System tab > Optional Network Configuration > Syslog Server: <IP address of syslog server> Example: 192.168.0.200
-
Scroll down and click Submit All Changes
I've just checked the web interface for the phone - sadly all read only at the moment, which means (I suspect) that I'll need to add a syslog server to the XML configuration file, and somehow get the phone to re-read it. I'll see if I can get a syslog server set up - there used to be a time when this was easy, but now all of my Linux machines use systemd ;-)
You can enable the webinterface via the SEP file and restart the phone
<device>
....
<devicePool>
....
<srstInfo>
...
<!-- disable srst>
<srstOption>Disable</srstOption>
</srstInfo>
<!-- keepalive time (should match or be higher than the value in sccp.conf -->
<connectionMonitorDuration>60</connectionMonitorDuration>
</devicePool>
<commonProfile>
<!-- webinterface user='admin', pass='1234' -->
<phonePassword>1234</phonePassword>
</commonProfile>
<vendorConfig>
....
<!-- enable phones web interface -->
<webAccess>0</webAccess>
</vendorConfig>
....
</device>
Another way to enable the webinterface from the phone is going into the settting menu on the phone UI and entering the "Security Configuration", then press "**#" and select "Web Access Enabled".
If you don't want to use the webinterface you can create a profile file called: SPA504G-cfg.xml containing something like this.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<flat-profile>
<!-- https://www.youtube.com/watch?v=76IMzvfpsx8 -->
<Upgrade_Rule group="Provisioning/Firmware_Upgrade" ua="na">( $SWVER != 7.6.2 ) ? tftp://192.168.178.3/firmware/spa50x/spa50x-30x-762f.bin</Upgrade_Rule>
<Profile_Rule group="Provisioning/Configuration_Profile" ua="na">http://192.168.178.3:6970/settings/SEP$MA.cnf.xml</Profile_Rule>
<Report_Rule ua="na">tftp://192.168.178.3/reports/$PN_$SWVER_$MA_$SN.xml</Report_Rule>
<!--<DHCP_Option_To_Use ua="na">66,150,159,160</DHCP_Option_To_Use>-->
<DHCP_Option_To_Use ua="na">66,150</DHCP_Option_To_Use>
<Transport_Protocol ua="na">http</Transport_Protocol>
<Connection_Type group="Info/System_Information">DHCP</Connection_Type>
<Connection_Type group="System/Internet_Connection_Type_">DHCP</Connection_Type>
<Current_IP group="Info/System_Information">192.168.178.9</Current_IP>
<Host_Name group="Info/System_Information">SEP381C1AB6B976</Host_Name>
<Domain group="Info/System_Information">talon1.nl</Domain>
<Current_Netmask group="Info/System_Information">255.255.255.0</Current_Netmask>
<Current_Gateway group="Info/System_Information">192.168.187.2</Current_Gateway>
<Primary__DNS group="Info/System_Information">192.168.187.2</Primary__DNS>
<Secondary__DNS group="Info/System_Information"/>
<TFTP_Server group="Info/Network_Configuration">192.168.187.3</TFTP_Server>
<Call_Manager group="Info/Network_Configuration">192.168.187.3</Call_Manager>
<Call_Control_Protocol group="Info/Product_Information">SPCP</Call_Control_Protocol>
<Enable_Web_Server group="System/System_Configuration">Yes</Enable_Web_Server>
<Web_Server_Port group="System/System_Configuration">80</Web_Server_Port>
<Enable_Web_Admin_Access group="System/System_Configuration">Yes</Enable_Web_Admin_Access>
<Admin_Passwd group="System/System_Configuration">1234</Admin_Passwd>
<User_Password group="System/System_Configuration">0000</User_Password>
<Signaling_Protocol group="System/System_Configuration">SPCP</Signaling_Protocol>
<SPCP_Auto-detect group="System/System_Configuration">Yes</SPCP_Auto-detect>
<Phone-UI-readonly group="System/System_Configuration">No</Phone-UI-readonly>
<Phone-UI-user-mode group="System/System_Configuration">No</Phone-UI-user-mode>
<PoE_Power_Required group="System/Power_Setting">Normal</PoE_Power_Required>
<TFTP_Server group="System/Optional_Network_Configuration">192.168.187.3</TFTP_Server>
<Alternate_TFTP group="System/Optional_Network_Configuration">Yes</Alternate_TFTP>
<Syslog_Server group="System/Optional_Network_Configuration">192.168.187.3</Syslog_Server>
<Debug_Level group="System/Optional_Network_Configuration">3</Debug_Level>
<Debug_Server group="System/Optional_Network_Configuration">192.168.187.3</Debug_Server>
<Layer_2_Logging group="System/Optional_Network_Configuration">No</Layer_2_Logging>
<NTP_Enable group="System/Optional_Network_Configuration">Yes</NTP_Enable>
<Primary_NTP_Server group="System/Optional_Network_Configuration">0.nl.pool.ntp.org</Primary_NTP_Server>
<Secondary_NTP_Server group="System/Optional_Network_Configuration">1.nl.pool.ntp.org</Secondary_NTP_Server>
<Enable_VLAN group="System/VLAN_Settings">No</Enable_VLAN>
<Enable_CDP group="System/VLAN_Settings">Yes</Enable_CDP>
<Enable_LLDP-MED group="System/VLAN_Settings">Yes</Enable_LLDP-MED>
<Network_Startup_Delay group="System/VLAN_Settings">5</Network_Startup_Delay>
<VLAN_ID group="System/VLAN_Settings">1</VLAN_ID>
<Asset_ID group="System/Inventory_Settings">12345678</Asset_ID>
<Ring1 group="Phone/Ring_Tone">n=Chirp 1;w=2;c=1</Ring1>
<Ring2 group="Phone/Ring_Tone">n=Chirp 2;w=2;c=2</Ring2>
<Handset_Input_Gain group="Phone/Audio_Input_Gain__dB_">0</Handset_Input_Gain>
<Headset_Input_Gain group="Phone/Audio_Input_Gain__dB_">0</Headset_Input_Gain>
<Speakerphone_Input_Gain group="Phone/Audio_Input_Gain__dB_">0</Speakerphone_Input_Gain>
<Handset_Additional_Input_Gain group="Phone/Audio_Input_Gain__dB_">0</Handset_Additional_Input_Gain>
<Headset_Additional_Input_Gain group="Phone/Audio_Input_Gain__dB_">0</Headset_Additional_Input_Gain>
<Speakerphone_Additional_Input_Gain group="Phone/Audio_Input_Gain__dB_">0</Speakerphone_Additional_Input_Gain>
<CISCO_XML_EXE_Auth_Mode group="Phone/XML_Service">Trusted</CISCO_XML_EXE_Auth_Mode>
<SCCP_TOS_DiffServ_Value_1_ group="Ext_1/Network_Settings">0x68</SCCP_TOS_DiffServ_Value_1_>
<SCCP_CoS_Value_1_ group="Ext_1/Network_Settings">3</SCCP_CoS_Value_1_>
<Preferred_Audio_Device group="User/Supplementary_Services">Speaker</Preferred_Audio_Device>
<Send_Audio_To_Speaker group="User/Supplementary_Services">No</Send_Audio_To_Speaker>
<Shared_Line_DND_Cfwd_Enable group="User/Supplementary_Services">Yes</Shared_Line_DND_Cfwd_Enable>
<Ringer_Volume group="User/Audio">8</Ringer_Volume>
<Speaker_Volume group="User/Audio">8</Speaker_Volume>
<Handset_Volume group="User/Audio">10</Handset_Volume>
<Headset_Volume group="User/Audio">10</Headset_Volume>
<Handset_Version group="User/Audio">Auto</Handset_Version>
<Deep_Bass group="User/Audio">Default</Deep_Bass>
<LCD_Contrast group="User/LCD">8</LCD_Contrast>
<Back_Light_Timer group="User/LCD">Always On</Back_Light_Timer>
<Protect_IVR_FactoryReset>No</Protect_IVR_FactoryReset>
<Resync_On_Reset>Yes</Resync_On_Reset>
<Resync_Periodic>7200</Resync_Periodic>
<Upgrade_Enable ua="na">Yes</Upgrade_Enable>
<!--
<Dictionary_Server_Script ua="na">serv=tftp://192.168.178.3/firmware/spa50x/locale/;d0=English;x0=spa50x_30x_en_mx_v756.xml</Dictionary_Server_Script>
<Language_Selection ua="na">English</Language_Selection>
-->
</flat-profile>
Be carefull with the Upgrade_Rule lines which would cause an firmware update/change. Also, if you enabled the webinterface already, you can make the device dump this profile file to the tftp server using. This is latter option might not be the preferred method to setup one phone and should normally be used to roll out settings to multiple phones instead.
I've tested using the new code this morning - and although I can see the SPCPRegisterTokenAck now, the phone immediately responds with an Alarm, and then a RegisterReq message. The channel driver then sets everything to 'failed', tears down the session and resets as before, which would result in a dropped call. I've attached the asterisk log and packet capture below.
I'm still trying to see if I can get the phone to send information via syslog - have added the relevant lines to the web configuration and restarted, but even with Wireshark the phone does not appear to send anything at all. I'm going to keep investigating this as I suspect it would be very useful to know why the phone requests a token in the first place.
I've tested using the new code this morning - and although I can see the SPCPRegisterTokenAck now, the phone immediately responds with an Alarm, and then a RegisterReq message. The channel driver then sets everything to 'failed', tears down the session and resets as before, which would result in a dropped call. I've attached the asterisk log and packet capture below.
Ok that was one of the possible responses i was expecting and commented about. So the second solution might be to not react at all and see how the device responds.
I'm still trying to see if I can get the phone to send information via syslog - have added the relevant lines to the web configuration and restarted, but even with Wireshark the phone does not appear to send anything at all. I'm going to keep investigating this as I suspect it would be very useful to know why the phone requests a token in the first place.
I only got syslog working once and cannot get it back in to that state. I would love to figure out how to make it do that again though.
Can you insert the SEPxxxx.cnf.xml lines i mention above, to see if turning off srst will make a difference for this SPA device. Also can you see if the phones you have might contain different firmware versions and see if that lines up to their behavior differences. Note: there was a firmware update for the SPA phones earlier this year (Link: spa50x-30x-762f)
I can definitely try inserting the srst lines - will do that shortly. All phones are running the latest 7.6.2f firmware. In terms of configuration, all are configured using that same file that I sent before - I perform a factory reset on each one, and that file is downloaded... so identical configuration everywhere.
I'm still working on syslog - no joy yet though!
Updated the source, to not respond to the request at all.
Q: Is the asterisk machine also the firewall and gateway for these phones ? Because that would automatically designate this machine as the phones srst host, and would/might require the forceful disablement of srst (using those extra lines). Please do also add the other lines, like the keepalive value etc.
It's curious how phones of the same model and firmware react completely differently. This is the bain of reverse engineering and not being able to fully introspect all aspects. ie: trial & error + wishing :-)
OK, I'm trying the updated source now - haven't changed the config file yet, as I wanted to make a single change at a time so that we can tie down exactly what has worked, and what hasn't. In answer to your question: the asterisk machine does not function as a firewall or default gateway for the phones, but does function as the DNS (unused) and TFTP server.
My experience is that all of the phones in this system exhibit this behaviour from time to time - both the remote ones and the local ones. There doesn't seem to be any pattern to it - they all definitely have the same firmware, and are using the same configuration. But when they do start to request tokens, they do it several times over the space of 30 minutes or so, then stop again. It does sound like it could be a timeout issue - i.e. an internal timeout value is too short.
Completely understand the reverse engineering angle - I had to do that for an obsolete PLC system once. It used token bus, so I had to build the hardware interface and MAC before I could even start decoding packets on the wire. Managed to integrate it into Linux in the end, but discovered some horrible bugs in the PLC code in the process!
OK, I'm trying the updated source now - haven't changed the config file yet, as I wanted to make a single change at a time so that we can tie down exactly what has worked, and what hasn't. In answer to your question: the asterisk machine does not function as a firewall or default gateway for the phones, but does function as the DNS (unused) and TFTP server.
Good idea, to make one change at a time !
My experience is that all of the phones in this system exhibit this behaviour from time to time - both the remote ones and the local ones. There doesn't seem to be any pattern to it - they all definitely have the same firmware, and are using the same configuration. But when they do start to request tokens, they do it several times over the space of 30 minutes or so, then stop again. It does sound like it could be a timeout issue - i.e. an internal timeout value is too short.
Ok I thought it was only some of them :-). Maybe set the cnf.xml connectionMonitorDuration to 120 and update your sccp.conf keepalive to 120 and see if that makes a difference. From the pcap files you provided, the phones do send a keealive and we respond to them in time, within the next keepalive window, they already send the TokenRequest, so it's somewhat unlikely we are seeing a keepalive issue (except if the phones never received the keepAliveAck from the server (which would be difficult to verify). Would be nice if we could get syslog up and running, to make sure. I have tried to make some syslog related changes, for example switching to SIP mode, but even then it did not work :-(
Completely understand the reverse engineering angle - I had to do that for an obsolete PLC system once. It used token bus, so I had to build the hardware interface and MAC before I could even start decoding packets on the wire. Managed to integrate it into Linux in the end, but discovered some horrible bugs in the PLC code in the process!
Doesn't sound like you had a lot of fun getting that to work :-)
I'm running your new code version at the moment - has been running for an hour now with precisely zero token requests. Which in itself is strange - I would have expected to see at least one phone send a request by now. I'll keep an eye on it and post logs if/when it does.
Ok :-) BTW: We also have a chat server at https://gitter.im/chan-sccp/chan-sccp which might make frequent communication a little easier ?
Curiouser and Curiouser :-) (Free after Alice)
@patrickkeys Any progress ?