lita-slack icon indicating copy to clipboard operation
lita-slack copied to clipboard

keep getting disconnected from Slack and crashes

Open codewithtyler opened this issue 7 years ago • 7 comments

I've noticed for quite a while now that our lita bot goes offline at random times. We tend to say that the bot is asleep, but this past week we changed our Heroku account over so that it's no longer on the legacy dynos. Our app now runs 24/7 so that we know it's not an issue with Heroku putting the app to sleep.

I started going through our logs for the past 24 hours and at first I thought I saw a pattern where each time the app crashed the log right before it would say Error with code 1 received from Slack: Socket URL has expired but as I went through more crashes I started to notice that a few didn't have that at all. I'm not sure else what to try and would gladly take any suggestions you might have! I'll post snippets from our crashes below to help you get an idea what I've been seeing.

424 <134>1 2016-10-15T17:06:53+00:00 app heroku-redis - - source=REDIS sample#active-connections=2 sample#load-avg-1m=0.005 sample#load-avg-5m=0.04 sample#load-avg-15m=0.06 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15405632.0kB sample#memory-free=12757316.0kB sample#memory-cached=1028088.0kB sample#memory-redis=1118792bytes sample#hit-rate=0.99753 sample#evicted-keys=0
» 15 Oct 2016 12:08:04.369 103 <45>1 2016-10-15T17:08:04.152774+00:00 heroku web.1 - - Cycling
» 15 Oct 2016 12:08:04.444 129 <45>1 2016-10-15T17:08:04.153440+00:00 heroku web.1 - - State changed from up to starting
» 15 Oct 2016 12:08:06.541 136 <45>1 2016-10-15T17:08:06.351778+00:00 heroku web.1 - - Starting process with command `bin/lita`
» 15 Oct 2016 12:08:07.073 131 <45>1 2016-10-15T17:08:06.705163+00:00 heroku web.1 - - Stopping all processes with SIGTERM
» 15 Oct 2016 12:08:07.556 150 <190>1 2016-10-15T17:08:07.318189+00:00 app web.1 - - [2016-10-15 17:08:07 UTC] INFO: Disconnected from Slack.
» 15 Oct 2016 12:08:07.744 126 <45>1 2016-10-15T17:08:07.446928+00:00 heroku web.1 - - Process exited with status 143
» 15 Oct 2016 12:08:09.450 292 <190>1 2016-10-15T17:08:09.004140+00:00 app web.1 - - [2016-10-15 17:08:09 UTC] WARN: Lita::Handler.default_config is deprecated and will be removed in Lita 5.0. Use Lita::Handler.config instead. The method was found defined in the static_meme handler.
» 15 Oct 2016 12:08:10.308 129 <45>1 2016-10-15T17:08:10.066086+00:00 heroku web.1 - - State changed from starting to up
424 <134>1 2016-10-15T17:08:23+00:00 app heroku-redis - - source=REDIS sample#active-connections=2 sample#load-avg-1m=0.07 sample#load-avg-5m=0.06 sample#load-avg-15m=0.065 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15405632.0kB sample#memory-free=12757480.0kB sample#memory-cached=1028088.0kB sample#memory-redis=1155680bytes sample#hit-rate=0.99754 sample#evicted-keys=0
» 15 Oct 2016 12:09:49.437 150 <190>1 2016-10-15T17:09:49.217754+00:00 app web.1 - - [2016-10-15 17:09:49 UTC] INFO: Disconnected from Slack.
» 15 Oct 2016 12:09:49.534 124 <45>1 2016-10-15T17:09:49.322137+00:00 heroku web.1 - - Process exited with status 0
» 15 Oct 2016 12:09:49.665 128 <45>1 2016-10-15T17:09:49.332577+00:00 heroku web.1 - - State changed from up to crashed
» 15 Oct 2016 12:09:49.731 134 <45>1 2016-10-15T17:09:49.333476+00:00 heroku web.1 - - State changed from crashed to starting
» 15 Oct 2016 12:09:53.094 136 <45>1 2016-10-15T17:09:52.737953+00:00 heroku web.1 - - Starting process with command `bin/lita`
150 <190>1 2016-10-15T17:09:49.217754+00:00 app web.1 - - [2016-10-15 17:09:49 UTC] INFO: Disconnected from Slack.
» 15 Oct 2016 12:09:49.534 124 <45>1 2016-10-15T17:09:49.322137+00:00 heroku web.1 - - Process exited with status 0
» 15 Oct 2016 12:09:49.665 128 <45>1 2016-10-15T17:09:49.332577+00:00 heroku web.1 - - State changed from up to crashed
» 15 Oct 2016 12:09:49.731 134 <45>1 2016-10-15T17:09:49.333476+00:00 heroku web.1 - - State changed from crashed to starting
» 15 Oct 2016 12:09:53.094 136 <45>1 2016-10-15T17:09:52.737953+00:00 heroku web.1 - - Starting process with command `bin/lita`
» 15 Oct 2016 12:09:55.135 129 <45>1 2016-10-15T17:09:54.927255+00:00 heroku web.1 - - State changed from starting to up
» 15 Oct 2016 12:09:55.142 292 <190>1 2016-10-15T17:09:54.806043+00:00 app web.1 - - [2016-10-15 17:09:54 UTC] WARN: Lita::Handler.default_config is deprecated and will be removed in Lita 5.0. Use Lita::Handler.config instead. The method was found defined in the static_meme handler.
» 15 Oct 2016 12:10:36.526 426 <134>1 2016-10-15T17:09:53+00:00 app heroku-redis - - source=REDIS sample#active-connections=1 sample#load-avg-1m=0.045 sample#load-avg-5m=0.055 sample#load-avg-15m=0.065 sample#read-iops=0 sample#write-iops=0 sample#memory-total=15405632.0kB sample#memory-free=12757132.0kB sample#memory-cached=1028092.0kB sample#memory-redis=1097920bytes sample#hit-rate=0.99754 sample#evicted-keys=0
» 15 Oct 2016 12:10:44.305 150 <190>1 2016-10-15T17:10:44.105685+00:00 app web.1 - - [2016-10-15 17:10:44 UTC] INFO: Disconnected from Slack.
» 15 Oct 2016 12:10:44.420 124 <45>1 2016-10-15T17:10:44.200057+00:00 heroku web.1 - - Process exited with status 0
» 15 Oct 2016 12:10:45.179 128 <45>1 2016-10-15T17:10:44.219112+00:00 heroku web.1 - - State changed from up to crashed
198 <45>1 2016-10-15T17:41:12.839546+00:00 heroku api - - Starting process with command `curl https://r2d8.herokuapp.com/lita/info` by [email protected]
» 15 Oct 2016 12:41:15.866 178 <45>1 2016-10-15T17:41:15.707360+00:00 heroku scheduler.9953 - - Starting process with command `curl https://r2d8.herokuapp.com/lita/info`
» 15 Oct 2016 12:41:16.560 138 <45>1 2016-10-15T17:41:16.332156+00:00 heroku scheduler.9953 - - State changed from starting to up
» 15 Oct 2016 12:41:18.123 182 <190>1 2016-10-15T17:41:17.753509+00:00 app scheduler.9953 - - % Total % Received % Xferd Average Speed Time Time Time Current
» 15 Oct 2016 12:41:18.125 284 <158>1 2016-10-15T17:41:17.857574+00:00 heroku router - - at=info method=GET path="/lita/info" host=r2d8.herokuapp.com request_id=67444cad-79fd-4aa8-ae84-3700452d851e fwd="54.198.129.199" dyno=web.1 connect=1ms service=81ms status=200 bytes=235
» 15 Oct 2016 12:41:18.191 138 <45>1 2016-10-15T17:41:17.930765+00:00 heroku scheduler.9953 - - State changed from up to complete
» 15 Oct 2016 12:41:18.198 180 <190>1 2016-10-15T17:41:17.753596+00:00 app scheduler.9953 - - Dload Upload Total Spent Left Speed
» 15 Oct 2016 12:41:18.198 261 <190>1 2016-10-15T17:41:17.850569+00:00 app scheduler.9953 - - 
 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 144 100 144 0 0 1481 0 --:--:-- --:--:-- --:--:-- 1484
» 15 Oct 2016 12:41:18.198 247 <190>1 2016-10-15T17:41:17.910440+00:00 app scheduler.9953 - - {"adapter":"slack","lita_version":"4.7.1","redis_memory_usage":"1.05M","redis_version":"3.0.7","robot_mention_name":"r2d8","robot_name":"R2-D8"}
» 15 Oct 2016 12:41:18.198 133 <45>1 2016-10-15T17:41:17.914868+00:00 heroku scheduler.9953 - - Process exited with status 0
» 15 Oct 2016 12:41:44.987 188 <190>1 2016-10-15T17:41:44.820086+00:00 app web.1 - - [2016-10-15 17:41:44 UTC] ERROR: Error with code 1 received from Slack: Socket URL has expired
» 15 Oct 2016 12:41:45.063 150 <190>1 2016-10-15T17:41:44.824342+00:00 app web.1 - - [2016-10-15 17:41:44 UTC] INFO: Disconnected from Slack.
» 15 Oct 2016 12:41:45.265 128 <45>1 2016-10-15T17:41:44.954125+00:00 heroku web.1 - - State changed from up to crashed
» 15 Oct 2016 12:41:45.746 124 <45>1 2016-10-15T17:41:44.935228+00:00 heroku web.1 - - Process exited with status 0
178 <45>1 2016-10-15T17:41:15.707360+00:00 heroku scheduler.9953 - - Starting process with command `curl https://r2d8.herokuapp.com/lita/info`
» 15 Oct 2016 12:41:16.560 138 <45>1 2016-10-15T17:41:16.332156+00:00 heroku scheduler.9953 - - State changed from starting to up
» 15 Oct 2016 12:41:18.123 182 <190>1 2016-10-15T17:41:17.753509+00:00 app scheduler.9953 - - % Total % Received % Xferd Average Speed Time Time Time Current
» 15 Oct 2016 12:41:18.125 284 <158>1 2016-10-15T17:41:17.857574+00:00 heroku router - - at=info method=GET path="/lita/info" host=r2d8.herokuapp.com request_id=67444cad-79fd-4aa8-ae84-3700452d851e fwd="54.198.129.199" dyno=web.1 connect=1ms service=81ms status=200 bytes=235
» 15 Oct 2016 12:41:18.191 138 <45>1 2016-10-15T17:41:17.930765+00:00 heroku scheduler.9953 - - State changed from up to complete
» 15 Oct 2016 12:41:18.198 180 <190>1 2016-10-15T17:41:17.753596+00:00 app scheduler.9953 - - Dload Upload Total Spent Left Speed
» 15 Oct 2016 12:41:18.198 261 <190>1 2016-10-15T17:41:17.850569+00:00 app scheduler.9953 - - 
 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 144 100 144 0 0 1481 0 --:--:-- --:--:-- --:--:-- 1484
» 15 Oct 2016 12:41:18.198 247 <190>1 2016-10-15T17:41:17.910440+00:00 app scheduler.9953 - - {"adapter":"slack","lita_version":"4.7.1","redis_memory_usage":"1.05M","redis_version":"3.0.7","robot_mention_name":"r2d8","robot_name":"R2-D8"}
» 15 Oct 2016 12:41:18.198 133 <45>1 2016-10-15T17:41:17.914868+00:00 heroku scheduler.9953 - - Process exited with status 0
» 15 Oct 2016 12:41:44.987 188 <190>1 2016-10-15T17:41:44.820086+00:00 app web.1 - - [2016-10-15 17:41:44 UTC] ERROR: Error with code 1 received from Slack: Socket URL has expired
» 15 Oct 2016 12:41:45.063 150 <190>1 2016-10-15T17:41:44.824342+00:00 app web.1 - - [2016-10-15 17:41:44 UTC] INFO: Disconnected from Slack.
» 15 Oct 2016 12:41:45.265 128 <45>1 2016-10-15T17:41:44.954125+00:00 heroku web.1 - - State changed from up to crashed
» 15 Oct 2016 12:41:45.746 124 <45>1 2016-10-15T17:41:44.935228+00:00 heroku web.1 - - Process exited with status 0
134 <45>1 2016-10-15T19:29:07.078792+00:00 heroku web.1 - - State changed from crashed to starting
» 15 Oct 2016 14:29:09.226 136 <45>1 2016-10-15T19:29:08.980027+00:00 heroku web.1 - - Starting process with command `bin/lita`
» 15 Oct 2016 14:29:11.774 292 <190>1 2016-10-15T19:29:11.369858+00:00 app web.1 - - [2016-10-15 19:29:11 UTC] WARN: Lita::Handler.default_config is deprecated and will be removed in Lita 5.0. Use Lita::Handler.config instead. The method was found defined in the static_meme handler.
» 15 Oct 2016 14:29:12.811 129 <45>1 2016-10-15T19:29:12.658457+00:00 heroku web.1 - - State changed from starting to up
» 15 Oct 2016 14:29:07.295 134 <45>1 2016-10-15T19:29:07.078792+00:00 heroku web.1 - - State changed from crashed to starting
» 15 Oct 2016 14:29:09.226 136 <45>1 2016-10-15T19:29:08.980027+00:00 heroku web.1 - - Starting process with command `bin/lita`
» 15 Oct 2016 14:29:11.774 292 <190>1 2016-10-15T19:29:11.369858+00:00 app web.1 - - [2016-10-15 19:29:11 UTC] WARN: Lita::Handler.default_config is deprecated and will be removed in Lita 5.0. Use Lita::Handler.config instead. The method was found defined in the static_meme handler.
» 15 Oct 2016 14:29:12.811 129 <45>1 2016-10-15T19:29:12.658457+00:00 heroku web.1 - - State changed from starting to up
» 15 Oct 2016 14:29:53.675 145 <190>1 2016-10-15T19:29:53.290508+00:00 app web.1 - - [2016-10-15 19:29:53 UTC] INFO: Connected to Slack.

codewithtyler avatar Oct 16 '16 00:10 codewithtyler

Here's an update on what I've found so far. We set the logging level set to debug and I let the bot run for 3 days so that I'd have plenty of data to go through. Heroku says there were 37 crashes but I was only able to find 27 in the log. I have no idea where it got the other 10 crashes from.

Anyways, our bot disconnected from Slack 30 times. 26 out of those 30 times there was a crash immediately following the disconnect. 3 of the disconnects occurred when the app was starting up. 2 of the disconnects occurred immediately after the app finished connecting to the Slack Real Time Messaging API. Some of these last 5 disconnections I mentioned also had an error about the socket URL expiring.

Is there anything else that I can do on my end to help diagnose this further?

codewithtyler avatar Oct 22 '16 04:10 codewithtyler

I'm having this same issue.

frozenfoxx avatar Dec 28 '16 17:12 frozenfoxx

We are also seeing this issue - nothing of note in the logs except the Disconnect message.

rsanheim avatar May 18 '18 14:05 rsanheim

I am seeing something similar as well:

[2018-05-18 18:12:07 UTC] DEBUG: Connecting to the Slack Real Time Messaging API.
[2018-05-18 18:12:08 UTC] DEBUG: Connected to the Slack Real Time Messaging API.
[2018-05-18 18:12:08 UTC] ERROR: Error with code 1 received from Slack: Socket URL has expired
[2018-05-18 18:12:08 UTC] INFO: Disconnected from Slack.
[2018-05-18 18:12:08 UTC] DEBUG: Closing connection to the Slack Real Time Messaging API.

then a crash

this message appears approx 1min after startup

Twister915 avatar May 18 '18 18:05 Twister915

@rsanheim I was able to fix it, but I broke contracts (the user will not be auto-created, nor will the rooms be auto-created).

You can check my repo here: https://github.com/Twister915/lita-slack

I will do a pull request once I can implement it in such a way that it still creates the users & rooms if they don't exist.

For the time being, I was using it like this in my gemfile: gem "lita-slack", :git => 'https://github.com/Twister915/lita-slack.git', :ref => '8ab40a0'

Hope you find that useful

Twister915 avatar May 18 '18 19:05 Twister915

@Twister915 Have you had any luck getting this to work in a way that lets you still create the users & rooms? I reviewed your branch after your last comment. I don't have everything set up currently to where I could debug the application, but I'm happy to help where I can. This bug has been a major issue for our Slack bot and causes it to be down most of the time.

codewithtyler avatar May 30 '18 15:05 codewithtyler

We're suddenly getting this quite frequently as well. This thread implies that it's because our Slack size has grown significantly, and using rtm.connect instead of rtm.start might help. I don't see an option for that in litabot, unfortunately. Thanks for sharing your fix, @Twister915, that does help. It changed our lita karma levels somehow, though!

marilyn avatar Jul 10 '20 18:07 marilyn