eye stuck up and resque not processing

Open nguyenchiencong opened this issue 9 years ago • 15 comments

From time to time, my resque tasks stop processing and eye info gives: eye timed out without responding...

Any advice? I think it's related to the latest updates as it has never happened before.

Thx

nguyenchiencong avatar Jan 19 '16 07:01 nguyenchiencong

More info: ruby 2.3.0

nguyenchiencong avatar Jan 19 '16 07:01 nguyenchiencong

The timeout usually happens when eye has been moved to swap; maybe you have too little memory on the server. What was the last eye version that worked? Also, with 1.9.3 eye used less memory.

kostya avatar Jan 19 '16 12:01 kostya

Hi, indeed we use a lot of memory but that should not happen like that. The last ok version was eye (0.6.4) with ruby 2.2.2.

nguyenchiencong avatar Jan 20 '16 12:01 nguyenchiencong

Yes, it should not. Maybe you leave too little free memory on the server (how much?). How much memory does the eye daemon use? Is eye in swap? (cat /proc/{PID}/status | grep Vm). Since 0.6.4, eye updated its celluloid dependency (0.15 -> 0.17), which uses more memory.
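For repeated checks, the `grep Vm` step can be scripted. This is only a minimal sketch, not part of eye: `parse_vm_status` and `swapped?` are hypothetical helper names, and the example reads the script's own PID, which you would replace with the PID of the eye daemon.

```ruby
# Parse the "Vm*" lines of a /proc/<pid>/status dump into a Hash,
# e.g. { "VmRSS" => 74416, "VmSwap" => 0 } (values are in kB).
def parse_vm_status(text)
  text.each_line.with_object({}) do |line, out|
    m = line.match(/\A(Vm\w+):\s+(\d+)\s+kB/)
    out[m[1]] = m[2].to_i if m
  end
end

# True when the process has any pages in swap (VmSwap > 0).
def swapped?(vm)
  vm.fetch('VmSwap', 0) > 0
end

# Assumption: replace Process.pid with the PID of the eye daemon.
status_path = "/proc/#{Process.pid}/status"
if File.readable?(status_path)
  vm = parse_vm_status(File.read(status_path))
  puts "VmRSS=#{vm['VmRSS']} kB, in swap: #{swapped?(vm)}"
end
```

Running this from cron while the server is under load would show whether VmSwap ever becomes non-zero before eye stops responding.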

kostya avatar Jan 20 '16 14:01 kostya

Also, what is in the eye log when the problem happens?

kostya avatar Jan 20 '16 14:01 kostya

At least you can try eye v '0.8.celluloid15', which is like 0.8 but uses the old celluloid (so it should behave the same as 0.6.4).

kostya avatar Jan 20 '16 14:01 kostya

Thx a lot for the quick reply. I will try that and get back to you.

nguyenchiencong avatar Jan 21 '16 02:01 nguyenchiencong

I just checked this in the normal state (cat /proc/{PID}/status | grep Vm). I will check it again when the server is under heavy memory load.

VmPeak:  3421708 kB
VmSize:  3381256 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     74848 kB
VmRSS:     74416 kB
VmData:  3314108 kB
VmStk:      8192 kB
VmExe:      2976 kB
VmLib:      6460 kB
VmPTE:      1708 kB
VmPMD:        28 kB
VmSwap:        0 kB

nguyenchiencong avatar Jan 21 '16 03:01 nguyenchiencong

eye info still hangs (i.e. eye timed out without responding...) with these values (i.e. no swap) and plenty of memory left (at least 40-50%):

VmPeak:  3421708 kB
VmSize:  3374600 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     74912 kB
VmRSS:     74520 kB
VmData:  3307452 kB
VmStk:      8192 kB
VmExe:      2976 kB
VmLib:      6460 kB
VmPTE:      1692 kB
VmPMD:        28 kB
VmSwap:        0 kB

I will try to revert to an earlier version of eye.

nguyenchiencong avatar Jan 21 '16 03:01 nguyenchiencong

Looks like this is not a memory problem. Maybe the load average or disk usage on the server is so high that the eye client responds slowly. Try EYE_CLIENT_TIMEOUT=100 eye i. If this does not work either, you should look in the eye log to find where the problem is.
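To check the load average hypothesis at the moment eye stops responding, a rough sketch like the following could be used. It is not part of eye: `parse_loadavg` and `overloaded?` are hypothetical helpers, and treating "1-minute load above the CPU count" as overloaded is only a rule of thumb assumed here.

```ruby
require 'etc'

# /proc/loadavg starts with the 1, 5 and 15 minute load averages,
# e.g. "0.20 0.18 0.12 1/80 11206".
def parse_loadavg(text)
  one, five, fifteen = text.split.first(3).map { |v| Float(v) }
  { one: one, five: five, fifteen: fifteen }
end

# Assumed heuristic: the box is overloaded when the 1-minute load
# exceeds the number of CPUs.
def overloaded?(load, cores = Etc.nprocessors)
  load[:one] > cores
end

if File.readable?('/proc/loadavg')
  load = parse_loadavg(File.read('/proc/loadavg'))
  puts "load: #{load.inspect}, overloaded: #{overloaded?(load)}"
end
```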

kostya avatar Jan 21 '16 12:01 kostya

Last time I checked EYE_CLIENT_TIMEOUT=100 eye i, it still hung. Just to let you know that '0.8.celluloid15' seems to work and the problem no longer happens.

nguyenchiencong avatar Jan 21 '16 13:01 nguyenchiencong

If 0.8.celluloid15 works, nice. Please help to find the problem in 0.8.

Add to the config:

Eye.config do
  logger '/tmp/eye.log'
  logger_level Logger::DEBUG
end

Then find in the log where the problem was; maybe there is some exception or something similar.
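A DEBUG-level log can get large, so a first pass over it could be filtered with something like the sketch below. The helper name `suspicious_lines` and the keyword pattern are assumptions for illustration, not eye's documented log format; adjust the pattern to whatever the log actually contains.

```ruby
# Return [line_number, text] pairs for lines that look like errors.
# The default pattern is a guess, not eye's documented log format.
def suspicious_lines(lines, pattern = /error|exception|fatal/i)
  lines.each_with_index
       .select { |line, _idx| line.match?(pattern) }
       .map { |line, idx| [idx + 1, line.chomp] }
end

log_path = '/tmp/eye.log' # the path set by `logger` in the config above
if File.readable?(log_path)
  suspicious_lines(File.foreach(log_path)).each do |num, text|
    puts "#{num}: #{text}"
  end
end
```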

kostya avatar Jan 21 '16 14:01 kostya

Sure, I'll do that tomorrow when the service is no longer critical.

nguyenchiencong avatar Jan 21 '16 15:01 nguyenchiencong

Sorry for the late reply. I don't think I can risk my production environment this time. Do you have any other way to replicate this problem?

nguyenchiencong avatar Feb 02 '16 11:02 nguyenchiencong

So 0.8.celluloid15 works fine for a long time, and 0.8 has problems after some hours (on ruby 2.3.0)? The only diff between these versions is https://github.com/kostya/eye/compare/v0.8.celluloid15?expand=1 (updated celluloid version).

Maybe you can reproduce it in some staging environment?

kostya avatar Feb 02 '16 11:02 kostya