Henchman sometimes fails with "unexpected end of JSON input" Error

Open baskaran-md opened this issue 10 years ago • 16 comments

02:06:39 echo '{"cmd":"yum install -y edge-router --enablerepo=epel --nogpgcheck","loglevel":"debug"}' | /bin/bash -c 'sudo -H -u root ${HOME}/.henchman/shell'
02:06:53 unexpected end of JSON input

No stack trace or error log found!

baskaran-md avatar Dec 02 '15 18:12 baskaran-md

Is this being run with the latest release?

jlin21 avatar Dec 02 '15 18:12 jlin21

Actually still fixing the latest release, something is up with the copy module

jlin21 avatar Dec 02 '15 18:12 jlin21

should be ready to go now

jlin21 avatar Dec 02 '15 18:12 jlin21

00:04:17.471 
00:04:29.129 time="2015-12-03T02:27:50Z" level=error msg="Error running task 'Install Apigee RPM'" error="While in exec_module :: While unmarshalling task results :: unexpected end of JSON input" host=10.17.6.32 plan="Install Router" task="Install Apigee RPM" 

baskaran-md avatar Dec 04 '15 15:12 baskaran-md

The issue is that some of the data returned from ssh.Exec is being dropped. If the output should be {"status": "changed", "msg": "hello world"}, it sometimes comes back as {"status": "changed", "msg": "hello. The truncated output triggers the JSON input error because it is not valid JSON. The temporary fix is to append "} to the end of the output. For the time being, this means some outputs will be incomplete, but the user will be notified when that happens.
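
The failure mode described here can be reproduced in isolation. The following is a minimal sketch, not Henchman's actual code: the `taskResult` type and the `parseWithPatch` helper are hypothetical names, and the retry simply mirrors the "append `"}`" workaround mentioned above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskResult is a hypothetical stand-in for the module output quoted above.
type taskResult struct {
	Status string `json:"status"`
	Msg    string `json:"msg"`
}

// parseWithPatch unmarshals raw. If that fails, it retries once with `"}`
// appended and reports whether the patch was needed, so the caller can
// warn the user that the output may be incomplete.
func parseWithPatch(raw string) (taskResult, bool, error) {
	var res taskResult
	if err := json.Unmarshal([]byte(raw), &res); err == nil {
		return res, false, nil
	}
	res = taskResult{}
	err := json.Unmarshal([]byte(raw+`"}`), &res)
	return res, err == nil, err
}

func main() {
	// Output cut off in the middle of the "msg" string.
	truncated := `{"status": "changed", "msg": "hello`

	var direct taskResult
	// prints: direct: unexpected end of JSON input
	fmt.Println("direct:", json.Unmarshal([]byte(truncated), &direct))

	res, patched, err := parseWithPatch(truncated)
	fmt.Printf("patched=%v msg=%q err=%v\n", patched, res.Msg, err)
}
```

The patch only helps when the cut happens inside the last string value; a cut anywhere else would still fail to parse, which is one reason it can only be a stopgap.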

jlin21 avatar Dec 07 '15 07:12 jlin21

Notes:

  • I have tried removing concurrency in plan.go
  • I have used ssh.go's methods in several ways, including stdoutPipe

This could be an underlying issue with ssh.go

jlin21 avatar Dec 18 '15 23:12 jlin21

Any help on this, @madhurranjan / @sudharsh? This keeps happening, and I couldn't proceed with the other tasks in the playbook unless I set "ignore_errors: true" for that task.

baskaran-md avatar Dec 29 '15 16:12 baskaran-md

@baskaran-md Does the temp fix prevent henchman from throwing the "End of JSON Input" error? It should. If it still happens, please let me know. We're still trying to find the root cause of the issue.

jlin21 avatar Dec 30 '15 00:12 jlin21

I have removed the explicit appending of '}' when this occurs, since more log output is needed.

sudharsh avatar Jan 06 '16 21:01 sudharsh

@sudharsh did you get a chance to test this yet?

jlin21 avatar Jan 09 '16 06:01 jlin21

Here's what we know so far about how to reproduce this (it happens 1 in 3-5 times):

1.) The target node has to be:
  • amazon-linux
No problems so far on others, including centos 7.x, centos 6.5 and rhel 7.2.
2.) Python version is 2.7.10.
3.) There should be at least one call to the Popen() constructor. It doesn't matter whether this invocation is actually used or is in the critical path. The moment a call to Popen() is made, things become unreliable.

sudharsh avatar Jan 19 '16 21:01 sudharsh

fat-fingered 'Close and comment'

sudharsh avatar Jan 19 '16 21:01 sudharsh

Tried downgrading Python from 2.7.10 to 2.7.5 on amazon linux. No dice.

sudharsh avatar Jan 19 '16 23:01 sudharsh

Here's another clue, https://bugs.python.org/issue19612

sudharsh avatar Feb 09 '16 01:02 sudharsh

The issue was resolved when the shell module was ported to Go.
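
For readers wondering what a Go shell module might look like, here is a rough sketch under stated assumptions, not the actual ported module. It takes the `cmd`/`loglevel` parameters seen in the log at the top of this issue and returns the `status`/`msg` shape quoted earlier; the `params`, `result`, and `runShell` names and the `"error"` status value are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// params mirrors the JSON piped to the module in the original report.
type params struct {
	Cmd      string `json:"cmd"`
	LogLevel string `json:"loglevel"`
}

// result uses the status/msg shape quoted earlier in the thread.
// The "error" status value is an assumption made for this sketch.
type result struct {
	Status string `json:"status"`
	Msg    string `json:"msg"`
}

// runShell decodes the parameters, runs the command through /bin/sh,
// and collects stdout and stderr into a single message.
func runShell(raw []byte) result {
	var p params
	if err := json.Unmarshal(raw, &p); err != nil {
		return result{Status: "error", Msg: err.Error()}
	}
	out, err := exec.Command("/bin/sh", "-c", p.Cmd).CombinedOutput()
	if err != nil {
		return result{Status: "error", Msg: string(out) + err.Error()}
	}
	return result{Status: "changed", Msg: string(out)}
}

func main() {
	// A real module would read this payload from stdin; it is inlined
	// here so the sketch runs on its own.
	raw := []byte(`{"cmd":"echo hello world","loglevel":"debug"}`)

	// Encode the whole result in one call so stdout carries one complete
	// JSON document.
	if err := json.NewEncoder(os.Stdout).Encode(runShell(raw)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because the command output is gathered in memory and encoded once, there is no Python subprocess layer between the command and the JSON written to stdout.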

jlin21 avatar Feb 18 '16 02:02 jlin21

The Go shell module will break once the output reaches 1.4 mil in length.

jlin21 avatar Feb 22 '16 02:02 jlin21