
Error on lvs-snapshots

Open hagay3 opened this issue 8 years ago • 25 comments

Hi, I'm facing an issue with orchestrator-agent when fetching lvs-snapshots. orchestrator fails to show the node's snapshots, and I can see the following in the orchestrator-agent log file:

[martini] Started GET /api/lvs-snapshots for 10.**:52318
2016-06-17 08:05:23 ERROR exit status 127

I also tried calling the orchestrator-agent API myself, and the request fails there as well. Request: host:3002/api/lvs-snapshots?token=......

{ Code: "ERROR", Message: "exit status 127", Details: null }

Restarting orchestrator-agent brings this request back to life, but after a while it becomes unavailable again.
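For anyone scripting this check, here is a minimal Go sketch that decodes an error body of the shape shown above. The type `agentResponse` and the helper `parseResponse` are hypothetical names, not taken from orchestrator-agent. Only the three field names come from the response in this report, and the quoted-key JSON literal is an assumption about the raw wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// agentResponse mirrors the three fields visible in the reported error body.
// Hypothetical type for illustration; not taken from the agent's source.
type agentResponse struct {
	Code    string
	Message string
	Details interface{}
}

// parseResponse decodes a JSON body into an agentResponse.
func parseResponse(body []byte) (agentResponse, error) {
	var r agentResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	// Assumed raw form of the body shown in the report.
	body := []byte(`{"Code":"ERROR","Message":"exit status 127","Details":null}`)
	r, err := parseResponse(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if r.Code == "ERROR" {
		fmt.Println("agent reported:", r.Message)
	}
}
```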

hagay3 avatar Jun 17 '16 12:06 hagay3

@shlomi-noach ?

hagay3 avatar Jun 19 '16 17:06 hagay3

Can you --debug --trace and copy output?

shlomi-noach avatar Jun 19 '16 18:06 shlomi-noach

Do you maybe mean -stack?

Usage of ./orchestrator-agent:
  -config="": config file name
  -debug=false: debug mode (very verbose)
  -stack=false: add stack trace upon error
  -verbose=false: verbose

hagay3 avatar Jun 19 '16 18:06 hagay3

Yes, you are right: --stack

shlomi-noach avatar Jun 19 '16 18:06 shlomi-noach

It seems the error does not come back once the trace is enabled. I will check across the whole cluster and let you know.

hagay3 avatar Jun 22 '16 18:06 hagay3

--stack should not change behavior. It merely prints a stack trace when an error occurs. It does not compile differently or anything.

shlomi-noach avatar Jun 22 '16 18:06 shlomi-noach

Yes, this is why I'm testing it across the whole cluster, waiting for the error to reappear.

hagay3 avatar Jun 22 '16 18:06 hagay3

Hi, here is the error log

[martini] Started GET /api/lvs-snapshots for ****:57694
2016-06-23 10:59:20 DEBUG execCmd: lvs --noheading -o lv_name,vg_name,lv_path,snap_percent
2016-06-23 10:59:20 ERROR exit status 127
/root/go/src/github.com/outbrain/golib/log/log.go:178 (0x43d071)
/root/go/src/github.com/outbrain/golib/log/log.go:224 (0x43d4ca)
/root/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/osagent/osagent.go:106 (0x4f5fd2)
/root/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/osagent/osagent.go:156 (0x4f675b)
/root/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/http/api.go:104 (0x50acb0)
/root/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/http/api.go:456 (0x510ace)
/usr/lib/go/src/pkg/runtime/asm_amd64.s:339 (0x424582)
/usr/lib/go/src/pkg/reflect/value.go:474 (0x52d82b)
/usr/lib/go/src/pkg/reflect/value.go:345 (0x52c91d)
/root/go/src/github.com/codegangsta/inject/inject.go:102 (0x5bf904)
/root/go/src/github.com/go-martini/martini/env.go:1 (0x5039fc)
/root/go/src/github.com/go-martini/martini/router.go:408 (0x501074)
/root/go/src/github.com/go-martini/martini/router.go:285 (0x500492)
/root/go/src/github.com/go-martini/martini/router.go:132 (0x4ff4e2)
/root/go/src/github.com/go-martini/martini/martini.go:125 (0x501aa0)
/usr/lib/go/src/pkg/runtime/asm_amd64.s:340 (0x4245e2)
/usr/lib/go/src/pkg/reflect/value.go:474 (0x52d82b)
/usr/lib/go/src/pkg/reflect/value.go:345 (0x52c91d)
/root/go/src/github.com/codegangsta/inject/inject.go:102 (0x5bf904)
/root/go/src/github.com/go-martini/martini/martini.go:179 (0x4fd982)
/root/go/src/github.com/go-martini/martini/martini.go:170 (0x4fd8db)
/root/go/src/github.com/martini-contrib/gzip/gzip.go:40 (0x507962)
/root/go/src/github.com/martini-contrib/gzip/gzip.go:56 (0x507a05)
/usr/lib/go/src/pkg/runtime/asm_amd64.s:340 (0x4245e2)
/usr/lib/go/src/pkg/reflect/value.go:474 (0x52d82b)
/usr/lib/go/src/pkg/reflect/value.go:345 (0x52c91d)
/root/go/src/github.com/codegangsta/inject/inject.go:102 (0x5bf904)
/root/go/src/github.com/go-martini/martini/martini.go:179 (0x4fd982)
/root/go/src/github.com/go-martini/martini/martini.go:170 (0x4fd8db)
/root/go/src/github.com/go-martini/martini/recovery.go:142 (0x502036)
/root/go/src/github.com/go-martini/martini/martini.go:179 (0x4fd982)
/root/go/src/github.com/go-martini/martini/martini.go:170 (0x4fd8db)
/root/go/src/github.com/go-martini/martini/recovery.go:142 (0x502036)
/usr/lib/go/src/pkg/runtime/asm_amd64.s:339 (0x424582)
/usr/lib/go/src/pkg/reflect/value.go:474 (0x52d82b)
/usr/lib/go/src/pkg/reflect/value.go:345 (0x52c91d)
/root/go/src/github.com/codegangsta/inject/inject.go:102 (0x5bf904)
/root/go/src/github.com/go-martini/martini/martini.go:179 (0x4fd982)
/root/go/src/github.com/go-martini/martini/martini.go:170 (0x4fd8db)
/root/go/src/github.com/go-martini/martini/logger.go:25 (0x501862)
/usr/lib/go/src/pkg/runtime/asm_amd64.s:340 (0x4245e2)
/usr/lib/go/src/pkg/reflect/value.go:474 (0x52d82b)
/usr/lib/go/src/pkg/reflect/value.go:345 (0x52c91d)
/root/go/src/github.com/codegangsta/inject/inject.go:102 (0x5bf904)
/root/go/src/github.com/go-martini/martini/martini.go:179 (0x4fd982)
/root/go/src/github.com/go-martini/martini/martini.go:75 (0x4fccd3)
/usr/lib/go/src/pkg/net/http/server.go:1597 (0x4e222e)
/usr/lib/go/src/pkg/net/http/server.go:1167 (0x4e01a7)
/usr/lib/go/src/pkg/runtime/proc.c:1394 (0x417a50)
[martini] Completed 500 Internal Server Error in 8.186557ms

hagay3 avatar Jun 23 '16 13:06 hagay3

OK, this doesn't add much insight. It's just that the lvs command returns with an error code. You say restarting orchestrator-agent brings this back to life. And without restarting orchestrator-agent, does it keep returning this error consistently? For how long?

And, assuming it is consistently returning same error, are you able to invoke that command in command line?

shlomi-noach avatar Jun 27 '16 11:06 shlomi-noach

It consistently returns the same error. Invoking the command manually works. Example output:

mysqldata                     mdata     /dev/mdata/mysqldata
mysqldata-snapd01-2016-06-30  mdata     /dev/mdata/mysqldata-snapd01-2016-06-30  75.00
home                          outbrain  /dev/outbrain/home
opt                           outbrain  /dev/outbrain/opt
outbrain                      outbrain  /dev/outbrain/outbrain
root                          outbrain  /dev/outbrain/root
swap                          outbrain  /dev/outbrain/swap
tmp                           outbrain  /dev/outbrain/tmp
var                           outbrain  /dev/outbrain/var

hagay3 avatar Jun 30 '16 12:06 hagay3

@shlomi-noach ?

hagay3 avatar Jul 03 '16 07:07 hagay3

Maybe this one is a different error?

2016-07-03 03:00:08 DEBUG execCmd: lvs --noheading -o lv_name,vg_name,lv_path,snap_percent
2016-07-03 03:00:08 ERROR exit status 127
/usr/share/golang/src/github.com/outbrain/golib/log/log.go:111 (0x43d011)
/usr/share/golang/src/github.com/outbrain/golib/log/log.go:157 (0x43d37a)
/home/snoach/dev/outbrain/github/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/osagent/osagent.go:97 (0x4f4202)
/home/snoach/dev/outbrain/github/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/osagent/osagent.go:147 (0x4f4944)
/home/snoach/dev/outbrain/github/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/http/api.go:103 (0x507c40)
/home/snoach/dev/outbrain/github/orchestrator-agent/src/github.com/outbrain/orchestrator-agent/http/api.go:444 (0x50d78e)
/usr/local/go/src/pkg/runtime/asm_amd64.s:339 (0x424582)
/usr/local/go/src/pkg/reflect/value.go:474 (0x52958b)
/usr/local/go/src/pkg/reflect/value.go:345 (0x52867d)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ba8b4)
/usr/share/golang/src/github.com/go-martini/martini/env.go:1 (0x50105c)
/usr/share/golang/src/github.com/go-martini/martini/router.go:350 (0x4fe934)
/usr/share/golang/src/github.com/go-martini/martini/router.go:229 (0x4fdd82)
/usr/share/golang/src/github.com/go-martini/martini/router.go:112 (0x4fcdbc)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:119 (0x4ff330)
/usr/local/go/src/pkg/runtime/asm_amd64.s:340 (0x4245e2)
/usr/local/go/src/pkg/reflect/value.go:474 (0x52958b)
/usr/local/go/src/pkg/reflect/value.go:345 (0x52867d)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ba8b4)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x4fb3c2)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:164 (0x4fb31b)
/usr/share/golang/src/github.com/martini-contrib/gzip/gzip.go:33 (0x504ae2)
/usr/local/go/src/pkg/runtime/asm_amd64.s:340 (0x4245e2)
/usr/local/go/src/pkg/reflect/value.go:474 (0x52958b)
/usr/local/go/src/pkg/reflect/value.go:345 (0x52867d)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ba8b4)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x4fb3c2)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:164 (0x4fb31b)
/usr/share/golang/src/github.com/go-martini/martini/recovery.go:140 (0x4ff856)
/usr/local/go/src/pkg/runtime/asm_amd64.s:339 (0x424582)
/usr/local/go/src/pkg/reflect/value.go:474 (0x52958b)
/usr/local/go/src/pkg/reflect/value.go:345 (0x52867d)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ba8b4)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x4fb3c2)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:164 (0x4fb31b)
/usr/share/golang/src/github.com/go-martini/martini/logger.go:25 (0x4ff0f2)
/usr/local/go/src/pkg/runtime/asm_amd64.s:340 (0x4245e2)
/usr/local/go/src/pkg/reflect/value.go:474 (0x52958b)
/usr/local/go/src/pkg/reflect/value.go:345 (0x52867d)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ba8b4)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x4fb3c2)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:69 (0x4fa713)
/usr/local/go/src/pkg/net/http/server.go:1597 (0x4e020e)
/usr/local/go/src/pkg/net/http/server.go:1167 (0x4de217)
/usr/local/go/src/pkg/runtime/proc.c:1394 (0x417a50)
[martini] Completed 500 Internal Server Error in 3.646957ms
[martini] Started GET /api/mount for **

hagay3 avatar Jul 03 '16 07:07 hagay3

One last idea; is orchestrator still running as root? (It used to). If not, change command to sudo -i <command>

shlomi-noach avatar Jul 05 '16 08:07 shlomi-noach

Yes, orchestrator-agent and orchestrator (server) both run as root.

hagay3 avatar Jul 07 '16 16:07 hagay3

Then, I'm afraid I don't know; I'm not sure what exit status 127 stands for, and this is unfortunately the only info I have here.

shlomi-noach avatar Jul 07 '16 19:07 shlomi-noach

I think exit code 127 here means that no snapshots were found on the host, since other commands get the same exit code in bash when their output is empty. I can see exit code 127 errors for other failing commands in the agent log file. For example, there is a check for mounted snapshots on the host, and it keeps getting exit code 127 because there really are no mounted snapshots on the host.

Maybe there is a way to add some code to orchestrator-agent to debug this (for example, print the output of "lvs --noheading ...")? My only other thought is to add the full path to 'lvs', since exit code 127 means "command not found". I don't think that is the issue here, but it may be worth adding and checking.
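To tell the two theories apart, here is a minimal Go sketch that compares the exit status of a command that prints nothing with one the shell cannot find. It assumes the agent runs its commands through a shell, as the "exit status 127" log line suggests. The helper `exitCode` and the command name `no-such-command-for-demo` are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// exitCode runs a command line through sh and returns the shell's exit status.
// It returns -1 if the shell could not be started or was killed by a signal.
func exitCode(commandLine string) int {
	err := exec.Command("sh", "-c", commandLine).Run()
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return -1
}

func main() {
	// A command that succeeds but prints nothing still exits with 0.
	fmt.Println("empty output:", exitCode("true"))
	// A name the shell cannot resolve through PATH exits with 127.
	fmt.Println("not found:", exitCode("no-such-command-for-demo"))
}
```

Empty output alone does not produce 127. A POSIX shell returns 127 when it cannot find the command, which points at PATH rather than at missing snapshots.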

hagay3 avatar Jul 08 '16 11:07 hagay3

I can see two paths for this:

  • Submit a PR that uses the full path (you may well be correct about this), after having tested it
  • Take a look at https://github.com/outbrain/orchestrator-agent/pull/18, where generic commands are supported, so that you can configure any command you like. You will still need to change code for this (the person behind this PR, @jcesario, is also manipulating the orchestrator requests).

I suggest the former should be easy to do. Clone, modify, build via build.sh, deploy, test, PR

shlomi-noach avatar Jul 08 '16 11:07 shlomi-noach

If I use your second suggestion, why do I need to change the code? Can't I just build the package with the updated code?

This line invokes the lvs command. For debugging, I want to show the exact command the agent is going to execute:

output, err := commandOutput(sudoCmd(fmt.Sprintf("lvs --noheading -o lv_name,vg_name,lv_path,snap_percent %s", volumeName)))

What do I need to add to the code to write the full command to the log file before it is invoked? I'm pretty sure the issue is that the command just returns empty output.
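One possible way to get that debug output is sketched below. `runLogged` is a hypothetical helper, not the agent's own `commandOutput` or `sudoCmd`, and running through `sh -c` is an assumption about how the command is executed. It logs the exact command line before running it and captures stderr together with stdout, so a shell message such as "command not found" is kept alongside the exit status.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

// runLogged is a hypothetical helper, not part of orchestrator-agent.
// It logs the exact command line, runs it through sh, and returns stdout
// and stderr combined so that shell error messages are not lost.
func runLogged(commandLine string) (string, error) {
	log.Printf("DEBUG about to run: %s", commandLine)
	out, err := exec.Command("sh", "-c", commandLine).CombinedOutput()
	if err != nil {
		log.Printf("ERROR %v; output: %q", err, out)
		return string(out), err
	}
	return string(out), nil
}

func main() {
	// Same lvs invocation as in the agent log above.
	out, err := runLogged("lvs --noheading -o lv_name,vg_name,lv_path,snap_percent")
	fmt.Printf("output=%q err=%v\n", out, err)
}
```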

OK, I checked again and the log file really does report the exit code from bash (which is great). So maybe adding the full path to lvs will solve it. I think it would be wise to use "locate" before setting the path, because paths differ between Linux distributions.

hagay3 avatar Jul 08 '16 11:07 hagay3

locate is not installed by default on either RedHat or Debian. And, I should also note, if you can't find lvs, you probably wouldn't be able to find locate.
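An alternative that needs no external tool is to resolve the binary from Go itself. This is a minimal sketch, assuming a fixed list of fallback directories: `exec.LookPath` searches the process PATH first, and the helper then probes the fallbacks. The helper `resolveCommand` and the `fallbackDirs` list are illustrative assumptions, not agent code.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// fallbackDirs are assumed locations for system tools; adjust per distribution.
var fallbackDirs = []string{"/usr/sbin", "/sbin", "/usr/local/sbin", "/usr/bin", "/bin"}

// resolveCommand is a hypothetical helper. It first tries the process PATH,
// then each directory in dirs, and returns the first executable regular file.
func resolveCommand(name string, dirs []string) (string, error) {
	if path, err := exec.LookPath(name); err == nil {
		return path, nil
	}
	for _, dir := range dirs {
		candidate := filepath.Join(dir, name)
		info, err := os.Stat(candidate)
		if err == nil && info.Mode().IsRegular() && info.Mode().Perm()&0111 != 0 {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("%s not found in PATH or in %v", name, dirs)
}

func main() {
	path, err := resolveCommand("lvs", fallbackDirs)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("would run:", path)
}
```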

shlomi-noach avatar Jul 08 '16 11:07 shlomi-noach

The quickest for you would be to edit the path, hard code, build & deploy. If this works, then we'll open an issue where I will allow for a configurable path prefix.

shlomi-noach avatar Jul 08 '16 11:07 shlomi-noach

If I use your second suggestion, why do I need to change the code? Can't I just build the package with the updated code?

Because the URL that will serve the customized command is not the URL orchestrator would call.

shlomi-noach avatar Jul 08 '16 11:07 shlomi-noach

@shlomi-noach It's solved by using the full path.

hagay3 avatar Jul 10 '16 10:07 hagay3

@shlomi-noach ?

hagay3 avatar Jul 13 '16 08:07 hagay3

@hagay3 thank you - I got this. Please avoid pinging me repeatedly and understand my own schedule has its own constraints. I'll be merging a fix.

shlomi-noach avatar Jul 13 '16 10:07 shlomi-noach

One thing that bothers me is that while lvs is typically in /usr/sbin, other commands are found in /usr/bin. I'm looking into a generic solution.
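One generic direction, sketched here purely as an illustration and not as the fix that was merged, is to leave the commands unqualified and instead extend PATH in the environment of each child process. The helper `withExtraPath` and the directory list are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withExtraPath returns a copy of env whose PATH also contains the extra
// directories. Existing PATH entries keep priority. Hypothetical helper.
func withExtraPath(env []string, extra ...string) []string {
	result := make([]string, 0, len(env)+1)
	path := ""
	for _, kv := range env {
		if strings.HasPrefix(kv, "PATH=") {
			path = strings.TrimPrefix(kv, "PATH=")
			continue
		}
		result = append(result, kv)
	}
	parts := []string{}
	if path != "" {
		parts = append(parts, path)
	}
	parts = append(parts, extra...)
	return append(result, "PATH="+strings.Join(parts, ":"))
}

func main() {
	// Ask the shell where it would find lvs once the sbin directories are added.
	cmd := exec.Command("sh", "-c", "command -v lvs")
	cmd.Env = withExtraPath(os.Environ(), "/usr/sbin", "/sbin", "/usr/bin", "/bin")
	out, err := cmd.CombinedOutput()
	fmt.Printf("resolved=%q err=%v\n", out, err)
}
```

This keeps one code path for tools in /usr/sbin and /usr/bin alike, at the cost of trusting every directory on the extended list.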

shlomi-noach avatar Aug 01 '16 10:08 shlomi-noach