chef-workstation
no_proxy doesn't work as advertised when using chef exec rspec
Description
chef exec rspec fails because it tries to connect to 127.0.0.1 through our HTTP proxy, which it should not be doing. It appears that no_proxy in knife.rb is being ignored, or the documentation for it does not accurately reflect the proper syntax for the setting.
ChefDK Version
Chef Development Kit Version: 0.14.25
chef-client version: 12.10.24
berks version: 4.3.3
kitchen version: 1.8.0
Platform Version
RHEL 7
Replication Case
- Configure knife.rb:
http_proxy 'http://proxy.our.org:80'
https_proxy 'http://proxy.our.org:80'
no_proxy '*.our.org, localhost, 127.0.0.1'
- Run chef generate cookbook no_proxy_test with the default cookbook generator.
- cd no_proxy_test
- Run chef exec rspec and you'll see it trying to talk through a proxy (and failing) to, I assume, chef-zero. This can be seen more clearly (on Linux) with: strace -qq -f -e trace=connect chef exec rspec. The output under Stacktrace below shows the failure.
- Comment out the http_proxy and https_proxy lines in knife.rb (thereby defining no proxies to use, as no_proxy should be doing) and the run succeeds, as shown in the output below.
.
Finished in 2.9 seconds (files took 2.84 seconds to load)
1 example, 0 failures
Stacktrace
F
Failures:
1) no_proxy_test::default When all attributes are default, on an unspecified platform converges successfully
Failure/Error: expect { chef_run }.to_not raise_error
expected no Exception, got #<Net::HTTPFatalError: 504 "Gateway Timeout"> with backtrace:
# ./spec/unit/recipes/default_spec.rb:13:in `block (3 levels) in <top (required)>'
# ./spec/unit/recipes/default_spec.rb:17:in `block (4 levels) in <top (required)>'
# ./spec/unit/recipes/default_spec.rb:17:in `block (3 levels) in <top (required)>'
# ./spec/unit/recipes/default_spec.rb:17:in `block (3 levels) in <top (required)>'
Finished in 0.5396 seconds (files took 3.1 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/unit/recipes/default_spec.rb:16 # no_proxy_test::default When all attributes are default, on an unspecified platform converges successfully
Do you continue to see this on chef-dk 0.15? I ran through your replication case and was not able to repro the error. If I comment out the no_proxy setting in the knife.rb, then it does fail, so I know the proxy settings are kicking in.
@mwrock Yes. What did your no_proxy setting look like? As mentioned in the original report, it is possible that the documentation for no_proxy is incorrect. I am using the setting per the docs, which state:
no_proxy A comma-separated list of URLs that do not need a proxy. Default value: nil. For example
no_proxy 'localhost, 10.*, *.example.com, *.dev.example.com'
Here's the latest failure, same as before, using ChefDK 0.15.16. I've obviously altered the hostnames and output to not reflect internal data:
[@gazoo:~/rcf-chef/site-cookbooks]↥ $ chef generate cookbook no_proxy_test
[@gazoo:~/rcf-chef/site-cookbooks]↥ $ cd no_proxy_test
[@gazoo:~/rcf-chef/site-cookbooks/no_proxy_test]↥ $ chef --version
Chef Development Kit Version: 0.15.16
chef-client version: 12.11.18
delivery version: master (444effdf9c81908795e88157f01cd667a6c43b5f)
berks version: 4.3.5
kitchen version: 1.10.0
[@gazoo:~/rcf-chef/site-cookbooks/no_proxy_test]↥ $
[@gazoo:~/rcf-chef/site-cookbooks/no_proxy_test]↥ $ grep proxy ~/.chef/knife.rb
http_proxy 'http://proxy.our.org:80'
https_proxy 'http://proxy.our.org:80'
no_proxy '*.our.org, localhost, 127.0.0.1'
[@gazoo:~/rcf-chef/site-cookbooks/no_proxy_test]↥ $ strace -qq -f -e trace=connect chef exec rspec
[pid 29600] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=29602, si_status=0, si_utime=0, si_stime=0} ---
[pid 29607] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=29606, si_status=0, si_utime=0, si_stime=1} ---
[pid 29600] connect(8, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 29600] connect(8, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 29600] connect(8, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("OUR_DNS_SERVER")}, 16) = 0
[pid 29600] connect(8, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("OUR_DNS_SERVER")}, 16) = 0
[pid 29614] connect(8, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("OUR_DNS_SERVER")}, 16) = 0
[pid 29614] connect(8, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("IP_ADDRESS_HERE_FOR_proxy.our.org")}, 16) = 0
[pid 29614] connect(8, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
[pid 29614] connect(8, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("IP_ADDRESS_HERE_FOR_proxy.our.org")}, 16) = 0
[pid 29614] connect(8, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("IP_ADDRESS_HERE_FOR_proxy.our.org")}, 16) = 0
...
I used your example knife.rb verbatim. Just to double check, I spun up a clean Ubuntu VM and installed a fresh chef-dk 0.15.16. I got the same results: a successful chef exec rspec. Commenting out the no_proxy in the knife.rb, I get:
1) no_proxy_test::default When all attributes are default, on an unspecified platform converges successfully
Failure/Error: expect { chef_run }.to_not raise_error
expected no Exception, got #<Net::HTTPServerException: 404 "Not Found"> with backtrace:
# ./spec/unit/recipes/default_spec.rb:13:in `block (3 levels) in <top (required)>'
# ./spec/unit/recipes/default_spec.rb:17:in `block (4 levels) in <top (required)>'
# ./spec/unit/recipes/default_spec.rb:17:in `block (3 levels) in <top (required)>'
# ./spec/unit/recipes/default_spec.rb:17:in `block (3 levels) in <top (required)>'
Finished in 1.12 seconds (files took 2.04 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/unit/recipes/default_spec.rb:16 # no_proxy_test::default When all attributes are default, on an unspecified platform converges successfully
This is slightly different from your error - 404 vs 504.
Here's the latest case-by-case breakdown. In all cases below, knife.rb is configured as:
http_proxy 'http://proxy.our.org:80'
https_proxy 'http://proxy.our.org:80'
no_proxy '*.our.org, localhost, 127.0.0.1'
ChefDK 0.15.16
Case 1
- No proxy-related shell environment variables configured
- Turns out this is why yours worked and mine didn't. See below.
- RESULT: Success, proxy not hit
Case 2
- Only shell environment variables http_proxy and https_proxy configured, both pointing at http://proxy.our.org:80
- RESULT: Success, proxy not hit. I assume because the no_proxy line in knife.rb is kicking in.
Case 3
- shell environment variables http_proxy and https_proxy configured
- shell environment variable no_proxy configured to be .our.org (not *.our.org, localhost, 127.0.0.1)
- RESULT: Failure, proxy hit.
- REASON: In our shells, we configure no_proxy='.our.org' because that is the format the GNU wget command demands [fn1]. This interferes with Chef-DK, as Chef-DK is making use of (PREFERRING) the shell environment variable, and the value is not in the format that the underlying Ruby HTTP library wants.
Footnotes
- wget(1) man page snippet
ENVIRONMENT
...
no_proxy
This variable should contain a comma-separated list of domain extensions proxy should not be
used for. For instance, if the value of no_proxy is .mit.edu, proxy will not be used to
retrieve documents from MIT.
OK, now this is making more sense. chef-config only applies the config values to your environment if those variables are not already set. So if your environment already specifies a no_proxy variable, chef-config will not overwrite it.
see https://github.com/chef/chef/blob/master/chef-config/lib/chef-config/config.rb#L900-L903
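The "only if unset" behavior described above can be illustrated with a minimal Ruby sketch. The helper name export_unless_set and the use of a plain Hash in place of ENV are assumptions made for illustration; this is not the actual chef-config code, only a model of the precedence it was described as having.

```ruby
# Hypothetical helper (not the real chef-config implementation):
# copy a config value into the environment only when the variable
# is not already present there.
def export_unless_set(env, name, value)
  env[name] = value unless env.key?(name)
  env
end

config_value = "*.our.org, localhost, 127.0.0.1"

# Case 3 from the thread: the shell already defines no_proxy for wget,
# so the knife.rb value is never applied.
shell_env = { "no_proxy" => ".our.org" }
export_unless_set(shell_env, "no_proxy", config_value)
puts shell_env["no_proxy"]

# Cases 1 and 2: no_proxy is absent, so the knife.rb value is exported.
clean_env = {}
export_unless_set(clean_env, "no_proxy", config_value)
puts clean_env["no_proxy"]
```

Under this model, the first environment keeps .our.org, which contains no entry for 127.0.0.1, while the second picks up the full list from knife.rb.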
So the problem then is that wget and Chef compete for the no_proxy environment variable, each expecting its own syntax.
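To make the syntax mismatch concrete, here is a rough Ruby sketch of two matching styles. Both functions are illustrative assumptions: suffix_style_bypass? models the domain-suffix format quoted from the wget man page, and glob_style_bypass? models a wildcard format like the one in the Chef docs example. Neither is taken from wget or from Chef's HTTP code, whose real matching rules may differ.

```ruby
# Split a no_proxy string into trimmed, non-empty entries.
def no_proxy_entries(no_proxy)
  no_proxy.split(",").map(&:strip).reject(&:empty?)
end

# Assumed wget-style matching: each entry is a domain suffix, e.g. ".our.org".
def suffix_style_bypass?(host, no_proxy)
  no_proxy_entries(no_proxy).any? { |entry| host.end_with?(entry) }
end

# Assumed glob-style matching: each entry is a pattern, e.g. "*.our.org".
def glob_style_bypass?(host, no_proxy)
  no_proxy_entries(no_proxy).any? { |pattern| File.fnmatch(pattern, host) }
end

wget_value = ".our.org"
chef_value = "*.our.org, localhost, 127.0.0.1"

puts suffix_style_bypass?("chef.our.org", wget_value)
puts glob_style_bypass?("chef.our.org", wget_value)
puts glob_style_bypass?("chef.our.org", chef_value)
puts glob_style_bypass?("127.0.0.1", wget_value)
```

In this sketch the wget-style value only bypasses the proxy under suffix matching, and it never covers 127.0.0.1 under either style, which is consistent with the failure seen in Case 3.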
Yeah, thinking more about this, perhaps it would be better if Chef just forcefully changed the proxy environment variables. The fact that someone proactively set these values in the client.rb would indicate that's what they want, regardless of any previous setting.
I'm running into a similar issue using config.rb under c:\users\[name]\.chef\config.rb:
file_cache_path "c:/chef/cache"
file_backup_path "c:/chef/backup"
cache_options ({:path => "c:/chef/cache/checksums", :skip_expires => true})
log_level :info
log_location STDOUT
ssl_verify_mode :verify_none
cookbook_path "C:\Dev\chef\cookbooks"
http_proxy 'http://proxy.blah.com:8080'
https_proxy 'http://proxy.blah.com:8080'
no_proxy '127.0.0.1'
When I remove the proxy values and use Test Kitchen, it fails to connect to "https://supermarket.chef.io/universe" because the proxy is not configured. When I enable the proxy settings, it fails to connect to http://127.0.0.1:1025/wsman because the proxy rejects it. Either way, because it's not honoring the no_proxy setting, I'm hosed.
Now, in your discussion above, you're suggesting that it's somehow being preset somewhere, preventing it from being set in the config.rb (which I understand is the replacement for knife.rb, used by Test Kitchen to communicate via "chef_zero" with my Windows 10 VM on Vagrant). Where is this preset? Just to find where the proxy settings take effect, I tried the kitchen yml provisioner section (failed), the verifier section (failed), the suites section (failed), the client.rb file in my user's .chef directory (failed), and the client.rb in my c:\chef\client.rb (failed). I'm a little tired of figuring out which of the hundreds of places these settings should go to take effect, if not c:\users\[name]\.chef\config.rb, where the actual proxy settings take effect...
I've also tried to run this disconnected, but that fails too, because my firewall gets more locked down when it's not connected to our corporate network, which leads to errors like:
[WinRM] Endpoint doesn't support config request for MaxEnvelopsizekb
I believe this is because the port is being blocked by the firewall while not connected to the corporate network. So basically, no matter how I attempt to work around the issue, I'm hosed. It's basically designed to fail regardless. This stuff should work by default and not require endless magic to make it work. I've spent days just trying to do a basic converge which simply creates a directory. It's pretty ridiculous, to be completely honest.
It turns out that I had a user environment variable configured in Windows for no_proxy that was set to "*127.0.0.1", and it appears this setting trumped the setting from the config.rb, causing it to fail. Perhaps the wildcard was the problem in this case? The whole "wildcards only work sometimes" situation... Now I'm on to the next error: figuring out why the "chef infra client" fails to install.
$$$$$$ At C:\windows\temp\winrm-elevated-shell-9cc2dbb4-548d-4950-95ac-aa8cc90ee4de.ps1:1 char:78
$$$$$$ + ... Users\Tester\AppData\Local\Temp';cat > /tmp/chef-installer.sh <<"EOL"
$$$$$$ + ~
$$$$$$ Missing file specification after redirection operator.
$$$$$$ At C:\windows\temp\winrm-elevated-shell-9cc2dbb4-548d-4950-95ac-aa8cc90ee4de.ps1:1 char:77
$$$$$$ + ... Users\Tester\AppData\Local\Temp';cat > /tmp/chef-installer.sh <<"EOL"
$$$$$$ + ~
$$$$$$ The '<' operator is reserved for future use.
$$$$$$ At C:\windows\temp\winrm-elevated-shell-9cc2dbb4-548d-4950-95ac-aa8cc90ee4de.ps1:1 char:78
$$$$$$ + ... Users\Tester\AppData\Local\Temp';cat > /tmp/chef-installer.sh <<"EOL"
$$$$$$ + ~
$$$$$$ The '<' operator is reserved for future use.
$$$$$$ At C:\windows\temp\winrm-elevated-shell-9cc2dbb4-548d-4950-95ac-aa8cc90ee4de.ps1:33 char:8
$$$$$$ + exists() {
$$$$$$ + ~
$$$$$$ An expression was expected after '('.
$$$$$$ At C:\windows\temp\winrm-elevated-shell-9cc2dbb4-548d-4950-95ac-aa8cc90ee4de.ps1:34 char:5
$$$$$$ + if command -v $1 >/dev/null 2>&1
It definitely seems like we need to handle the situation where there are existing proxy settings both in the environment and in the ~/.chef/config.rb file. I think overwriting the existing proxy environment variables when a Chef tool is run seems like the right thing, but that is also a form of transparent magic that generally isn't easy to troubleshoot. So maybe merging them? We need to investigate this.
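As a starting point for that investigation, the merging idea could be sketched in Ruby along these lines. The function merge_no_proxy is hypothetical and is not part of chef-config; it simply unions the entries from the environment value and the config value, keeping the first occurrence of each.

```ruby
# Hypothetical merge of two no_proxy lists (environment first, then config).
# nil inputs are ignored; entries are trimmed and de-duplicated.
def merge_no_proxy(env_value, config_value)
  [env_value, config_value]
    .compact
    .flat_map { |value| value.split(",") }
    .map(&:strip)
    .reject(&:empty?)
    .uniq
    .join(",")
end

merged = merge_no_proxy(".our.org", "*.our.org, localhost, 127.0.0.1")
puts merged
```

A merge like this would add localhost and 127.0.0.1 to the effective list in Case 3, but it does not reconcile the differing entry syntaxes: .our.org and *.our.org would both remain, and each consumer would still honor only the form it understands.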