yaws icon indicating copy to clipboard operation
yaws copied to clipboard

application:set_env/3 in yaws:get_app_dir/0 is bad

Open leoliu opened this issue 5 years ago • 10 comments

Just upgraded Yaws to 2.0.7 on a live node by changing the code path without bring down the node and then I am immediately seeing a lot of these:

Dynamic compile error:
/opt/myapp/lib/myapp-2.0.3/priv/www/layout.yaws:4: can't find include file "/opt/myapp/lib/yaws-2.0.6/include/yaws_api.hrl"

After some poking around I discovered it was the fault of yaws:get_app_dir/0, specifically this line

application:set_env(yaws, app_dir, AppDir)

I grepped the yaws source and see yaws:get_app_dir/0 only called rarely. So the above line does little and can cause harm.

leoliu avatar Jul 16 '19 02:07 leoliu

Another more threatening issue is delivering those errors to the page, which is a security nightmare. I think a much better and safer option is to let it crash in this case. Thoughts?

leoliu avatar Jul 16 '19 04:07 leoliu

I'm still trying to figure out exactly how you're trying to upgrade, since what you're describing doesn't seem safe. There are modules loaded and running, and then you're just changing the load path? What's forcing anything to load from the new path? Clearly there is state still pointing to old paths, as indicated by the error message.

vinoski avatar Jul 16 '19 11:07 vinoski

I use a combination of code:replace_path/2, code:modified_modules/0 and code:load_file/1. Is it unsafe? May be for example, when state record changes for gen processes. But it's been working 99% of the time so I am not too bothered. I intend to try release upgrade again when I have time.

Meanwhile yaws is caching a stale app_dir that doesn't get updated when I restart the Yaws application.

> application:get_env(yaws, app_dir).
{ok,"/opt/myapp/lib/yaws-2.0.6"}

leoliu avatar Jul 16 '19 12:07 leoliu

Is it better to replace get_app_dir/0 with get_app_dir() -> code:lib_dir(yaws).?

leoliu avatar Jul 16 '19 13:07 leoliu

If you really restarted the application, then application would no longer have app_dir cached in the environment, and the next call to get_app_dir/0 would cache a new value.

I've never before heard of anyone using the various code calls you mention to do a restart, so that's a concern. One general issue I can think of, not specific to Yaws, is that it doesn't allow for code change callbacks, supervised restarts, state changes, etc. To get what you're trying to do to work 100% of the time, we'd have to find all caches, state, etc. in Yaws and provide some custom way to flush them, allow for upgrades, etc., which I don't think makes much sense given OTP already provides what's needed for this. I'm not saying Yaws is 100% OTP-compliant in this regard though; I think the typical approach with Yaws is similar to how other web servers are updated: do a regular restart, or do a rolling restart over a load balanced cluster of replicas to avoid any downtime.

Otherwise, perhaps since you're already doing a custom upgrade, you just need to add a call to application:unset_env(yaws, app_dir) after running your list of code calls?

vinoski avatar Jul 16 '19 17:07 vinoski

I think if you restart the Yaws application without bring down the VM, the envvars are all kept intact unless the application is unloaded with application:unload/1. 100% agree OTP release upgrade is the right way to go though custom upgrade is probably more common than people think. For example, recently Richard Carlsson gave a talk on 10 years custom upgrade with minimum downtime.

I had a workaround in place but I didn't find application:unset_env/2 in a hurry. I'll update my workaround to use it. Thanks.

The reason I open this report is to provoke some rethink on this to see if it can be cleaned up or improved upon.

What do you think of the security issue I raised in the second comment. I think a lot of Yaws early decisions were prioritised for developers which have become vulnerabilities. At one point (2017) I had Yaws happily put my DB password on the page. The viewers are 99% of the time not the developers so putting errors on page seems to make little sense besides handing critical info to attackers.

leoliu avatar Jul 17 '19 01:07 leoliu

Yes, I agree the security issue is a concern. I might be wrong but I'm guessing finding and plugging all occurrences of such behavior will not be a simple task. As usual, the best way to get an issue resolved is to submit a small test case or multiple test cases that show the problem(s), as it can often be difficult for us to reproduce problems from textual descriptions.

vinoski avatar Jul 18 '19 19:07 vinoski

It is hard to find all of them in any software projects. The following two can be a good start and should help Yaws' security a bit. Put the following code in a .yaws file and view the page in a browser:

<erl>
%% case 1
out(_Arg) -> error("this-string-should-not-appear-on-page").
</erl>

<erl>
%% case 2
out(_Arg) -> ok;
</erl>

Case 1 crashes and case 2 deliberately introduces a syntax error. The error texts should never appear on page, at the least not when debug is off.

Personally I think case 2 should be treated at least as severe as the crash case, i.e. if a code block don't even compile, let the whole page crash. The current behaviour of partial success is painfully bad.

leoliu avatar Jul 19 '19 06:07 leoliu

Looking at this issue again. Note that the "Customizations" section of the Yaws tex documentation explains:

When actually deploying an application at a live site, some of the standard YAWS behaviors are not acceptable. Many sites want to customize the web server behavior when a client requests a page that doesn’t exist on the web server. The standard YAWS behavior is to reply with status code 404 and a message explaining that the page doesn’t exist.

Similarly, when YAWS code crashes, the reason for the crash is displayed in the Web browser. This is very convenient while developing a site but not acceptable in production.

With Yaws server configuration, it's already possible to deal with some of the problems you mention by providing a custom error handling module, specifically for the errormod_crash setting. But it doesn't solve everything, such as your second example, and also, development settings rather than production settings are on by default.

To address this issue, I plan to

  • make the default errormod good for production instead of development
  • add a development default errormod that's easy to enable
  • extend the errormod concept to also deal with compilation errors of .yaws files

vinoski avatar May 27 '20 23:05 vinoski

That’s a good plan. Secure by default is the only way actually. Secure is often backwards incompatible with insecure.

leoliu avatar May 28 '20 00:05 leoliu