Add a helpful error message for when `make` is not available but is called from a hook

Open lpil opened this issue 6 months ago • 8 comments

Hello there!

A common problem we've noticed is folks trying to deploy a Gleam/Erlang application using OCI/Docker containers and being confused when their build fails because it cannot compile a NIF, typically one used for hashing passwords.

To fix the issue they need to install whatever package provides make, such as build-base on Alpine Linux.

This may be a slightly unusual feature for rebar3, but I thought we could greatly help newcomers to Erlang or containers if we special-cased this particular error to print helpful information on how to fix the problem.

Here's an error as it exists today:

2.129 ===> Fetching rebar3_hex v7.0.9
2.216 ===> Fetching hex_core v0.10.3
2.256 ===> Fetching verl v1.1.1
2.289 ===> Analyzing applications...
2.424 ===> Compiling verl
2.638 ===> Compiling hex_core
3.395 ===> Compiling rebar3_hex
3.605 ===> Fetching rebar3_ex_doc v0.2.26
3.777 ===> Analyzing applications...
3.787 ===> Compiling rebar3_ex_doc
3.940 sh: exec: line 0: make: not found
3.940 ===> Hook for compile failed!

Perhaps it could look something like this for make specifically.

2.129 ===> Fetching rebar3_hex v7.0.9
2.216 ===> Fetching hex_core v0.10.3
2.256 ===> Fetching verl v1.1.1
2.289 ===> Analyzing applications...
2.424 ===> Compiling verl
2.638 ===> Compiling hex_core
3.395 ===> Compiling rebar3_hex
3.605 ===> Fetching rebar3_ex_doc v0.2.26
3.777 ===> Analyzing applications...
3.787 ===> Compiling rebar3_ex_doc
3.940 sh: exec: line 0: make: not found

The application `jargon` calls the `make` program as part of its
build process, but it could not be found. Is it installed?

If you are building your project using a Dockerfile you will need
to add a `RUN` step to install `make` and any other build dependencies.

3.940 ===> Hook for compile failed!

The "If you are building..." sentence could be omitted if the OS is not Linux, as we know it's not an OCI container in that case.

What do you think? :)

Thank you, Louis

lpil avatar May 14 '25 11:05 lpil

My 2c: I wouldn't treat make any differently from any other required binary, even though it's very common in the build phase. make: not found seems relatively clear to me. Also, what heuristic should we use to determine whether make is called or not? You can write $MAKE install, sudo make install, nohup make install, MY_ENV=value make install, etc...

Maybe a more general solution would be to mention which application the hook failed for (IIRC, hooks are also executed for deps). That would narrow down the bug space pretty well.

MarkoMin avatar Jun 02 '25 22:06 MarkoMin

It's clear to you, but this is overwhelmingly the most common problem new people trying Gleam have with rebar3. My goal here is to improve Erlang onboarding by helping with one of the most common points at which people get stuck or bounce off.

Checking that the command is make would be sufficient. It is possible to use other commands, but all the packages I've checked call make directly so I believe it will cover a significant and impactful proportion of the packages on Hex that people would have this problem with. sudo etc can't be used in hooks like that, so we know it won't be those.

It does already indicate which dependency it fails in! Rebar3 prints the name beforehand. This doesn't help newcomers understand the problem unfortunately.

lpil avatar Jun 03 '25 10:06 lpil

My concern here is going to be to maintain good messaging across all OSes, which means that depending on the platform, hooks might need different commands for different failures, some of which will definitely be wrong.

The safest thing is also the one with most work, which would be a sort of hook that runs before the other one and checks for make being present and outputs platform-specific advice before failing, but then that needs to be added manually to every plugin/lib and will cause repeated checks, compared to just error handling.
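For illustration, the "check before the real hook" idea could be sketched as below. This is Python rather than Erlang purely to keep it short and runnable, and `check_tool` is a hypothetical name, not anything rebar3 or an existing plugin provides. It only shows the shape of the check: look the tool up on `PATH` and produce a hint if it is missing.

```python
import shutil


def check_tool(name, which=shutil.which):
    """Return None if `name` is found on PATH, otherwise a hint string.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    if which(name) is not None:
        return None
    return (
        f"The `{name}` program is required by this build step, "
        "but it could not be found. Is it installed?"
    )
```

A project would have to run something like this before its own hook and fail early when a hint comes back, which is where the per-project duplication mentioned above comes from.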

Alternatively though, maintaining information about which app/hook is trying to do the thing is probably useful to put in (assuming overrides don't obscure the origin), but at least a "hook configured for app <name>" line would likely be useful regardless.

ferd avatar Jun 03 '25 18:06 ferd

Sorry I'm not quite grasping the hook idea. Is the idea that we patch every project that uses make with the same code to emit a nice warning? I think I'm likely misunderstanding.

What did you see as being the OS specific messaging? I've proposed an extra bit of the message only for Linux, and it's generic to all Linux rather than per distribution, so it won't need to change or understand how package management works on that platform.

lpil avatar Jun 04 '25 07:06 lpil

Yeah the specific suggestion of "make it an extra hook" is a per-project thing if you want to provide better error messaging for your own hooks. It's not a general solution, and it's not a great one. It's just the safest one, mostly because of how narrowly scoped it is.

The issue is that we often don't know ahead of time if a hook is running on Linux or not; people are free to set up whatever regexes they want. You can see for example how epcap mixes the same hook for macOS, Solaris, and Linux, and how rebar3 itself only distinguishes between Windows and non-Windows.

Someone who does not gate the hook at all might therefore get unique errors for the command not existing on Windows as well, because the hook is configured to run there even if it makes no sense.

That's why I meant we would need good messaging across all OSes, or alternatively, good detection of OSes for most issue types regardless of the regex (or lack thereof), to output useful or non-misleading diagnostics. Basically if we go "this message is valid for commands of this shape on OS descriptors with this shape" then we could get better messaging when it makes sense and fall back to the current unhelpful stuff otherwise.

For context, the regular expression matching that can be done is broad and includes the Erlang/OTP version, the OS architecture, etc. See https://github.com/erlang/rebar3/blob/6a5805d8295523ff5ac34a69a43c0b5c350901c7/apps/rebar/src/rebar_utils.erl#L116-L130 for the implementation.

ferd avatar Jun 04 '25 15:06 ferd

Sorry about this, but I'm not understanding.

Why would we need to look at the hook definition to determine the current operating system? If the hook is a call to make and fails due to the executable not existing, could we not check os:type() =:= {unix, linux} and append the Linux specific message in that case?
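A rough sketch of that check, in Python rather than Erlang so it stays short and runnable. The function names are hypothetical, the message text is the one proposed at the top of this issue, and `platform.system()` stands in for `os:type()`. It assumes the caller has already established that the hook failed because the executable was missing.

```python
import platform
import shlex

BASE_HINT = (
    "The application `{app}` calls the `make` program as part of its\n"
    "build process, but it could not be found. Is it installed?"
)
DOCKER_HINT = (
    "If you are building your project using a Dockerfile you will need\n"
    "to add a `RUN` step to install `make` and any other build dependencies."
)


def hook_calls_make(command):
    """True only when the first word of the hook command is exactly `make`."""
    try:
        words = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return False
    return bool(words) and words[0] == "make"


def missing_make_hint(app, command, system=None):
    """Return the hint text, or None when the hook is not a direct make call."""
    if not hook_calls_make(command):
        return None
    system = platform.system() if system is None else system
    parts = [BASE_HINT.format(app=app)]
    if system == "Linux":  # only Linux can be an OCI container build
        parts.append(DOCKER_HINT)
    return "\n\n".join(parts)
```

With this first-word check, forms such as `MY_ENV=value make install` are deliberately not recognised and simply get today's output.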

I'm sure I'm overlooking many details here. Thank you

lpil avatar Jun 06 '25 11:06 lpil

Yeah that's the "good detection of OSes for most issue types" approach. It means each message and error type we show would need to look at the hook definition (is it calling make?), the hook error (is it make that is failing because it's missing?), then match them against some dictionary of OS-specific messages for the error based on some definitions, and pick that message if it exists or fall back to something else otherwise.

If the failure message is different on some OSes, we need somewhat fancier logic (error + os + command I guess) and match them together.

The alternative is "good messaging across all OSes" but it's probably impractical because detection may vary.

We currently do "don't care just dump the error". I think if we want to do the level of error messaging suggested here, we need to develop the proper matching of command regex match + error regex match + os match => non-default error message, with the default being just the error. Anything short of that risks showing inadequate error messages.
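To make the shape of that matching concrete, here is a minimal sketch in Python (not Erlang, and not rebar3 code). `Rule`, `RULES`, and `diagnose` are hypothetical names, and the single rule and its patterns are illustrative only. The point is the three-way match with the raw error as the default.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    command: re.Pattern  # matched against the hook command
    error: re.Pattern    # matched against the hook's error output
    os_name: re.Pattern  # matched against an OS descriptor string
    message: str         # shown instead of the raw error


# Illustrative rule table; a real one would need far more care.
RULES = [
    Rule(
        command=re.compile(r"^make(\s|$)"),
        error=re.compile(r"make: (command )?not found"),
        os_name=re.compile(r"linux"),
        message="The `make` program could not be found. Is it installed?",
    ),
]


def diagnose(command, error_output, os_name, rules=RULES):
    """Return the first matching rule's message, else the error unchanged."""
    for rule in rules:
        if (
            rule.command.search(command)
            and rule.error.search(error_output)
            and rule.os_name.search(os_name)
        ):
            return rule.message
    return error_output  # default: just dump the error, as today
```

Anything that fails one of the three matches keeps the current behaviour, so a wrong or missing rule can only cost a missed hint rather than produce a misleading one.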

ferd avatar Jun 06 '25 15:06 ferd

Producing and maintaining OS specific error messages is a lot of work and decays quickly, so it would be good to avoid that.

I think only letting the person know that the make executable was not found would help folks get from "something is wrong with my first Erlang project" to "I don't have make installed", which is an easier problem for them to solve with some internet searching.

> We currently do "don't care just dump the error". I think if we want to do the level of error messaging suggested here, we need to develop the proper matching of command regex match + error regex match + os match => non-default error message, with the default being just the error. Anything short of that risks showing inadequate error messages.

Parsing the error message seems like it would be very difficult, seeing as sub-programs can print anything at all. Rather than do this I would perhaps handle only the case where the subprocess could not be created due to ENOENT.
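As a sketch of what I mean, in Python rather than Erlang to keep it short: when a program is started directly and the executable does not exist, the operating system reports ENOENT and no output parsing is needed. `run_hook` and its message are hypothetical, not rebar3 code.

```python
import errno
import subprocess


def run_hook(argv, app):
    """Run `argv` directly (no shell); return (exit_code, stderr_or_hint)."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True)
    except OSError as exc:
        if exc.errno != errno.ENOENT:
            raise  # some other failure; do not guess at its cause
        # The executable itself could not be found.
        hint = (
            f"The application `{app}` calls the `{argv[0]}` program as part "
            "of its build process, but it could not be found. Is it installed?"
        )
        return 127, hint
    return result.returncode, result.stderr
```

One caveat: this only applies when the command is started directly. The `sh:` prefix in the log above suggests the hook goes through a shell, in which case the shell itself starts fine and a missing command shows up as exit status 127 instead, so that status would be the thing to check.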

lpil avatar Jun 06 '25 16:06 lpil