otp
otp copied to clipboard
Feature: internal_export
Internal export
What it is
internal_export
is an attribute that gives you more control over function exporting. In some cases, functions are only exported for internal usage, for testing purposes or because you must implement behavior callbacks. Those functions are not meant to be used by the end user of your application and they shouldn't exists in public interface of a module. Using them is currently considered bad practice, but with this feature it can be integrated into testing pipelines.
Best description of this problem I've found is from dr. sc. Richard A. O'Keefe: "there are many “applications” in the Erlang/OTP system which consist of a flotilla of modules. A module in such a flotilla cannot export it to another member or other members of the flotilla without exporting it to absolutely every module"
By internal usage I mean used by some module that belongs to the application. Module belongs to application if it is listed in modules
property in application resource file. Most of Erlang programmers use LSP which by default suggest all exported functions - even undocumented ones, functions documented with %% testing only
and callbacks. The user can, on purpose or not, use those functions it won't get noticed by any tool. Also, user can in the same way use all undocumented modules - a quick script showed that over 66% of modules in kernel
, stdlib
, mnesia
, ssh
, ssl
, crypto
and inets
are undocumented and therefore considered internal.
By combining internal_export
attribute and xref
analysis, it is possible to detect static calls to functions marked as internal_export
and decide whether they are legal or not.
Motivation
Current state is that a function is either exported or not. Although very simple, I think this mechanism lacks some control over scoping those exports because of already mentioned reasons. I eventually started a thread some time ago and I'd say it received a good feedback. EEP05 was the main source of inspiration, but I changed some things in design (explained in the next chapter).
This PR is related only for static analysis, but I also plan to work on dynamic checks if the static part receives a good feedback.
Design & implementation
How much control do we want?
In EEP05, it is proposed that the user should manually specify the scope of a function export. That would allow fine-grained control over exporting a function, but would potentially bloat the source file and produce some amount of boilerplate. It would also require maintenance each time new modules are added. Ut would require developers to know more stuff about the dependencies they're using - e.g. if you want to internally export handle_call/3
- do you export it to gen_server
which defines it or gen
which calls it dynamically? Do you really need to know that information?
I propose a mechanism where you only declare functions that are not meant to be used by the user - callbacks and internal functions. Those functions will be available to all modules in your application and its dependencies. Dependencies of an application are applications listed in applications, included_applications or optional_applications in the application resource file including their own dependencies (e.g. if A depends on B and B depends on C then A also depends on C).
I didn't find the use-case where you want to export a function to a module/application that is not captured by the mechanism above. Counterexamples are welcome.
Exporting to modules or to applications?
There is an open question whether to export functions to modules or to applications. Exporting to modules is currently proposed in EEP05.
If we don't allow user to declare additional scope - then it doesn't really matter (at least for static analysis), it is just the matter of implementation. It is easier to implement it by exporting to applications. If we do allow user to declare additional scope, specifying applications would be easier to maintain. Otherwise, if you want to export functions internally for your application, you'd have to enumerate all modules in the application, which would be pain. Another example is exporting a callback - if dynamic checks ever get implemented, that would require you to know exactly which module is calling your function (e.g. you'd need to export handle_call/3
to gen
which shouldn't be directly used, not gen_server
). That could also cause backward incompatibilities if some internal changes occur in gen_server
. User shouldn't rely on internal implementation.
Notice that it is possible to convert current approach to module-wise by just replacing applications with value of modules
property from application resource file.
Considering dynamic call checking, tradeoff must be made. Application-wise approach would be significantly less memory hungry, but would eventually introduce more overhead to dynamic call checks because you'd need to map caller and callee modules to the applications they belong to and then decide whether the call is legal or not. At the moment I can't really estimate how big this would be, but I guess pretty high / unacceptable. On the other side, module-wise approach would be very memory hungry. It would take O(1) per MFA as suggested in EEP05, but due to the current mechanism, there would be a lot of unused entries that wouldn't be used even once, but would clog the VM.
For now, my best idea is to combine both - store information about applications, but cache the results. That would make large part overhead disappear from every-but-first dynamic check, but would introduce moderate memory consumption. Any suggestions about dynamic checks are very welcome.
Xref modifications
New predefined variable IX
added - list of all functions tagged with both internal_export
and export
. It is a subset of already existing variable X and is available in both xref_mode
s.
Two new analysis types are defined: internal_use
and illegal_internal_use
- ONLY AVAILABLE IN FUNCTIONS MODE (because they depend on predefined variable E).
-
internal_use
- all usages of functions from IX -
illegal_internal_use
- all usages of functions from IX which are not allowed
Syntax
Syntax is simple:
-internal_export([f/a,...])
I considered shorter name - internal
, but find it kinda confusing because we refer to internal functions in another way. Also, original name export_to
was the option, but it's also confusing since we don't explicitly state to what we're exporting.
Usage
To test your applications you need to add them to xref
server. After that, you can call request illegal_internal_use
analysis to return a list of all calls to internal functions that aren't allowed
Pros and cons
Pros | Cons |
---|---|
Safer codebase - you can detect calls to functions that shouldn't be used | Added complexity and responsibility for developers |
Spawn problem - I'd like to cite dr. Richard A. O’Keefe here: "if we wish to spawn a call to a function, that function must be exported even if we have no intention whatsoever of making it available to other modules" | Overhead (only for dynamic checks) |
Documentation - no more need to mark functions with %%hidden and modules with only internal exports can be excluded from documentation |
|
LSP integration - internal functions not shown when consuming dependencies | |
CI/CD pipeline integration | |
Shell integration - Autocomplete could separate exports and internal exports | |
Loader integration - do the analysis in code loader |
Summary
Even without dynamic checks, I think this is quite good feature to have because you can force developers not to rely on something they shouldn't. If proven useful, I plan to extend this mechanism to support checks at runtime. Static calls can be checked in loader, as proposed in EEP05. Dynamic calls would be way harder to implement and it is questionable do we want to sacrifice performance and simplicity for those checks.
Every critic is very welcome