`ets`: add ability to suppress heir/gift data
The current implementation of the heir option as given to ets:new/2 and ets:setopts/2 requires that a term is given as HeirData in a {heir, HeirPid, HeirData} 3-tuple, which will be sent in an {'ETS-TRANSFER', ...} message when the owner process dies and ownership passes to the heir process.
This is not always feasible, for example when the heir process is a supervisor which passes the table to a child process on (re-)starts. If the child process dies, the supervisor will receive an 'ETS-TRANSFER' message, which it can't do anything with, and will just log it as an unexpected message.
This PR changes the behavior of ets:new/2 and ets:setopts/2 such that the heir option to also accepts a 2-tuple of the form {heir, HeirPid}, ie omitting the HeirData. If the owner process dies, table ownership transfers to the heir process quietly.
This PR also adds a new function give_away/2, in order to do explicit transfer of table ownership quietly. This will probably be of somewhat lesser use, and is only added for the sake of symmetry/completeness.
CT Test Results
4 files 227 suites 1h 55m 0s ⏱️ 3 670 tests 3 577 ✅ 93 💤 0 ❌ 4 773 runs 4 657 ✅ 116 💤 0 ❌
Results for commit 74cf8566.
:recycle: This comment has been updated with latest results.
To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.
See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.
Artifacts
// Erlang/OTP Github Action Bot
Rebased because of merge conflicts.
The failing test seems to be unrelated to the changes made in this PR.
While I don't mind the change in general, I think the justification is questionable. I think it promotes an anti-pattern I observed several times in the past:
This is not always feasible, for example when the heir process is a supervisor which passes the table to a child process on (re-)starts. If the child process dies, the supervisor will receive an 'ETS-TRANSFER' message, which it can't do anything with, and will just log it as an unexpected message.
It stands out to me, that if you want the supervisor to own the ETS table, you should do that from the very beginning. That is, create the table in the init callback, and pass the table reference to children that need it.
One should think of an ETS table as a part of the (owner) process state. If your process crashes, but you want to save (some?) of its state, it sounds alarming. When the process restarts but it keeps the state (by giving it away to an unsuspecting supervisor), chances are, another crash is coming up. Until finally the supervisor decides to give up, and crashes too (wiping out the ETS table, and eventually restarting a larger tree).
If that's the behaviour you're after, you should not need this PR, as you could just create the ETS table in the supervisor (and keep ownership by the supervisor).
@max-au hm, it sounds like you grabbed the wrong end of the stick somehow 😅 I'll elaborate on it later when I get my hands on a proper computer, I'm on my mobile on a shaky train right now 😵💫
@max-au sorry for the delay in answering :sweat_smile:
For one, this PR doesn't add anything really new. It does not enable (or disable, for that matter) anything that couldn't already be done now, bad and good alike.
For another, don't pay too much attention to the introduction of give_away/2, that is, the ability to give away a table to another process without implying the sending of an 'ETS-TRANSFER' message. This is only for completeness' sake, the actual objective of this PR is to let an appointed heir process inherit a table from a dying process without being sent such a message.
That said, the idea is to let a supervisor create a table in its init function, and appoint itself as that table's heir. When it then starts a child that is supposed to somehow work with the table, it will give away this table to that process. When the child dies, the supervisor gets back the table (this is the point where, as of now, the supervisor would receive an 'ETS-TRANSFER' message, which it can't do anything with and will just log it as an unexpected message; with the changes in this PR, it is possible to not provide any HeirData in the heir tuple, effectively suppressing the unexpected message), and will give it away again when restarting the crashed process. As such, the supervisor is not "unsuspecting", as it created the table and made itself the heir, so it should also expect to get back the table at some point in case the child crashes.
Instead of giving the table away, the supervisor could in fact also stay its owner, but then there would be no choice but to make the table public.
You point out that this could be used in an attempt to save some of the owner process' state, and I agree that this is bad design, an anti-pattern.
However, in modern applications, a supervision (sub-)tree often represents a work unit of several cooperating processes, instead of single worker processes (this is also what gave rise to EEP 56). Viewed from this angle, such a table could be seen as the state of the work unit, instead of as the state of a single process.
Depending on the task and subsequently the structure and (in-)dependence of the constituting child processes, it might be possible to restart one or another of them if they crash, without the need to crash the entire work unit.
I agree that misuse is possible, yes, but it is already possible right now, at the expense of some logging noise. As always, careful designing is required, in the sense that the table in question should not contain any of or be treated as the state of the process it is given to, but as a state of the work unit the process is a part of.
The purpose of the 'ETS-TRANSFER' message was to avoid the risk of table leakage due to the receiver not being aware. Without it you must make sure to make the receiver aware some other way. For an heir like a supervisor I can see that's not too difficult to achieve with links or monitors. But the give_away is more difficult to make safe without an automatic and atomically sent message, I think. I understand the urge for symmetry/completeness but maybe we should skip it due to its unsafeness.
@sverker ok, it is probably a bad idea to create a new button only to label it "Don't push!" :sweat_smile: So if this is your final verdict, I'll remove this part again.
@sverker last commit removes give_away/2 again.
It does not enable (or disable, for that matter) anything that couldn't already be done now, bad and good alike.
That I understand.
I agree that misuse is possible, yes, but it is already possible right now, at the expense of some logging noise.
And this is exactly the kind of "soft friction" I like. One is not prohibited from leveraging this use-case. But there is a nagging indicator "you might be holding it wrong". :-) When there is no such indicator, it's fair to assume that this design is endorsed by Erlang/OTP.
To give some more background, at a previous place there was quite an amount of code using heir as a way to save ETS tables from the programmer's error (and a recommendation "always use the supervisor as a heir so you don't lose your tables, and other processes reading your table are unaffected by your crash-looping process). To a degree that we even had a custom patch that suppressed logging of ETS-TRANSFER message by the supervisor... now, bitten by that, I'm somewhat opposed to changes that may inadvertently promote technique of "assigning some heir just to keep ETS table exist while my process is crash-looping".
Having said that, I'll repeat that I don't mind merging this PR. There is some value in it, even though it may introduce some "happy debugging, folks" episodes later.
Last commit solves merge conflicts caused by the merge of #8026.
Ping @sverker :) Is there still any interest in this?
@Maria-12648430 @juhlig
Forgot about this one.
I added the missing {heir,Pid} in the docs and expanded with a sentence about notifying the heir some other way. Please review my wording.
@sverker LGTM 👍🙂
@Maria-12648430 wrote
...when the heir process is a supervisor which passes the table to a child process on (re-)starts. If the child process dies, the supervisor will receive an
'ETS-TRANSFER'message, which it can't do anything with, and will just log it as an unexpected message.
When I think about this; either I don't understand the use case, or I don't understand how supervisors work.
What would the supervisor do with the inherited table? Or rather how can it do anything with the table? Is there a hook/callback that lets you run your own code within the supervisor when a child is restarted?
What would the supervisor do with the inherited table? Or rather how can it do anything with the table? Is there a hook/callback that lets you run your own code within the supervisor when a child is restarted?
In a sense, yes: The child start function (as given under the start key of a child spec) is run in the supervisor process.
So if you have a supervisor init like this...
init(_) ->
Tbl = ets:new(tab, [{heir, self()}]),
{ok,
#{},
[
#{id => my_child,
start => {my_child, start_link, [Tbl]},
restart => permanent}
]
}
... and a function start_link in module my_child like this...
start_link(Tbl) ->
{ok, Child} = gen_server:start_link(?MODULE, [], []),
ets:give_away(Tbl, Child, here_comes_the_table),
{ok, Child}.
... which, as I said, will be run inside the supervisor process when it (re-)starts a child, the table will be given to the child process when it starts, return to the supervisor process when it dies, and in turn be given to the new child process when it gets restarted.
@Maria-12648430 Thanks for the OTP lesson 🙂.