standard lazy types
Add support for lazily initialized values to the standard library, effectively superseding the popular lazy_static
crate.
use std::sync::Lazy;
// `BACKTRACE` implements `Deref<Target = Option<String>>`
// and is initialized on the first access
static BACKTRACE: Lazy<Option<String>> = Lazy::new(|| {
std::env::var("RUST_BACKTRACE").ok()
});
Great RFC, I hope it makes it into the standard library!
One other aspect that the crate conquer_once
tackles is a distinction between blocking and non-blocking methods. I believe this is one possible way to divide the api?
impl<T> OnceCell<T> {
    /// Never blocks, returns `None` if uninitialized, or while another thread is initializing the `OnceCell` concurrently.
    pub fn get(&self) -> Option<&T>;
    /// Never blocks, returns `Err` if initialized, or while another thread is initializing the `OnceCell` concurrently.
    pub fn set(&self, value: T) -> Result<(), T>;
    /// Blocks if another thread is initializing the `OnceCell` concurrently.
    pub fn get_or_init<F>(&self, f: F) -> &T
    where
        F: FnOnce() -> T,
    ;
    /// Blocks if another thread is initializing the `OnceCell` concurrently.
    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<&T, E>
    where
        F: FnOnce() -> Result<T, E>,
    ;
}
Alternatively, all four methods could be blocking, and a try_get
and try_set
may suffice for a non-blocking use case.
But don't let this comment derail the discussion about whether to include OnceCell
in the standard library too much.
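To make the return shapes of that split concrete, here is a minimal single-threaded sketch. It assumes a recent stable toolchain and uses `std::sync::OnceLock` as a stand-in for the `OnceCell` being discussed; the blocking behaviour of the std type under contention may differ from the conquer_once split described above, and `api_shapes` is a hypothetical name used only for illustration.

```rust
use std::sync::OnceLock;

/// Hypothetical helper: exercises `get`, `set` and `get_or_init` on one cell
/// and returns what each call produced.
pub fn api_shapes() -> (Option<String>, Result<(), String>, Result<(), String>, String) {
    let cell: OnceLock<String> = OnceLock::new();

    // The cell is still empty, so `get` reports `None`.
    let before = cell.get().cloned();

    // `set` succeeds on an empty cell ...
    let first = cell.set("first".to_string());
    // ... and hands the rejected value back once the cell is full.
    let second = cell.set("second".to_string());

    // The cell is already initialized, so this closure is never called.
    let value = cell.get_or_init(|| "third".to_string()).clone();

    (before, first, second, value)
}
```

In a multi-threaded program the interesting difference is what `get` and `set` do while another thread is in the middle of `get_or_init`; this sketch does not exercise that case.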
fn pointers are not ZSTs, so we waste one pointer per static lazy value. Lazy locals will generally rely on type inference and will use a more specific closure type.
It looks like https://github.com/rust-lang/rust/issues/63065 will allow using a zero-size closure type in a static:
static FOO: Lazy<Foo, impl FnOnce() -> Foo> = Lazy::new(|| foo());
But this will require spelling the item type twice. Maybe that repetition could be avoided… with a macro… that could be named lazy_static!
:)
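The size difference behind this point can be checked directly. The sketch below is only an illustration: `make_value` and `sizes` are hypothetical names, and it compares a `fn() -> u32` pointer with a non-capturing closure that calls the same function.

```rust
use std::mem::size_of_val;

/// Hypothetical initializer used only for this comparison.
fn make_value() -> u32 {
    42
}

/// Returns (size of the fn pointer, size of the non-capturing closure).
pub fn sizes() -> (usize, usize) {
    // Coercing to `fn() -> u32` erases the concrete function type and stores an address.
    let as_fn_pointer: fn() -> u32 = make_value;
    // A closure that captures nothing has its own anonymous, zero-sized type.
    let as_closure = || make_value();
    (size_of_val(&as_fn_pointer), size_of_val(&as_closure))
}
```

On common targets the first number is the pointer width and the second is zero, which is the per-static pointer the comment refers to.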
@SimonSapin or we can add bounds to Lazy
like so
struct Lazy<F: FnOnce<()>> { .. }
and use it like this,
static FOO: Lazy<impl FnOnce() -> Foo> = Lazy(foo);
@KrishnaSannasi How does that help with the requirement that static
items name their type without inference?
static FOO: &[_] = &[2, 4];
Compiling playground v0.0.1 (/playground)
error[E0121]: the type placeholder `_` is not allowed within types on item signatures
--> src/lib.rs:1:15
|
1 | static FOO: &[_] = &[2, 4];
| ^ not allowed in type signatures
@SimonSapin I'm not sure what you are asking, I didn't use _
anywhere in my example. It would be nice if we could have type inference in static
/const
that only depended on their declaration, but we don't have that (yet).
You proposed using impl Trait
to get rid of the cost of fn() -> ...
, and noted a paper cut where you would have to name a type multiple times, and I proposed a way to get rid of that paper cut by changing Lazy
to remove the redundant type parameter.
It doesn't really seem all that worth it to me to spend time worrying about how to inline a function that is called at most one time in the entire execution of a program.
A small difference that may be worth noting: std::sync::Once
will only run its closure once, while OnceCell::get_or{_try}_init
will only run its closure once successfully.
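A rough sketch of the "once successfully" behaviour, under stated assumptions: it uses `std::sync::OnceLock` as the cell, and `get_or_try_init` here is a hypothetical free function written for this example, not the proposed method. Unlike a real implementation it does not block concurrent callers, so racing threads could each run their initializer.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the fallible initializer discussed above.
/// A failed `f` leaves the cell empty, so a later call may try again.
fn get_or_try_init<T, E>(
    cell: &OnceLock<T>,
    f: impl FnOnce() -> Result<T, E>,
) -> Result<&T, E> {
    if let Some(value) = cell.get() {
        return Ok(value);
    }
    let value = f()?;
    // If another thread won a race in the meantime, its value is kept instead.
    Ok(cell.get_or_init(|| value))
}

/// Returns (number of initializer calls, final stored value).
pub fn retry_demo() -> (u32, u32) {
    let cell: OnceLock<u32> = OnceLock::new();
    let mut calls = 0;

    // First attempt fails: the closure ran, but the cell stays empty.
    let failed = get_or_try_init(&cell, || {
        calls += 1;
        Err::<u32, &str>("not ready")
    });
    assert!(failed.is_err() && cell.get().is_none());

    // Second attempt succeeds and fills the cell.
    let _ = get_or_try_init(&cell, || {
        calls += 1;
        Ok::<u32, &str>(7)
    });

    // Third attempt finds the cell full and never runs its closure.
    let _ = get_or_try_init(&cell, || {
        calls += 1;
        Ok::<u32, &str>(8)
    });

    (calls, *cell.get().unwrap())
}
```

`std::sync::Once::call_once`, by contrast, has no notion of a fallible initializer: the closure it is given runs a single time.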
@pitdicker pointed out that we actually can have some part of the sync
API exposed via core
. This unfortunately increases design/decision space a bit :) I've amended the rfc.
@jhpratt the author of this RFC @matklad is the author of once_cell
. It also mentions:
The proposed API is directly copied from
once_cell
crate.
I was expecting this to implement haskell-style lazy values. Can I use this outside global scope? Maybe use another more descriptive name? like LazyStatic
Yes, you can use this in non-static scope, see the very end of the guide section.
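For readers who want to see the non-static case without opening the RFC, here is a small sketch. It assumes a recent stable toolchain and uses `std::cell::LazyCell` as a stand-in for the `cell::Lazy` proposed in the RFC; `local_lazy` and `runs` are hypothetical names.

```rust
use std::cell::{Cell, LazyCell};

/// Returns (initializer runs before first use, computed total, runs afterwards).
pub fn local_lazy() -> (u32, u32, u32) {
    // Counts how often the initializer actually runs.
    let runs = Cell::new(0u32);

    // A lazy value in an ordinary local; the closure type is inferred.
    let expensive = LazyCell::new(|| {
        runs.set(runs.get() + 1);
        21 * 2
    });

    // Nothing has been computed yet.
    let before = runs.get();
    // The first dereference runs the closure, the second reuses the result.
    let total = *expensive + *expensive;

    (before, total, runs.get())
}
```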
+1. Nay, such is my exuberance that I dare say, +2. Ever since finding out about once_cell
I've considered it a slam-dunk for inclusion in the stdlib: it satisfies a ubiquitous use case, supplies a fundamental primitive, encapsulates tricky unsafety, has a small, self-contained, and obvious API, and supplants a macro solution with a more idiomatic approach.
I'd love to see this added to the standard library.
I share the concern that I'd prefer to not have two types with the same name distinguished only by module. I've run into far too many cases of annoying conflicts between io::Write
and fmt::Write
, and I'd like to not repeat that.
That issue aside, :+1:.
I'd prefer to not have two types with the same name distinguished only by module
As a counter-bikeshed, I'm fine using modules to namespace types. When I need both, I import only the module or rename it:
fn foo(_: impl io::Write) -> impl fmt::Write {}
use io::Write as IoWrite;
use fmt::Write as FmtWrite;
fn foo(_: impl IoWrite) -> impl FmtWrite {}
Only when you need both (which my experience has shown to be rare) do you need the disambiguation.
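As a compilable variant of the module-qualified style, the sketch below writes the same line to a byte sink and a string sink. `greet` is a hypothetical function; the fully qualified `write_fmt` calls are one way, not the only way, to avoid importing either `Write` trait by name.

```rust
use std::fmt;
use std::io;

/// Hypothetical example: writes one greeting to both kinds of sink.
pub fn greet(
    mut bytes: impl io::Write,
    mut text: impl fmt::Write,
    name: &str,
) -> io::Result<()> {
    // Neither trait is imported by name, so the two `write_fmt` methods cannot clash.
    io::Write::write_fmt(&mut bytes, format_args!("hello, {name}\n"))?;
    fmt::Write::write_fmt(&mut text, format_args!("hello, {name}\n"))
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "formatting failed"))
}
```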
I think Write
s are especially problematic, because they are traits (with an identically named write_fmt
method and duck-typed format!
to boot), and it's easier to confuse traits than types. As an anecdata-point, the result-alias idiom does not confuse me, while Write
traits do.
My hypothesis is that cell::OnceCell
and sync::OnceCell
would not be as bad as io::Write / fmt::Write
. I don't know of a strictly better naming scheme :)
nope, fixed, thanks!
On Sat, 9 Nov 2019 at 14:59, Aleksey Melnikov wrote:
@aleksmelnikov commented on this pull request.
In text/0000-standard-lazy-types.md https://github.com/rust-lang/rfcs/pull/2788#discussion_r344440925:
-    Mutex::new(m)
+});
+```
+
+Moreover, once `#[thread_local]` attribute is stable, `Lazy` might supplant `std::thread_local!` as well:
+
+```rust
+use std::cell::{RefCell, Lazy};
+
+#[thread_local]
+pub static FOO: Lazy<RefCell<u32>> = Lazy::new(|| RefCell::new(1));
+```
+
+However, `#[thread_local]` attribute is pretty far from stabilization at the moment, and due to the required special handling of destructors, it's unclear if just using `cell::Lazy` will work out.
+
+Unlike `lazy_static!`, `Lazy` can be used used for locals:
used used - is it ok?
Hey, @rust-lang/libs, this has been sitting quiet for some time. Should it maybe get an I-nominated label and/or an assignee from T-libs?
I do not personally have the time to champion this RFC to get merged, but I would also probably say that this is ok to send as a PR to libstd (unstable of course). The various nuances of the API design would be debated there and we could debate more on the tracking issue itself.
I would love for this to happen. :) I would accept a PR to add both sync and unsync OnceCell and Lazy as unstable. We tend to get more useful design work out of iterating on these in actual usage than thinking about usages up front.
@matklad would you be interested in sending a PR?
Yeah, I plan to send a PR, but I don't have too much free time right now. I think I'll find some by the end of the year, but if someone wants to do this instead -- please go ahead!
I found that once_cell doesn't currently have enough features for the crates I use, but conquer-once seems to. In particular I needed a spin-lock based implementation for no_std environments, which once_cell doesn't have, but conquer-once does. I don't think we should rush to merge an implementation that doesn't provide a no_std
-compatible API since the abstraction of the locking mechanism is likely to impact the public API in a way that wouldn't be backward-compatible if we tried to do it piecemeal.
(I haven't reviewed the implementation of conquer-once; I just verified that it provides the API I need. I would help review conquer-once if that is the direction this goes. There may be other implementations of Once
that also provide no_std
-compatible variants but I couldn't find any that worked for my use case.)
Agreed with @briansmith that now is the time to talk about support for no_std
, although I see that the author of this RFC has recently expressed dissatisfaction with spinlocks: https://matklad.github.io//2020/01/02/spinlocks-considered-harmful.html .
@matklad , could the RFC be updated with open questions regarding no_std
/spinlocks, and could the alternatives mention conquer-once?
@briansmith pointed out another unresolved question in the once-cell issue tracker. Should the following program be guaranteed to never panic?
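use std::sync::atomic::{AtomicBool, Ordering::Relaxed};
// `OnceCell` is the sync cell under discussion (for example `once_cell::sync::OnceCell`).
static FLAG: AtomicBool = AtomicBool::new(false);
static CELL: OnceCell<()> = OnceCell::new();
// thread1
CELL.get_or_init(|| FLAG.store(true, Relaxed));
// thread2
if CELL.get().is_some() {
    assert!(FLAG.load(Relaxed))
}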
That is, are side effects of the initializer function guaranteed to be observed by the threads that observed its result?
My amateur understanding is that both are possible, depending on which memory ordering, Acquire
or Consume
, we use in the hot-path for getting an initialized value. We don't officially have Consume
in Rust right now, and even in C++ it is controversial. However, in practice, crossbeam does provide a consume: https://docs.rs/crossbeam-utils/0.7.0/crossbeam_utils/atomic/trait.AtomicConsume.html#required-methods.
My feeling is that for std we should start with the forwards-compatible option of not giving guarantees about side effects.
We always have to do a Release
on CELL
after running the closure, without exception. So we can guarantee that FLAG
is set before that.
But I don't think we should guarantee that all memory gets synchronized, just the part contained within the OnceCell
seems reasonable.
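The Release/Acquire pairing being discussed can be sketched without any cell at all. In the example below, `READY` is a hypothetical stand-in for the cell's internal "initialized" state, and the reader is assumed to use `Acquire` on the hot path; it illustrates the ordering argument and is not the implementation of any of the crates mentioned.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Side effect performed by the "initializer".
static FLAG: AtomicBool = AtomicBool::new(false);
/// Hypothetical stand-in for the cell's "initialized" state.
static READY: AtomicBool = AtomicBool::new(false);

/// Returns `false` only if the reader saw `READY` without also seeing `FLAG`.
pub fn publish_and_observe() -> bool {
    let writer = thread::spawn(|| {
        // The initializer's side effect, with no ordering of its own.
        FLAG.store(true, Ordering::Relaxed);
        // Publishing with `Release` orders the store above before this one.
        READY.store(true, Ordering::Release);
    });

    let reader = thread::spawn(|| {
        if READY.load(Ordering::Acquire) {
            // An `Acquire` load that saw the `Release` store also sees `FLAG`.
            FLAG.load(Ordering::Relaxed)
        } else {
            // Nothing was observed, so there is nothing to check.
            true
        }
    });

    writer.join().unwrap();
    reader.join().unwrap()
}
```

With a weaker, consume-style load on the reader side, only data reached through the loaded value would be ordered, which is why the unrelated `FLAG` is the interesting case in the question above.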
~I would say that in the example code assert!(FLAG.load(Relaxed))
should use Acquire
to establish an ordering.~
Is there no Acquire
in get
?
But I don't think we should guarantee that all memory gets synchronized, just the part contained within the
OnceCell
seems reasonable.
I agree with this. It is important that we avoid any indication that Lazy
is equivalent to or better than std::sync::Once
though, because they are different primitives. I have seen quite a few people--including myself--assume that Lazy<()>
is equivalent to Once
, for various implementations of Lazy
, when this isn't true w.r.t. these side effects. Ideally Lazy<T>
where T is a unitary type like ()
would generate a warning because it is likely that the user should be using Once
instead of Lazy<T>
.
I have seen quite a few people--including myself--assume that Lazy<()> is equivalent to Once, for various implementations of Lazy, when this isn't true w.r.t. these side effects.
@briansmith I think there's some misunderstanding here (not sure on whose side though). My understanding is that at the moment all various implementations (std::sync::Once
, lazy_static
, once_cell::OnceCell
, spin::Once
, conquer_once::OnceCell
) do, in practice, provide the guarantee about side effects and are equivalent to std::sync::Once::<()>
. They do not always document this guarantee, but that is most likely not intentional (as they usually don't specify any synchronization guarantees at all).
It isn't clear whether the various implementations intend to provide the extra guarantee, so I am conservatively assuming that it is accidental that the current implementations are implemented the way they are, and that they may switch to a more efficient implementation that doesn't preserve this guarantee in the future. In the case of OnceCell
, I see you've now documented your intent to always implement the same guarantee as std::sync::Once
(IIUC) so that is the exception to the rule.
For libstd, I think Lazy
should be implemented conservatively to start, but also the implementation should document that this implementation is only temporary until more efficient and less friendly semantics can be implemented.
Can you expand more on why this should be in std
? The RFC says
We can have a single canonical API for a commonly used tricky unsafe concept, so we probably should have it!
I don't find this very convincing; the same is true of bindgen
, which isn't in std.
As an alternative, could this be part of rust-lang-nursery, like bindgen and rand? That would keep all your hard work on the design and also allow you to make breaking changes in the future if necessary.
Updated the RFC with discussion on spinlocks. My takeaway from looking into this is that we should just not have spinlocks in std. That is, I don't consider this an unresolved question, but rather a deliberate design decision.
Updated the RFC with discussion on synchronization. This is a question I don't know the best answer for. If some folks with intimate knowledge of memory orderings and consume
semantics could chime in, that would be really helpful!
I don't find this very convincing; the same is true of bindgen, which isn't in std.
This is all about tradeoffs. In general, I think three abstract metrics are most useful when discussing inclusion in std:
a) how general the solution/problem is (what percentage of crates uses it? Can it be called ubiquitous?)
b) how much value the addition provides for those crates that use it (i.e., how easy is it to just write the thing yourself?)
c) (this one is often overlooked) how small is the design space? Should std choose between designs x, y, z with different tradeoffs, or is there a "minimal energy" design w, such that it doesn't make sense to do anything different from it?
OnceCell
scores high on all three points, lazy_static!
on the first two. bindgen
has an enormous score on b
, a medium-low score on a
, and zero score on c
(as the design space is pretty much unconstrained).
Overall, I am pretty sure that there's a general consensus that the problem of lazy values is important enough to warrant a std solution.