Add `bf16`, `f64f64` and `f80` types
The previous RFC #3451 mixed the proposal for the IEEE-754 compliant `f16`/`f128` with such non-standard types; this is split off from it to focus on the target-related ones.
`f64f64` is a bad name for a type. Maybe `fxd64` (float extended double 64) is better.

Having a unified rule for naming is a benefit. For example:
- it must start with `f`
- a common letter must follow `f`, for example `x` - extended, `n` - non-standard, `c` - custom, `a` - alternative

And we get `fx80`, `fxd64`, `fxb16`.
Since `f80` is x86-only and `f64f64` is PowerPC-only, `x86_f80` and `ppc_f128` look clear and consistent with LLVM. If the leading letter should be `f`, `f80_x86` and `f128_ppc` are okay but a little weird, especially `f128_ppc`, which could be confused with `f128`. Or `f128xppc`, but `f80xx86` is a bad name.
I think `bfloat16` or `bf16` are the only good options for that type, because they are the de-facto standardized names (ignoring the extra underscores C/C++ compilers like to add). I strongly dislike `fxb16` and `f16b`.
Where will bf16 be located in libcore? The PPC and x86 floats will go in the relevant arch modules, but since bfloat is not arch-specific, it seems relevant to ask.
I would expect `bf16` to be a primitive type, so it would always be available like `f32` (in a new edition), be in the prelude (for old editions), and be in `core::primitive`.
> `f64f64` is a bad name for a type. Maybe `fxd64` (float extended double 64) is better. Having a unified rule for naming is a benefit. For example:
> - it must start with `f`
> - a common letter must follow `f`, for example `x` - extended, `n` - non-standard, `c` - custom, `a` - alternative
>
> And we get `fx80`, `fxd64`, `fxb16`.
`f64f64` comes from double-double; since `f64` = `double`, that makes sense.
That's not how Rust's naming convention works. Considering that this is a 128-bit float format with a slightly different exponent split than the usual `f128`, I would strongly recommend using a variation on `f128` like `fx128`, `f128ppc`, `core::arch::power_pc::f128` or similar. Based on the description it's not even correct to describe it as "two f64's"; it is one `f64` and then a `u64` with a bunch of extra mantissa bits. A type which is actually "two f64's" would be `f64x2`, and I would expect it to come up when representing complex numbers or a small SIMD float.
> Based on the description it's not even correct to describe it as "two f64's", it is one f64 and then a u64 with a bunch of extra mantissa bits.

It is actually two `f64`s. The number is represented as the sum of two `f64` values where one is approximately `2^53` larger than the other, so the mantissa bits of one `f64` stop about where they start in the other `f64`. You could also think of it as an `f64` and another `f64` telling you how far off the first `f64` is, approximately doubling the precision.
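The pair-of-`f64`s idea above can be sketched in a few lines of Rust. This is only an illustrative model of the representation, not any proposed API; the `DoubleDouble` struct and `two_sum` helper are invented for this example (the latter is Knuth's classic error-free transformation):

```rust
// Sketch of a double-double value: the unevaluated sum `hi + lo`,
// where `lo` carries mantissa bits that continue roughly where `hi`'s stop.
#[derive(Clone, Copy, Debug)]
struct DoubleDouble {
    hi: f64,
    lo: f64,
}

impl DoubleDouble {
    // Knuth's error-free two-sum: `hi + lo == a + b` exactly, with `lo`
    // capturing the rounding error of the f64 addition.
    fn two_sum(a: f64, b: f64) -> Self {
        let hi = a + b;
        let a_round = hi - b;
        let b_round = hi - a_round;
        let lo = (a - a_round) + (b - b_round);
        DoubleDouble { hi, lo }
    }
}

fn main() {
    // 1 + 2^-60 is not representable in a single f64 (53-bit mantissa),
    // but the pair keeps both parts exactly.
    let x = DoubleDouble::two_sum(1.0, 2f64.powi(-60));
    assert_eq!(x.hi, 1.0);
    assert_eq!(x.lo, 2f64.powi(-60));
    println!("hi = {}, lo = {}", x.hi, x.lo);
}
```

Note that a plain `f64` addition of the same two values would round `2^-60` away entirely; the second component is what "approximately doubles" the precision.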
> It is actually two `f64`s. The number is represented as the sum of two `f64` values where one is approximately `2^53` larger than the other, so the mantissa bits of one `f64` stop about where they start in the other `f64`. You could also think of it as an `f64` and another `f64` telling you how far off the first `f64` is, approximately doubling the precision.

In this case `f64pf64` might be better, to convey that it's an `f64` value plus another, rather than just the word `f64` twice, which seems weird.
> In this case `f64pf64` might be better, to convey that it's an `f64` value plus another, rather than just the word `f64` twice, which seems weird.

`doubledouble` doesn't look weird, so neither would `f64f64`. The `p` of `f64pf64` is weird because someone new with no context won't know that `p` means "plus", and what is it plus? Is it `+`?
One other option is we could copy the existing double-double crate and call it `twofloat`.
This comes with the issue that it does not match the Rust style of `f16`, `f32`, `f64`, `f128`.
To echo something said by scottmcm on another thread: types representing 80-bit extended precision (`f80`) and double-double (`f64f64`) are specialized types that we want to make available but don't want to encourage common use of (they will forever live in `core::arch`), so they don't need to match Rust's primitive naming style. "Ugly" names along the lines of `__m128bh` are fine.
> This comes with the issue that it does not match the Rust style of `f16`, `f32`, `f64`, `f128`.
>
> To echo something said by scottmcm on another thread: types representing 80-bit extended precision (`f80`) and double-double (`f64f64`) are specialized types that we want to make available but don't want to encourage common use of (they will forever live in `core::arch`), so they don't need to match Rust's primitive naming style. "Ugly" names along the lines of `__m128bh` are fine.
If "ugly" names are accepted, `__float80` and `__ibm128` can be used; those come from GCC, and we should use existing names (for the language). LLVM's `x86_fp80` and `ppc_fp128` are for the IR, not for the C/C++ language, so those should be avoided.

But if "ugly" names are not accepted: looking at `Float80` (https://developer.apple.com/documentation/swift/float80) from Swift, I think `f80` and `f64f64` are still a good idea as beautiful names. And why would we want "ugly" names? `f80` and `f64f64` always come from `core::arch`; they are already prefixed with the `core::arch` string, so there is no reason to make them ugly.
I'm not exactly sure why the double-underscore convention was adopted for the x86 types, but if we're going for consistency, `f80` should be called `__m80`. I'm not sure if x86 has SIMD types involving 80-bit floats (I sure hope not), but if so we could also use similar naming here.
Where does the `__m80` come from? Consistency with what? x86 has no SIMD for 80-bit floats, for sure.
> I'm not exactly sure why the double-underscore convention was adopted for the x86 types, but if we're going for consistency, f80 should be called __m80.

The double underscore is likely because C reserves all identifiers starting with `__` for the implementation.

The `__m64`/`__m128`/... types are likely named for MMX (the predecessor to SSE); they are always SIMD types afaik. `f80` is not a SIMD type, so imho naming it `__m80` is incorrect.
> This comes with the issue that it does not match the Rust style of `f16`, `f32`, `f64`, `f128`.
>
> To echo something said by scottmcm on another thread: types representing 80-bit extended precision (`f80`) and double-double (`f64f64`) are specialized types that we want to make available but don't want to encourage common use of (they will forever live in `core::arch`), so they don't need to match Rust's primitive naming style. "Ugly" names along the lines of `__m128bh` are fine.
BTW, `__m128bh` comes from https://www.intel.com/content/www/us/en/docs/cpp-compiler/developer-guide-reference/2021-8/intrinsics-for-avx-512-bf16-instructions.html, and it's for SIMD, but `f80` and `f64f64` are for C FFI; that's a different story.
> `doubledouble` doesn't look weird, so neither would `f64f64`. The `p` of `f64pf64` is weird because someone new with no context won't know that `p` means "plus", and what is it plus? Is it `+`?
Hard disagree. `doubledouble` looks weird, same with `f64f64`. But yeah, the `p` is just confusing.
> The double underscore is likely because C reserves all identifiers starting with `__` for the implementation.

This isn't C, though.
> The `__m64`/`__m128`/... types are likely named for MMX (the predecessor to SSE); they are always SIMD types afaik. `f80` is not a SIMD type, so imho naming it `__m80` is incorrect.

Hmm, when poking around a few x86 references I found that people used `m80` or `m80fp` to refer to these float arguments, but I guess that was just a weird convention? I wasn't under the impression that the `m` here stood for MMX, but rather for memory, since x86 uses `rN` to refer to registers, `immN` to refer to immediates, and `mN` to refer to memory.

I guess if we wanted to go with the prefix meaning the instruction set, we could go with `fp80`, since that's closer to what Intel uses.
> The double underscore is likely because C reserves all identifiers starting with `__` for the implementation.
>
> This isn't C, though.

But the `__m128` naming comes from the x86 intrinsics, which are designed for C.
Just randomly saw this, and this is some good timing, because I have `rustc_apfloat` news: a bunch of `f80` bug fixes, and support for `bf16`, are included in:
- https://github.com/rust-lang/rustc_apfloat/pull/1

However, I would strongly advise staying away from `f64f64` aka `ppc_f128`. Unlike IEEE formats (which have all of their behavior parameterized by their exponent and significand bitwidths) and x87's 80-bit weird format (which is mostly IEEE-like outside of some weird extra NaNs in the form of non-subnormals lacking the "integer bit"), `llvm::APFloat`/`rustc_apfloat`'s support for the `f64f64`/`ppc_f128` "double-double" format lacks specialized implementations for many operations, relying instead on a lossy fallback to a custom IEEE-style format that cannot represent some of the nastier edge cases.

(IIRC `f64f64` aka `ppc_f128` aka "the uniquely weird PPC double-double format" allows its two `f64`s to have exponents so different that you would require a massive IEEE-style format to losslessly contain its effective significand - something like `f2113` if I had to guess, requiring `rustc_apfloat::ieee::IeeeFloat` to have `sig: [u128; 16]` instead of `sig: [u128; 1]` - but that's wasteful because most of those bits will always be `0`.)
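A minimal illustration of that edge case, using only plain `f64` arithmetic (the specific exponent gap of `2^-1000` is chosen arbitrarily for this sketch): a double-double pair may have components whose exponents are separated by far more than 53 bits, which a single `f64` sum cannot preserve and a single IEEE-style format could only capture with an enormous significand.

```rust
fn main() {
    // A legal double-double value: hi + lo with wildly separated exponents.
    let hi: f64 = 1.0;
    let lo: f64 = 2f64.powi(-1000);

    // The plain f64 sum silently drops `lo` entirely...
    assert_eq!(hi + lo, hi);
    // ...yet the pair (hi, lo) represents 1 + 2^-1000 exactly. A single
    // IEEE-style format would need a significand spanning ~1000 bits
    // to hold this value losslessly.
    assert!(lo > 0.0);
    println!("hi = {}, lo = {:e}", hi, lo);
}
```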
> Since `f80` is x86-only and `f64f64` is PowerPC-only, `x86_f80` and `ppc_f128` look clear and consistent with LLVM.

I think LLVM made a mistake here with `x86` (though I can kinda see why they chose that); `llvm::APFloat` correctly calls it `x87` - this is not a format that x86 in general will use, it's specifically the internal "transiently higher precision" format of the x87 FPU.

`x87_f80` and `fx87_80` both look kind of silly to me, though (see below for a better solution).
> I would strongly recommend using a variation on `f128` like `fx128`, `f128ppc`, `core::arch::power_pc::f128` or similar.

I was about to suggest that last one, i.e. scoping these under `core::arch::*` (though I would avoid calling it `128` - it's not 128 anything other than storage, it's "the sum of two standard IEEE `f64`s" - maybe `core::arch::power_pc::double_f64`?).

I think `core::arch::x86::f80` or `core::arch::x86::x87_f80` would work great.
> `bf16` as builtin type for the 'Brain floating format', widely used in machine learning, different from the IEEE-754 standard `binary16` representation

- Putting this in the global namespace seems very under-motivated. Can't users who need it `use core::...::bf16`?
- If "b" stands for "Brain" rather than (as I would have assumed) "binary", why give this the cryptic and generic name `bf16`, making it sound like a standard (albeit still niche) "binary float, 16-bit", rather than, say, `brain_f16`? Then machine learning projects can `use core::...::brain_f16 as bf16`.
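For context on how `bf16` differs from IEEE `binary16`: it keeps `f32`'s 8-bit exponent and shortens the mantissa to 7 bits, so it is (modulo rounding) just the top 16 bits of an `f32`. A rough sketch of a round-to-nearest-even conversion; the `Bf16` name and methods are invented here for illustration and are not the RFC's proposed API:

```rust
// Illustrative bfloat16 wrapper: 1 sign bit, 8 exponent bits, 7 mantissa bits.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Bf16(u16);

impl Bf16 {
    // Round-to-nearest-even conversion from f32 (NaN payloads confined to
    // the low 16 bits are not specially handled in this sketch).
    fn from_f32(x: f32) -> Self {
        let bits = x.to_bits();
        let round_bit = bits & 0x0000_8000; // highest discarded bit
        let sticky = bits & 0x0000_7FFF;    // remaining discarded bits
        let mut hi = (bits >> 16) as u16;
        // Round up if past halfway, or exactly halfway and the result is odd.
        if round_bit != 0 && (sticky != 0 || (hi & 1) != 0) {
            hi = hi.wrapping_add(1);
        }
        Bf16(hi)
    }

    fn to_f32(self) -> f32 {
        // Widening back is exact: just restore the low 16 zero bits.
        f32::from_bits((self.0 as u32) << 16)
    }
}

fn main() {
    // 1.0 survives the round trip exactly.
    assert_eq!(Bf16::from_f32(1.0).to_f32(), 1.0);
    // π (f32 bits 0x40490FDB) loses its low mantissa bits: 0x4049 -> 3.140625.
    assert_eq!(Bf16::from_f32(std::f32::consts::PI).to_f32(), 3.140625);
}
```

The exact-widening property (`bf16` to `f32` never rounds) is one reason the format is popular for machine-learning accelerators.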
I was under the impression that primitives were merely in the prelude, and their "primitive" nature simply came from the fact that they were associated with lang items. However, after looking at the prelude, this is not the case, and they are instead always present.
I understand the desire to make them work with literal suffixes, but could this not be allowed without bringing the types in scope? Or maybe only with the types in scope? Perhaps this can be affected by an edition bump.
The ideal way IMHO this would work is that you can always coerce a literal to the type, but in order to actually reference the type or use it via an explicit suffix, you'd have to import it. Perhaps the "explicit suffix" form might even be undesired and you would have to do it via some expression like `let x: bf16 = 1.0`.
One interesting functionality that `f64f64` and `f80` bring in is that both of these types have non-canonical representations. `f80` essentially has one bit that is completely determined by the other 79 bits, and if that bit is incorrectly set, it is a non-canonical number of some kind. IEEE 754 does have a concept of canonical and non-canonical numbers, but this applies only to `f80` (which is itself an implementation of an extended-precision binary64) and the decimal floating-point types. `f64f64` has various interesting kinds of non-canonical representations, but that is the detailed extent of my knowledge.

`f64f64` would be the first floating-point type added to Rust that cannot be described directly with IEEE 754 semantics (which are parameterized on a base / number of digits / maximum exponent basis); concepts like "number of mantissa digits" are not well-defined, and I don't know how this problem is solved in the C/C++ libraries that exist. This does add risks for representing this type.

IEEE 754-2019 adds a section on augmented arithmetic operations, which includes addition, subtraction, and multiplication, but not division (for reasons I don't know and will not speculate on). It may be the case that future versions will grow a more general double-double library functionality for extra precision.
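The "one bit determined by the other 79" point can be made concrete with a sketch of the x87 80-bit layout (1 sign bit, 15 exponent bits, and a 64-bit significand whose bit 63 is an explicit integer bit, unlike IEEE's implicit leading 1). The `X87F80` struct and `is_canonical` helper below are invented for illustration, not a proposed API:

```rust
// Sketch of the x87 80-bit extended-precision layout.
struct X87F80 {
    sign_exp: u16,    // sign bit + 15-bit biased exponent
    significand: u64, // bit 63 is the explicit integer bit
}

impl X87F80 {
    // The integer bit is fully determined by the exponent field: clear for
    // zero/subnormals, set for everything else. Mismatches are the
    // non-canonical "unnormal"/"pseudo-denormal" encodings.
    fn is_canonical(&self) -> bool {
        let biased_exp = self.sign_exp & 0x7FFF;
        let integer_bit = (self.significand >> 63) != 0;
        if biased_exp == 0 { !integer_bit } else { integer_bit }
    }
}

fn main() {
    // 1.0: biased exponent 0x3FFF (bias 16383), integer bit set.
    let one = X87F80 { sign_exp: 0x3FFF, significand: 1 << 63 };
    assert!(one.is_canonical());
    // "Unnormal": nonzero exponent but integer bit clear; invalid on
    // 387-and-later FPUs.
    let unnormal = X87F80 { sign_exp: 0x3FFF, significand: 1 << 62 };
    assert!(!unnormal.is_canonical());
}
```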
> However, I would strongly advise staying away from `f64f64` aka `ppc_f128`. Unlike IEEE formats (which have all of their behavior parameterized by their exponent and significand bitwidths) and x87's 80-bit weird format (which is mostly IEEE-like outside of some weird extra NaNs in the form of non-subnormals lacking the "integer bit"), `llvm::APFloat`/`rustc_apfloat`'s support for the `f64f64`/`ppc_f128` "double-double" format lacks specialized implementations for many operations, relying instead on a lossy fallback to a custom IEEE-style format that cannot represent some of the nastier edge cases.

Is a complete softfloat implementation strictly necessary? We could just forbid operations on `ppc_f128` in `const` contexts.
I think that is the goal - everything here (except for `bf16`) would be in `std::arch`, only available wherever there is hardware support.
Better to split `bf16` out of this. I think the main reason for `f64f64` and `f80` is to keep ABI compatibility with existing C libraries, but `bf16` is not just for ABI compatibility; it is also for acceleration.
At this point, I've personally come to expect that Rust types named `f*` represent an IEEE 754 basic/interchange format. I'm not sure if naming the x87 floats `f80` would be clear enough in distinguishing these from the cross-platform types. Thus, to make it clear that the type is platform-specific, I think I'd prefer the type to be named `x87_f80`, `x86_f80` or similar.

Because these live in `core::arch::...`, you can establish the convention of always using them as `x86::f80` instead of importing the type directly.