Unums.jl
Bitstype design
Following up on tonight's call, I wanted to do a quick review of my design decisions to date and go over some likely extensions to the basic design. I want collaboration on this, since I likely don't have enough bandwidth to see it all the way to a finished product myself.
First, the primary abstraction is AbstractUnum{B,ESS,FSS}, with type parameters B = base, ESS = exponent size size, and FSS = fraction size size. I think the "64" may belong as a parameter as well, but I can change that later when needed. The important note is that the core type is always exactly 64 bits, whether or not all of those bits are used, and bitstypes in Julia have access to some great low-level operations. Unum64 is just a typealias for the bitstype FixedUnum64{2,4,5}:
bitstype 64 FixedUnum64{B,ESS,FSS} <: AbstractUnum{B,ESS,FSS}
typealias BinaryUnum64{ESS,FSS} FixedUnum64{2,ESS,FSS}
typealias Unum64 BinaryUnum64{4,5}
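For anyone reading this on a current Julia release, where the `bitstype` and `typealias` keywords no longer exist, the same hierarchy can be sketched roughly as below. This is a translation for illustration only, not code from the package. The thread does not say what AbstractUnum's supertype is, so none is declared here, and the three variables at the end are just names chosen for this sketch.

```julia
# Sketch in current Julia syntax; the thread itself uses 0.4-era keywords.
# The supertype of AbstractUnum is not stated in the thread, so none is given.
abstract type AbstractUnum{B,ESS,FSS} end

# `bitstype 64 T` became `primitive type T 64 end`
primitive type FixedUnum64{B,ESS,FSS} <: AbstractUnum{B,ESS,FSS} 64 end

# `typealias` became a plain (optionally parametric) const binding
const BinaryUnum64{ESS,FSS} = FixedUnum64{2,ESS,FSS}
const Unum64 = BinaryUnum64{4,5}

# The alias names the very same type, and every instance occupies 8 bytes
# regardless of how many of the 64 bits the chosen ESS/FSS actually use.
same_type  = Unum64 === FixedUnum64{2,4,5}
byte_size  = sizeof(Unum64)
plain_bits = isbitstype(Unum64)
```

The point of the check is that the storage size is fixed by the primitive type declaration, while B, ESS, and FSS only change how the bits are interpreted.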
For each concrete unum type, a specialized type is created during code generation:
type UnumInfo{U<:AbstractUnum}
    base::Int
    nbits::Int
    maxesize::Int
    maxfsize::Int
    esizesize::Int
    fsizesize::Int
    utagsize::Int
    signbitpos::Int
    epos::Int
    fpos::Int
    ubitpos::Int
    esizepos::Int
    fsizepos::Int
    signbitmask::U
    emask::U
    fmask::U
    efmask::U
    ubitmask::U
    esizemask::U
    fsizemask::U
    efsizemask::U
    utagmask::U
    zero::U      # exact zero
    poszero::U   # inexact positive zero
    negzero::U   # inexact negative zero
    posinf::U    # exact positive inf
    neginf::U    # exact negative inf
    mostpos::U   # exact maximum positive real
    leastpos::U  # exact minimum positive real
    mostneg::U   # exact minimum negative real
    leastneg::U  # exact maximum negative real
    nan::U       # this is "quiet NaN" from the book
    null::U      # this is "signaling NaN" from the book... can maybe repurpose to replace Nullable
    UINT::DataType
    INT::DataType
    UnumInfo() = new()
end
and it looks like:
julia> using Unums
julia> c = Unums.unumConstants(Unum64)
UnumInfo{Unums.FixedUnum64{2,4,5}}:
base 2
nbits 64
maxesize 16
maxfsize 32
esizesize 4
fsizesize 5
utagsize 10
signbitpos 64
epos 58
fpos 42
ubitpos 10
esizepos 9
fsizepos 5
signbitmask 1000000000000000000000000000000000000000000000000000000000000000
emask 0000001111111111111111000000000000000000000000000000000000000000
fmask 0000000000000000000000111111111111111111111111111111110000000000
efmask 0000001111111111111111111111111111111111111111111111110000000000
ubitmask 0000000000000000000000000000000000000000000000000000001000000000
esizemask 0000000000000000000000000000000000000000000000000000000111100000
fsizemask 0000000000000000000000000000000000000000000000000000000000011111
efsizemask 0000000000000000000000000000000000000000000000000000000111111111
utagmask 0000000000000000000000000000000000000000000000000000001111111111
zero 0000000000000000000000000000000000000000000000000000000000000000
poszero 0000000000000000000000000000000000000000000000000000001000000000
negzero 1000000000000000000000000000000000000000000000000000001000000000
posinf 0000001111111111111111111111111111111111111111111111110111111111
neginf 1000001111111111111111111111111111111111111111111111110111111111
mostpos 0000001111111111111111111111111111111111111111111111100111111111
leastpos 0000000000000000000000000000000000000000000000000000010000000000
mostneg 1000001111111111111111111111111111111111111111111111100111111111
leastneg 1000000000000000000000000000000000000000000000000000010000000000
nan 0000001111111111111111111111111111111111111111111111111111111111
null 1000001111111111111111111111111111111111111111111111111111111111
This type has the important masks and special unums calculated and hardcoded at compile time, each one for the specific concrete bitstype.
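To make the table above easier to follow, here is a minimal sketch of how those positions, masks, and a few of the special values could be derived from ESS and FSS alone. The helper name `unumlayout` and its inner `bits` function are hypothetical and not part of Unums.jl. The field order (sign bit at the top, utag in the lowest bits, unused bits between sign and exponent) is read off the printed output rather than taken from the package source, and the code uses current Julia syntax.

```julia
# Hypothetical helper (not part of Unums.jl): derive positions and masks for a
# 64-bit binary unum from ESS and FSS. Positions are 1-based from the lowest
# bit, matching the printed UnumInfo above.
function unumlayout(ess::Int, fss::Int)
    maxesize = 2^ess                 # widest exponent field
    maxfsize = 2^fss                 # widest fraction field
    fsizepos = fss                   # top bit of the fsize-1 field
    esizepos = fss + ess             # top bit of the esize-1 field
    ubitpos  = esizepos + 1          # ubit sits just above the size fields
    fpos     = ubitpos + maxfsize    # top bit of the fraction
    epos     = fpos + maxesize       # top bit of the exponent
    epos < 64 || error("fields do not fit below the sign bit")

    # mask with bits lo..hi set (inclusive, 1-based)
    bits(hi, lo) = ((one(UInt64) << (hi - lo + 1)) - one(UInt64)) << (lo - 1)

    signbitmask = one(UInt64) << 63
    emask       = bits(epos, fpos + 1)
    fmask       = bits(fpos, ubitpos + 1)
    ubitmask    = bits(ubitpos, ubitpos)
    efsizemask  = bits(esizepos, 1)
    efmask      = emask | fmask
    lowfrac     = bits(ubitpos + 1, ubitpos + 1)  # lowest fraction bit

    # special values, following the bit patterns in the printed table
    posinf   = efmask | efsizemask   # exponent, fraction, size bits set; ubit clear
    nan      = posinf | ubitmask     # same pattern with the ubit set
    mostpos  = posinf & ~lowfrac     # posinf with the lowest fraction bit cleared
    leastpos = lowfrac               # only the lowest fraction bit
    poszero  = ubitmask              # inexact zero: just the ubit

    return (; epos, fpos, ubitpos, esizepos, fsizepos, signbitmask, emask,
            fmask, efmask, ubitmask, efsizemask, posinf, nan, mostpos,
            leastpos, poszero)
end

c = unumlayout(4, 5)
```

For ESS = 4 and FSS = 5 this gives epos = 58, fpos = 42, and ubitpos = 10, the same values shown for Unum64 above.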
The intended usage looks something like this:
"exponent of the unum"
@generated function Base.exponent{U<:AbstractUnum}(u::U)
    c = unumConstants(U)
    :(reinterpret($(c.INT), u & $(c.emask)) >> $(c.fpos))
end
For those who don't know generated functions (also called staged functions): the function body receives the concrete unum type of each argument, not its value. During the very first call to the method with those types, the UnumInfo type will be populated, and super-efficient machine code can be generated specific to this bit layout. The last line of the function is the expression that becomes the actual method body, and it is compiled as if you had hardcoded everything by hand.
To give you an idea of what this means... here's the native code for that specialized method:
julia> u = c.emask
bits: 0000001111111111111111000000000000000000000000000000000000000000
| 0 | 1111111111111111 | 00000000000000000000000000000000 | 0 | 0000 | 00000 |
| signbit | exp | frac | ubit | esize-1 | fsize-1 |
julia> exponent(u)
65535
julia> @code_native exponent(u)
.text
Filename: /home/tom/.julia/v0.4/Unums/src/ops.jl
Source line: 12
pushq %rbp
movq %rsp, %rbp
Source line: 12
shrq $42, %rdi
movzwl %di, %eax
popq %rbp
ret
and now look at the native code for Float64...
julia> fl = Float64(100000)
100000.0
julia> @code_native exponent(fl)
.text
Filename: math.jl
Source line: 199
pushq %rbp
movq %rsp, %rbp
Source line: 199
vmovd %xmm0, %rcx
movabsq $9218868437227405312, %rdx # imm = 0x7FF0000000000000
Source line: 200
movq %rcx, %rax
andq %rdx, %rax
je L49
Source line: 201
cmpq %rdx, %rax
je L117
shrq $52, %rax
jmpq L108
L49: vxorpd %xmm1, %xmm1, %xmm1
Source line: 203
vucomisd %xmm1, %xmm0
jne L74
jp L74
jmpq L144
L74: movabsq $4503599627370495, %rax # imm = 0xFFFFFFFFFFFFF
Source line: 204
andq %rax, %rcx
movl $127, %edx
Source line: 205
bsrq %rcx, %rax
cmoveq %rdx, %rax
xorq $-64, %rax
Source line: 206
addq $13, %rax
Source line: 210
L108: addq $-1023, %rax # imm = 0xFFFFFFFFFFFFFC01
popq %rbp
ret
Source line: 208
L117: movabsq $jl_throw_with_superfluous_argument, %rax
movabsq $140108740444384, %rdi # imm = 0x7F6D9BB440E0
movl $208, %esi
callq *%rax
Source line: 203
L144: movabsq $jl_throw_with_superfluous_argument, %rax
movabsq $140108740444384, %rdi # imm = 0x7F6D9BB440E0
movl $203, %esi
callq *%rax
The idea for the general implementation is that unum operations would generate specialized implementations for each "unum environment". Since we are dealing with bitstypes, we can use many low-level bit operations and at the same time keep a relatively compact memory footprint, never more than 64 bits in the ulayer.
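As a rough illustration of per-environment specialization, the sketch below runs one generated function against two environments, {4,5} and {3,4}. It is written in current Julia syntax with stand-in type definitions, and `rawexponent` is a hypothetical name, not the package's `exponent` method. It also assumes the {3,4} environment uses the same right-aligned field order as the Unum64 output shown earlier, and it computes the mask inline instead of going through unumConstants.

```julia
# Stand-ins for the thread's types in current Julia syntax (illustrative only).
abstract type AbstractUnum{B,ESS,FSS} end
primitive type FixedUnum64{B,ESS,FSS} <: AbstractUnum{B,ESS,FSS} 64 end

# Hypothetical generated function. The generator sees only the type parameters,
# computes the shift and mask for that environment once, and splices them into
# the returned expression as literals.
@generated function rawexponent(u::FixedUnum64{B,ESS,FSS}) where {B,ESS,FSS}
    shift = 1 + ESS + FSS + 2^FSS    # number of bits below the exponent field
    mask  = ((one(UInt64) << 2^ESS) - one(UInt64)) << shift
    return :((reinterpret(UInt64, u) & $mask) >> $shift)
end

# Two environments, two specialized method bodies from the same source.
u45 = reinterpret(FixedUnum64{2,4,5}, UInt64(0xffff) << 42)  # all-ones exponent in {4,5}
u34 = reinterpret(FixedUnum64{2,3,4}, UInt64(0xff) << 24)    # all-ones exponent in {3,4}

e45 = rawexponent(u45)
e34 = rawexponent(u34)
```

Each call compiles down to a mask and a shift with constants specific to its environment (42 for {4,5}, 24 for {3,4}), which is the effect the native code listing for Unum64 shows.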
One idea for extending the abstract unum: an EMIN parameter, which (combined with the parameterized base) opens up possibilities for decimal and integer unums under the exact same framework/code. See #4.
Obviously this isn't a finished implementation, lacking basic conversions and operations, but I think I have just enough implemented to show the overarching design that I envision. Comments please!
Call? Would love to be a part of any discussions of unums. This is looking very good!
I'm curious now where the Unum work is currently going on. I've recently seen @dpsanders' https://github.com/dpsanders/SimpleUnums.jl and https://github.com/REX-Computing/unumjl. I also had a small idea about the parameterization: instead of parameterizing by 2 or 10, parameterize by a type (see https://github.com/JuliaLang/julia/pull/14251 for where this might fit in). You could have something like the following:
abstract UnumFormat <: NumericFormat
abstract UnumBinaryFmt{ESS,FSS} <: UnumFormat
abstract UnumDecimalFmt{ESS,FSS} <: UnumFormat
...
bitstype 64 Unum64 <: AbstractFloat{UnumBinaryFmt{3,4}}
bitstype 128 Unum128 <: AbstractFloat{UnumBinaryFmt{4,7}}
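A small sketch of how code might consume such format types, in current Julia syntax. `NumericFormat` and a parametric `AbstractFloat` come from the linked PR and are not assumed to exist here, so the format hierarchy gets a local root type. The accessor names `fmtbase` and `fmtsizes` are hypothetical.

```julia
# Hypothetical sketch of the format-as-type idea. `NumericFormat` belongs to
# the linked PR, so a local root type is used here instead.
abstract type UnumFormat end
abstract type UnumBinaryFmt{ESS,FSS}  <: UnumFormat end
abstract type UnumDecimalFmt{ESS,FSS} <: UnumFormat end

# The base is recovered by dispatch on the format type rather than carried
# around as an integer parameter.
fmtbase(::Type{<:UnumBinaryFmt})  = 2
fmtbase(::Type{<:UnumDecimalFmt}) = 10

# ESS and FSS remain ordinary type parameters of the format.
fmtsizes(::Type{<:UnumBinaryFmt{ESS,FSS}}) where {ESS,FSS}  = (ESS, FSS)
fmtsizes(::Type{<:UnumDecimalFmt{ESS,FSS}}) where {ESS,FSS} = (ESS, FSS)

b = fmtbase(UnumBinaryFmt{3,4})
s = fmtsizes(UnumDecimalFmt{4,7})
```

With this arrangement a new base would be a new format type plus a method, without touching the integer parameters of existing types.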