Unums.jl

Bitstype design

Open tbreloff opened this issue 9 years ago • 2 comments

Following up on tonight's call, I wanted to do a quick review of my design decisions to date and go over some likely extensions to the basic design. I want collaboration on this; I likely don't have enough bandwidth to see it all the way to a finished product myself.

First, the primary abstraction is AbstractUnum{B,ESS,FSS}, with type parameters B = base, ESS = exponent size size, and FSS = fraction size size. I think the "64" may belong as a parameter as well, but I can change that later when needed. The important point is that the core type is always exactly 64 bits, whether or not all of the bits are used, and bitstypes in Julia have access to some great low-level operations. Unum64 is just a typealias for the bitstype FixedUnum64{2,4,5}.

bitstype 64 FixedUnum64{B,ESS,FSS} <: AbstractUnum{B,ESS,FSS}
typealias BinaryUnum64{ESS,FSS}   FixedUnum64{2,ESS,FSS}
typealias Unum64                  BinaryUnum64{4,5}
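
For readers on current Julia, where the bitstype and typealias keywords no longer exist, the same three declarations might be written roughly as follows. This is a sketch, not code from the package. The supertype of AbstractUnum is not shown in this post, so it is left out here, and the reinterpret round trip at the end is only an illustration.

```julia
# Sketch in Julia 1.x syntax; the post itself uses Julia 0.4 syntax.
# Assumption: AbstractUnum has no declared supertype here, because the
# post does not show one.
abstract type AbstractUnum{B,ESS,FSS} end

# A 64-bit primitive type; the parameters only tag the bit layout.
primitive type FixedUnum64{B,ESS,FSS} <: AbstractUnum{B,ESS,FSS} 64 end

const BinaryUnum64{ESS,FSS} = FixedUnum64{2,ESS,FSS}
const Unum64 = BinaryUnum64{4,5}

# Raw bit patterns move in and out with reinterpret.
u   = reinterpret(Unum64, 0x0000000000000400)
raw = reinterpret(UInt64, u)
```

Because the value is a plain 64-bit primitive, reinterpret to and from UInt64 is free, which is what makes the mask-and-shift style used later in this post practical.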

For each concrete unum type, a UnumInfo specialized on that type is created during code generation:


type UnumInfo{U<:AbstractUnum}
  base::Int
  nbits::Int
  maxesize::Int
  maxfsize::Int
  esizesize::Int
  fsizesize::Int
  utagsize::Int

  signbitpos::Int
  epos::Int
  fpos::Int
  ubitpos::Int
  esizepos::Int
  fsizepos::Int

  signbitmask::U
  emask::U
  fmask::U
  efmask::U
  ubitmask::U
  esizemask::U
  fsizemask::U
  efsizemask::U
  utagmask::U

  zero::U      # exact zero
  poszero::U   # inexact positive zero
  negzero::U   # inexact negative zero
  posinf::U    # exact positive inf
  neginf::U    # exact negative inf
  mostpos::U   # exact maximum positive real
  leastpos::U  # exact minimum positive real
  mostneg::U   # exact minimum negative real
  leastneg::U  # exact maximum negative real
  nan::U       # this is "quiet NaN" from the book
  null::U      # this is "signaling NaN" from the book... can maybe repurpose to replace Nullable

  UINT::DataType
  INT::DataType

  UnumInfo() = new()
end

and it looks like:

julia> using Unums

julia> c = Unums.unumConstants(Unum64)
UnumInfo{Unums.FixedUnum64{2,4,5}}:
             base      2
            nbits     64
         maxesize     16
         maxfsize     32
        esizesize      4
        fsizesize      5
         utagsize     10
       signbitpos     64
             epos     58
             fpos     42
          ubitpos     10
         esizepos      9
         fsizepos      5
      signbitmask 1000000000000000000000000000000000000000000000000000000000000000
            emask 0000001111111111111111000000000000000000000000000000000000000000
            fmask 0000000000000000000000111111111111111111111111111111110000000000
           efmask 0000001111111111111111111111111111111111111111111111110000000000
         ubitmask 0000000000000000000000000000000000000000000000000000001000000000
        esizemask 0000000000000000000000000000000000000000000000000000000111100000
        fsizemask 0000000000000000000000000000000000000000000000000000000000011111
       efsizemask 0000000000000000000000000000000000000000000000000000000111111111
         utagmask 0000000000000000000000000000000000000000000000000000001111111111
             zero 0000000000000000000000000000000000000000000000000000000000000000
          poszero 0000000000000000000000000000000000000000000000000000001000000000
          negzero 1000000000000000000000000000000000000000000000000000001000000000
           posinf 0000001111111111111111111111111111111111111111111111110111111111
           neginf 1000001111111111111111111111111111111111111111111111110111111111
          mostpos 0000001111111111111111111111111111111111111111111111100111111111
         leastpos 0000000000000000000000000000000000000000000000000000010000000000
          mostneg 1000001111111111111111111111111111111111111111111111100111111111
         leastneg 1000000000000000000000000000000000000000000000000000010000000000
              nan 0000001111111111111111111111111111111111111111111111111111111111
             null 1000001111111111111111111111111111111111111111111111111111111111

The important masks and special unums in this type are calculated once, during code generation, for each specific concrete bitstype, so they end up hardcoded in the compiled code.
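
To make the listing above easier to check, here is a minimal sketch of how the sizes, positions, and masks could be derived from ESS and FSS alone. It is written in current Julia syntax and works on plain UInt64 values. The names fieldmask and unumlayout are hypothetical and are not the package's API; positions are 1-based from the least significant bit, as in the listing.

```julia
# Hypothetical helper (not from Unums.jl): a mask of `width` ones whose
# highest bit sits at 1-based position `top` (position 1 = least significant).
fieldmask(top::Int, width::Int) =
    ((UInt64(1) << width) - UInt64(1)) << (top - width)

# Derive sizes, positions, and masks for a binary 64-bit layout.
function unumlayout(ESS::Int, FSS::Int)
    maxesize = 2^ESS            # widest exponent field, in bits
    maxfsize = 2^FSS            # widest fraction field, in bits
    utagsize = 1 + ESS + FSS    # ubit + esize field + fsize field

    # Fields are packed upward from the least significant bit.
    fsizepos = FSS
    esizepos = fsizepos + ESS
    ubitpos  = esizepos + 1
    fpos     = ubitpos + maxfsize
    epos     = fpos + maxesize
    epos < 64 || error("layout needs more than 64 bits")

    return (
        utagsize    = utagsize,
        epos        = epos,
        fpos        = fpos,
        ubitpos     = ubitpos,
        signbitmask = fieldmask(64, 1),
        emask       = fieldmask(epos, maxesize),
        fmask       = fieldmask(fpos, maxfsize),
        ubitmask    = fieldmask(ubitpos, 1),
        esizemask   = fieldmask(esizepos, ESS),
        fsizemask   = fieldmask(fsizepos, FSS),
    )
end

l = unumlayout(4, 5)            # the Unum64 environment
println(bitstring(l.emask))
println(bitstring(l.fmask))
```

For ESS = 4 and FSS = 5 this gives epos = 58, fpos = 42, and ubitpos = 10, matching the listing, and it leaves five unused bits between the sign bit and the exponent field.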

The intended usage looks something like this:

"exponent of the unum"
@generated function Base.exponent{U<:AbstractUnum}(u::U)
  c = unumConstants(U)
  :(reinterpret($(c.INT), u & $(c.emask)) >> $(c.fpos))
end

For those who don't know generated functions (also called staged functions): the first time the method is called with a given concrete unum type, the generator body runs with the argument's type (not its value). During that first call the UnumInfo for the type is populated, and the quoted expression on the last line is returned as the actual method body. That body is then compiled as if you had hardcoded every constant by hand, so the machine code is specific to this bit layout.
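
The same idea can be tried on current Julia with a self-contained sketch. It does not use UnumInfo or unumConstants; instead the mask and shift are computed inline from the type parameters, and the function works on the raw UInt64 bits. The name rawexponent is hypothetical, chosen so that Base.exponent is not extended.

```julia
# Julia 1.x sketch; type declarations repeated so the block runs on its own.
abstract type AbstractUnum{B,ESS,FSS} end
primitive type FixedUnum64{B,ESS,FSS} <: AbstractUnum{B,ESS,FSS} 64 end
const Unum64 = FixedUnum64{2,4,5}

# Hypothetical stand-in for the post's Base.exponent method.
@generated function rawexponent(u::FixedUnum64{B,ESS,FSS}) where {B,ESS,FSS}
    # This part runs once per concrete type; in here `u` is the type.
    maxesize = 2^ESS
    fpos     = 1 + ESS + FSS + 2^FSS          # utag bits plus fraction bits
    emask    = ((UInt64(1) << maxesize) - UInt64(1)) << fpos
    # The returned expression becomes the method body, constants spliced in.
    return :((reinterpret(UInt64, u) & $emask) >> $fpos)
end

u = reinterpret(Unum64, 0x03fffc0000000000)   # the emask bit pattern
println(Int(rawexponent(u)))
```

With the all-ones exponent pattern this should print 65535, the same value the post reports for exponent(u) on c.emask.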

To give you an idea of what this means... here's the native code for that specialized method:

julia> u = c.emask
bits: 0000001111111111111111000000000000000000000000000000000000000000
|    0    | 1111111111111111 | 00000000000000000000000000000000 |  0   |  0000   |  00000  | 
| signbit |       exp        |               frac               | ubit | esize-1 | fsize-1 | 

julia> exponent(u)
65535

julia> @code_native exponent(u)
        .text
Filename: /home/tom/.julia/v0.4/Unums/src/ops.jl
Source line: 12
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 12
        shrq    $42, %rdi
        movzwl  %di, %eax
        popq    %rbp
        ret

and now look at the native code for Float64...

julia> fl = Float64(100000)
100000.0

julia> @code_native exponent(fl)
        .text
Filename: math.jl
Source line: 199
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 199
        vmovd   %xmm0, %rcx
        movabsq $9218868437227405312, %rdx # imm = 0x7FF0000000000000
Source line: 200
        movq    %rcx, %rax
        andq    %rdx, %rax
        je      L49
Source line: 201
        cmpq    %rdx, %rax
        je      L117
        shrq    $52, %rax
        jmpq    L108
L49:    vxorpd  %xmm1, %xmm1, %xmm1
Source line: 203
        vucomisd        %xmm1, %xmm0
        jne     L74
        jp      L74
        jmpq    L144
L74:    movabsq $4503599627370495, %rax # imm = 0xFFFFFFFFFFFFF
Source line: 204
        andq    %rax, %rcx
        movl    $127, %edx
Source line: 205
        bsrq    %rcx, %rax
        cmoveq  %rdx, %rax
        xorq    $-64, %rax
Source line: 206
        addq    $13, %rax
Source line: 210
L108:   addq    $-1023, %rax            # imm = 0xFFFFFFFFFFFFFC01
        popq    %rbp
        ret
Source line: 208
L117:   movabsq $jl_throw_with_superfluous_argument, %rax
        movabsq $140108740444384, %rdi  # imm = 0x7F6D9BB440E0
        movl    $208, %esi
        callq   *%rax
Source line: 203
L144:   movabsq $jl_throw_with_superfluous_argument, %rax
        movabsq $140108740444384, %rdi  # imm = 0x7F6D9BB440E0
        movl    $203, %esi
        callq   *%rax

The idea for the general implementation is that unum operations would generate specialized implementations for each "unum environment". Since we are dealing with bitstypes, we can use many low-level bit operations and at the same time keep a relatively compact memory footprint, never more than 64 bits in the ulayer.

One idea for extending the abstract unum is EMIN, which (combined with the parameterized base) opens up possibilities for decimal and integer unums under the exact same framework/code. See #4.

Obviously this isn't a finished implementation, lacking basic conversions and operations, but I think I have just enough implemented to show the overarching design that I envision. Comments please!

tbreloff, Oct 13 '15 02:10

Call? Would love to be a part of any discussions of unums. This is looking very good!

ScottPJones, Oct 15 '15 21:10

I'm curious where the Unum work is currently going on. I've seen @dpsanders' https://github.com/dpsanders/SimpleUnums.jl recently, and https://github.com/REX-Computing/unumjl. I also had a small idea about the parameterization: instead of parameterizing by 2 or 10, parameterize by a type (see https://github.com/JuliaLang/julia/pull/14251 for where this might fit in). You could have something like the following:

abstract UnumFormat <: NumericFormat
abstract UnumBinaryFmt{ESS,FSS} <: UnumFormat
abstract UnumDecimalFmt{ESS,FSS} <: UnumFormat
...
bitstype 64  Unum64  <: AbstractFloat{UnumBinaryFmt{3,4}}
bitstype 128 Unum128 <: AbstractFloat{UnumBinaryFmt{4,7}}
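
A rough way to experiment with this on current Julia, where AbstractFloat takes no parameters, is sketched below. NumericFormat and FormattedFloat are hypothetical stand-ins for what the linked pull request would provide, DecUnum64 is an invented example type, and numfmt, radix, and esizesize are illustrative helper names only.

```julia
# Hypothetical stand-ins: Base has no NumericFormat or AbstractFloat{F}.
abstract type NumericFormat end
abstract type UnumFormat <: NumericFormat end
abstract type UnumBinaryFmt{ESS,FSS}  <: UnumFormat end
abstract type UnumDecimalFmt{ESS,FSS} <: UnumFormat end

abstract type FormattedFloat{F<:NumericFormat} end   # plays AbstractFloat{F}

primitive type Unum64    <: FormattedFloat{UnumBinaryFmt{3,4}}  64 end
primitive type DecUnum64 <: FormattedFloat{UnumDecimalFmt{3,4}} 64 end

# Recover the format type from a number type.
numfmt(::Type{<:FormattedFloat{F}}) where {F} = F

# The base comes from dispatch on the format instead of an integer parameter.
radix(::Type{<:UnumBinaryFmt})  = 2
radix(::Type{<:UnumDecimalFmt}) = 10
radix(T::Type{<:FormattedFloat}) = radix(numfmt(T))

# ESS is read off the format's own type parameters.
esizesize(::Type{<:UnumBinaryFmt{ESS,FSS}}) where {ESS,FSS} = ESS
esizesize(::Type{<:UnumDecimalFmt{ESS,FSS}}) where {ESS,FSS} = ESS
esizesize(T::Type{<:FormattedFloat}) = esizesize(numfmt(T))

println(radix(Unum64), " ", radix(DecUnum64), " ", esizesize(Unum64))
```

The point of the design is that binary and decimal unums share one code path and differ only in which format type they carry.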

ScottPJones, Dec 11 '15 15:12