Add support for variable arguments (va_list)

Open PathogenDavid opened this issue 3 years ago • 2 comments

Splitting this issue out from https://github.com/InfectedLibraries/Biohazrd/issues/16 as supporting va_list is much simpler than supporting ... variable arguments.

va_list is already somewhat usable. For example (on Windows), the following C++ function:

void TextV(const char* fmt, va_list args);

is translated as:

public static extern void TextV(byte* fmt, byte* args);

(TODO: Everything below applies to the Microsoft ABI with the CRT implementation of va_list. Need to investigate whether others are the same.)

You can pass the variable argument parameters by taking the address of a tuple and casting it to byte*:

(double, double) parameters = (16f, 60f);
imgui.TextV(PinnedUtf8("Application average %.3f ms/frame (%.1f FPS)"), (byte*)&parameters);

It'd be fairly simple to provide a friendly generic overload like this:

public static void TextV<TArgs>(byte* fmt, in TArgs args)
    where TArgs : unmanaged, ITuple
{
    fixed (TArgs* argsP = &args)
    { TextV(fmt, (byte*)argsP); }
}

Which makes invocation look more natural:

imgui.TextV(PinnedUtf8("Application average %.3f ms/frame (%.1f FPS)"), (16.0, 60.0));

You can't really generate a params overload like you'd expect in C# without a bunch of checks and data shuffling at runtime, but in theory you could generate a bunch of different overloads which call this one:

public static void TextV<T0, T1>(byte* fmt, T0 arg0, T1 arg1)
    where T0 : unmanaged
    where T1 : unmanaged
    => TextV(fmt, (arg0, arg1));

which has a very natural usage of:

imgui.TextV(PinnedUtf8("Application average %.3f ms/frame (%.1f FPS)"), 16.0, 60.0);

These examples do highlight an important pitfall of these APIs: C++ developers typically pass a float for %f, and the compiler implicitly widens it to a double at the call site. A caller building the argument block by hand has to do that widening itself, which is why the examples above pass doubles. (TODO: Figure out where this is specified. Is it Windows-specific?)
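
On the question of where the widening comes from: it matches the default argument promotions that the C and C++ standards apply to any argument passed through `...` (float becomes double, and integer types narrower than int become int), so it should not be Windows-specific. A minimal C++ sketch of the effect follows; the helper names `SumPromoted` and `SumTwoFloats` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: sums `count` variadic arguments, reading each as double.
// A float passed through `...` has already been promoted to double by the
// caller, so va_arg must ask for double (asking for float is undefined behavior).
double SumPromoted(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);

    double total = 0.0;
    for (int i = 0; i < count; i++)
    { total += va_arg(args, double); }

    va_end(args);
    return total;
}

// The call site passes two floats; the callee still receives two doubles.
double SumTwoFloats()
{
    float frameTime = 16.0f;
    float framesPerSecond = 60.0f;
    return SumPromoted(2, frameTime, framesPerSecond);
}
```

A hand-built argument block has no compiler applying these promotions, so a generated overload (or an analyzer) would need to reject or widen float, short, and similar types before they reach the native function.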

First class variable argument support likely needs to be accompanied by an analyzer to help ensure proper usage.

PathogenDavid avatar Feb 22 '21 03:02 PathogenDavid

Following up on the TODO from the original post ("Everything below applies to the Microsoft ABI with the CRT implementation of va_list. Need to investigate whether others are the same."):

I briefly looked at Linux and friends today. The first thing of note is that the Clang version of stdarg.h is not nearly as transparent as the Microsoft CRT implementation: va_list is simply an alias for the opaque __builtin_va_list.

Looking at the System V x86-64 ABI, va_list is defined as follows in figure 3.34:

typedef struct {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
} va_list[1];

As you can see, it's quite a bit more complex than Microsoft's method of "dump everything on the stack and pass a pointer".

If you read the details on va_arg, you can see there's logic to try to get arguments from the "registers". It isn't actually reading them from the registers. As part of va_start, every extra argument that was passed in a register is copied onto the stack in a special "register save area", and va_arg reads them from there, separately from the arguments that were passed on the stack. (If you look at the actual variable argument calls, arguments which would be passed in registers for a non-variadic function still are.)

You can see this in this Godbolt:

#include <stdarg.h>

void TestV(const char* fmt, va_list args);

void Test(const char* fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    TestV(fmt, args);
    va_end(args);
}

void DoTest()
{
    Test("%d", 3226);
}

Notice that DoTest is passing 3226 via esi which is later copied to the stack by mov qword ptr [rsp + 40], rsi.
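
The va_arg lookup described above can be modeled in a few lines. This is only a sketch under stated assumptions: `SysVVaList` mirrors the struct from figure 3.34, `NextIntegerArg` is a hypothetical name, and it handles only integer-class arguments of 8 bytes or less (no floating point, no structs, no over-aligned types).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Mirrors the System V x86-64 va_list layout quoted above (figure 3.34).
struct SysVVaList
{
    unsigned int gp_offset;
    unsigned int fp_offset;
    void* overflow_arg_area;
    void* reg_save_area;
};

// The save area holds 6 general purpose registers of 8 bytes each.
constexpr unsigned int GpSaveAreaSize = 6 * 8;

// Hypothetical model of va_arg for one integer-class argument (<= 8 bytes).
std::uint64_t NextIntegerArg(SysVVaList& list)
{
    std::uint64_t value;
    if (list.gp_offset < GpSaveAreaSize)
    {
        // The argument arrived in a register and was spilled to the save area.
        assert(list.reg_save_area != nullptr);
        std::memcpy(&value, static_cast<char*>(list.reg_save_area) + list.gp_offset, sizeof(value));
        list.gp_offset += 8;
    }
    else
    {
        // Register slots are used up: read from the caller's stack instead.
        assert(list.overflow_arg_area != nullptr);
        std::memcpy(&value, list.overflow_arg_area, sizeof(value));
        list.overflow_arg_area = static_cast<char*>(list.overflow_arg_area) + 8;
    }
    return value;
}
```

In the Test example, fmt occupies the first general purpose slot, so va_start would leave gp_offset at 8 and the first call to va_arg would find 3226 at offset 8 of the save area.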

This is unfortunately much more involved than the Microsoft version, since we now need to understand the intimate details of argument passing, especially when you consider how System V passes structs in registers. For example:

struct Int2
{
    int x;
    int y;
};

void DoTestInt()
{
    Int2 l;
    l.x = 100;
    l.y = 200;
    Test("%d", l);
}

struct Long2
{
    long long x;
    long long y;
};

void DoTestLong()
{
    Long2 l;
    l.x = 100;
    l.y = 200;
    Test("%d", l);
}

DoTestInt will pass all of the Int2 in rsi (movabs rsi, 858993459300), and DoTestLong will pass Long2.x in esi and Long2.y in edx:

DoTestInt():                          # @DoTestInt()
        movabs  rsi, 858993459300 # (100 | (200 << 32))
        mov     edi, offset .L.str
        xor     eax, eax
        jmp     Test(char const*, ...)                     # TAILCALL
DoTestLong():                        # @DoTestLong()
        mov     edi, offset .L.str
        mov     esi, 100
        mov     edx, 200
        xor     eax, eax
        jmp     Test(char const*, ...)                     # TAILCALL
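
The constant in the DoTestInt disassembly is just the two fields of Int2 packed into one 8-byte register, with x in the low half and y in the high half. A small sketch that reproduces the arithmetic (the function name `PackInt2` is hypothetical, and the layout assumes a 4-byte int on a little-endian target such as x86-64):

```cpp
#include <cstdint>

struct Int2
{
    int x;
    int y;
};

static_assert(sizeof(Int2) == 8, "sketch assumes Int2 fills exactly one 8-byte register");

// Hypothetical helper: computes the 64-bit value that ends up in rsi when an
// Int2 is passed by value. x lands in bits 0-31 and y in bits 32-63.
std::uint64_t PackInt2(Int2 value)
{
    std::uint64_t low = static_cast<std::uint32_t>(value.x);
    std::uint64_t high = static_cast<std::uint32_t>(value.y);
    return low | (high << 32);
}
```

For { 100, 200 } this gives 100 | (200 << 32) = 858993459300, matching the movabs above. Long2 is 16 bytes, so it is split into two 8-byte pieces that each take their own register.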

PathogenDavid avatar Sep 18 '21 09:09 PathogenDavid

Random thought I had the other night while failing to sleep: Is there anything in the SysV ABI that says we have to use the register space? Could we construct a va_list that only uses the overflow area and end up with an implementation basically the same as with Windows?

It seems to me like this should be possible purely because va_list can be passed around, so all implementations have to be able to handle the possibility that their parent function already used up spots in the register space.

PathogenDavid avatar Sep 24 '21 08:09 PathogenDavid