Support thunking of multi-instance Vulkan applications

Open neobrain opened this issue 3 years ago • 2 comments

FEX's libvulkan thunks don't work if multiple VkInstances are used that return different function pointers to vkGetDeviceProcAddr. This is because to implement vkGetDeviceProcAddr in the first place, FEX needs to pre-query a VkInstance-specific function pointer during initialization. The function pointer itself is called without any direct reference to the original VkInstance, so FEX acts as if only the first VkInstance created by the application was used.
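To make the failure mode concrete, here is a toy model of it. This is not FEX code: `ToyInstance`, `ToyThunkLayer`, and the two "driver" functions are hypothetical stand-ins, and the only assumption carried over from the description above is that the thunk layer keeps a single pre-queried pointer while the guest-visible entrypoint receives no instance.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for the vkGetDeviceProcAddr pointer type (hypothetical, not Vulkan).
using ToyDeviceProcAddrFn = std::string (*)(const char* name);

// Toy stand-in for a VkInstance that hands out its own lookup function.
struct ToyInstance {
    ToyDeviceProcAddrFn get_device_proc_addr;
};

// Two "drivers" that resolve device functions differently.
inline std::string DriverA_GetDeviceProcAddr(const char* name) { return std::string("A:") + name; }
inline std::string DriverB_GetDeviceProcAddr(const char* name) { return std::string("B:") + name; }

// Models a thunk layer that pre-queries the pointer once and reuses it.
struct ToyThunkLayer {
    ToyDeviceProcAddrFn cached = nullptr;

    // Called whenever the guest asks an instance for its lookup function.
    void OnInstanceQueried(const ToyInstance& instance) {
        if (!cached) {
            cached = instance.get_device_proc_addr; // only the first instance is recorded
        }
    }

    // The guest-visible entrypoint has no instance parameter,
    // so it can only dispatch through the cached pointer.
    std::string GuestGetDeviceProcAddr(const char* name) const {
        assert(cached && "no instance has been queried yet");
        return cached(name);
    }
};
```

Querying instance A and then instance B leaves `cached` pointing at driver A, so any later lookup made on behalf of B is still answered by A.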

In the broader picture, the problem is a restriction in the thunk generator: functions that must be guest-callable through host function pointers cannot use a custom guest entrypoint.

Suggested solution: Extending the framework for guest-callable host function pointers

Currently, guest-callable host function pointers are implemented by linking the host function pointer to a generic template function that forwards the packed argument list plus the original host function pointer to a hostcall_ thunk. In this case, the function pointer is vkGetDeviceProcAddr as returned from vkGetInstanceProcAddr. What's needed is a way of customizing the target function: Instead of calling the thunk that initiates a Guest->Host transition, a custom function should be called.

Implementation sketch of the guest-side thunk library:

template<auto Function, typename Result, typename... Args>
inline Result ReadHiddenArgumentAndCall(Args... args) {
    uintptr_t hidden_arg;
    asm("mov %%rax, %0" : "=r" (hidden_arg));
    return Function(args..., hidden_arg);
}

// host_ptr carries the hidden argument as an integer, matching both
// ReadHiddenArgumentAndCall and the PackedArguments field type below.
PFN_vkVoidFunction vkGetDeviceProcAddr_indirect(VkDevice a_0, const char* a_1, uintptr_t host_ptr) {
    PackedArguments<PFN_vkVoidFunction, VkDevice, const char*, uintptr_t> args = { a_0, a_1, host_ptr };
    fexthunks_libvulkan_hostcall_vkGetDeviceProcAddr(&args);
    LinkGuestAddressToHostFunction(args.rv, PtrsToLookUp.at(a_1));
    return args.rv;
}

#if 0
// For illustration only: Public entrypoint of this function (if it were needed)
PFN_vkVoidFunction vkGetDeviceProcAddr(VkDevice a_0, const char* a_1) {
    return vkGetDeviceProcAddr_indirect(a_0, a_1, reinterpret_cast<uintptr_t>(fexldr_ptr_vkGetDeviceProcAddr));
}
#endif

PFN_vkVoidFunction vkGetInstanceProcAddr(VkInstance a_0, const char* a_1){
    auto Ret = fexfn_pack_vkGetInstanceProcAddr(a_0, a_1);
    if (a_1 != std::string_view { "vkGetDeviceProcAddr" }) {
        LinkGuestAddressToHostFunction(Ret, PtrsToLookUp.at(a_1));
        return Ret;
    } else {
        LinkGuestAddressToHostFunction(Ret, ReadHiddenArgumentAndCall<vkGetDeviceProcAddr_indirect>);
        return Ret;
    }
}
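One detail the sketch leaves open: `ReadHiddenArgumentAndCall` is instantiated with only the target function, so `Result` and `Args` cannot be deduced as written. One possible way around that, shown here as a standalone sketch, is to pass the guest-visible signature explicitly and pattern-match on it. Everything here is hypothetical: `HiddenArgumentTrampoline`, `ExampleTarget`, and in particular `g_hidden_register`, which is a plain global standing in for the register read so that the example stays portable.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the register that carries the hidden argument (rax in the sketch).
// A real implementation would read the register instead of a global.
inline std::uintptr_t g_hidden_register = 0;

inline std::uintptr_t ReadHiddenArgument() { return g_hidden_register; }

// Primary template; only the function-signature specialization is defined.
template<auto Function, typename GuestSignature>
struct HiddenArgumentTrampoline;

// GuestSignature is the signature the guest sees, i.e. without the hidden argument.
template<auto Function, typename Result, typename... Args>
struct HiddenArgumentTrampoline<Function, Result(Args...)> {
    // Has exactly the guest-visible signature, so its address can be handed out
    // as an ordinary function pointer.
    static Result Call(Args... args) {
        return Function(args..., ReadHiddenArgument());
    }
};

// Example target taking the hidden argument as a trailing parameter.
inline int ExampleTarget(int a, int b, std::uintptr_t hidden) {
    assert(hidden != 0 && "hidden argument was not set");
    return a + b + static_cast<int>(hidden);
}
```

With this, `&HiddenArgumentTrampoline<ExampleTarget, int(int, int)>::Call` has type `int (*)(int, int)` and forwards to `ExampleTarget` with whatever value the stand-in register holds at call time.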

neobrain • Jun 13 '22 16:06

> In the broader picture, the problem is a restriction in the thunk generator: functions that must be guest-callable through host function pointers cannot use a custom guest entrypoint.

After playing with this a bit using the copyable functions and looking at the codegen, I found that repacking the arguments can be /very/ expensive.

I'm not sure yet how it would look in practice, but I think intercepting after the arguments have been packed would make things much more efficient, as we could then just pass around a pointer to them.

Then things would look like this:

guest call(....) -> packer(....) [-> guest handler (packed_args, optional host_ptr) ] -> thunk [-> host handler (packed_args, optional host_ptr)]  -> unpacker (packed_args, optional host_ptr) -> host call(...)

The guest packer could add the host_ptr as a second argument to the guest handler, and we could amend the thunk op there to take two parameters, so we can pack a variable amount of data independently of the original argument list.

This would constrain us in other ways, but the host_ptr case would fit right in instead of being an exception.
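A rough model of that pipeline, to show that the argument list is packed once and only a pointer (plus the optional host_ptr) travels between stages. This is a plain C++ simulation with no real guest/host transition; `PackedAddArgs`, `HostAdd`, `HostMul`, and all the stage functions are hypothetical names, and using `0` to mean "no host_ptr" is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed argument block for a function int(int, int).
struct PackedAddArgs {
    int a;
    int b;
    int rv;
};

// Toy host functions that could be the final call target.
inline int HostAdd(int a, int b) { return a + b; }
inline int HostMul(int a, int b) { return a * b; }

// Unpacker: the only stage that expands the argument list again.
// A host_ptr of 0 selects the default target (an assumption of this sketch).
inline void Unpack_Add(PackedAddArgs* args, std::uintptr_t host_ptr) {
    auto fn = host_ptr ? reinterpret_cast<int (*)(int, int)>(host_ptr) : &HostAdd;
    args->rv = fn(args->a, args->b);
}

// Optional host handler: sees the packed arguments and host_ptr, no repacking.
inline void HostHandler_Add(PackedAddArgs* args, std::uintptr_t host_ptr) {
    Unpack_Add(args, host_ptr);
}

// Thunk: stands in for the guest->host transition, taking two parameters.
inline void Thunk_Add(void* packed, std::uintptr_t host_ptr) {
    HostHandler_Add(static_cast<PackedAddArgs*>(packed), host_ptr);
}

// Optional guest handler: may inspect or adjust the packed arguments before the thunk.
inline void GuestHandler_Add(PackedAddArgs* args, std::uintptr_t host_ptr) {
    Thunk_Add(args, host_ptr);
}

// Packer: the guest entrypoint; packs once and passes the pointer along.
inline int GuestAdd(int a, int b, std::uintptr_t host_ptr = 0) {
    PackedAddArgs args{a, b, 0};
    GuestHandler_Add(&args, host_ptr);
    return args.rv;
}
```

Calling `GuestAdd(2, 3)` goes through every stage to `HostAdd`, while passing the address of `HostMul` as host_ptr reroutes the final call without touching the packed argument layout.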

skmp • Jun 15 '22 00:06

Copyable functions PoC in our X11 thunks

https://github.com/FEX-Emu/FEX/blob/c5f1929edd4804d0894b0dc736c808f50471c5ed/ThunkLibs/libX11/libX11_Host.cpp#L61

Boilerplate aside,

DECL_COPYABLE_TRAMPLOLINE(XUnregisterIMInstantiateCallbackCBFN)

Bool fexfn_impl_libX11_XUnregisterIMInstantiateCallback_internal(
    Display* a0, struct _XrmHashBucketRec* a1,
    char* a2, char* a3, XUnregisterIMInstantiateCallbackCBFN* a4, XPointer a5) {
    auto fn = binder::make_instance(a4, &CallbackMarshaler<XUnregisterIMInstantiateCallbackCBFN>::marshal<offsetof(CallbackUnpacks, libX11_XUnregisterIMInstantiateCallbackCB)>);
    return fexldr_ptr_libX11_XUnregisterIMInstantiateCallback(a0, a1, a2, a3, fn, a5);
}

skmp • Jun 24 '22 16:06