
[Feature Request] [stdlib] A Foreign Function Interfacing exclusive package (`ffi`)

Open martinvuyk opened this issue 5 months ago • 2 comments

Review Mojo's priorities

What is your request?

Reorganize and polish sys/ffi.mojo and move it into its own package, with its own set of ease-of-use capabilities for using Mojo as the glue between languages.

For every major language whose support is judged worth adding, provide basic type conversions and similar helpers. In the case of C, also add bindings for the basic POSIX functions (libc).

An excerpt from an example implementation:

# Adapted from https://github.com/crisadamo/mojo-Libc which doesn't currently
# (2024-07-22) have a licence, so I'll assume MIT licence.
# Huge thanks for the work done.

# /ffi/c/types.mojo

struct C:
    """C types. This assumes that the platform is 32 or 64 bit, and char is
    always 8 bit (POSIX standard).
    """

    alias char = Int8
    """Type: `char`. The signedness of `char` is platform specific. Most
    systems, including x86 GNU/Linux and Windows, use `signed char`, but those
    based on PowerPC and ARM processors typically use `unsigned char`."""
    alias s_char = Int8
    """Type: `signed char`."""
    alias u_char = UInt8
    """Type: `unsigned char`."""
    alias short = Int16
    """Type: `short`."""
    alias u_short = UInt16
    """Type: `unsigned short`."""
    alias int = Int32
    """Type: `int`."""
    alias u_int = UInt32
    """Type: `unsigned int`."""
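    alias long = Int64
    """Type: `long`. Note: 64-bit on LP64 platforms (Linux, macOS), but 32-bit
    on 64-bit Windows (LLP64) and on 32-bit targets."""
    alias u_long = UInt64
    """Type: `unsigned long`. Same width caveat as `long`."""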
    alias long_long = Int64
    """Type: `long long`."""
    alias u_long_long = UInt64
    """Type: `unsigned long long`."""
    alias float = Float32
    """Type: `float`."""
    alias double = Float64
    """Type: `double`."""
    alias void = Int8
    """Type: `void`."""
    alias ptr_addr = Int
    """Type: A Pointer Address."""

alias NULL = UnsafePointer[C.void]()
"""Null pointer."""

# ===----------------------------------------------------------------------=== #
# Utils
# ===----------------------------------------------------------------------=== #
fn char_ptr_to_string(s: UnsafePointer[C.char]) -> String:
    ...
fn strlen(s: UnsafePointer[C.char]) -> C.u_int:
    ...
...

# /ffi/c/networking.mojo

fn socket(domain: C.int, type: C.int, protocol: C.int) -> C.int:
    """Libc POSIX `socket` function.

    Args:
        domain: Address Family, see the AF_ aliases.
        type: Socket Type, see the SOCK_ aliases.
        protocol: Protocol, see the IPPROTO_ aliases.

    Returns:
        A file descriptor for the socket.

    Notes:
        [Reference](https://man7.org/linux/man-pages/man3/socket.3p.html).
        Fn signature: `int socket(int domain, int type, int protocol)`.
    """
    return external_call["socket", C.int, C.int, C.int, C.int](
        domain, type, protocol
    )

# /ffi/c/logging.mojo
# TODO
fn errno() -> Int:
    """Get the `errno` global variable.

    Returns:
        The current value of the variable.
    """
    return 0


fn strerror(errnum: Int) -> UnsafePointer[C.char]:
    """Libc POSIX `strerror` function.

    Args:
        errnum: The number of the error.

    Returns:
        A Pointer to the error message.

    Notes:
        [Reference](https://man7.org/linux/man-pages/man3/strerror.3.html).
        Fn signature: `char *strerror(int errnum)`.
    """
    return external_call["strerror", UnsafePointer[C.char], Int](errnum)
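
To show how bindings like these might be used from application code, here is a minimal sketch. It assumes the same Mojo version and `external_call` API as the excerpt above, and it is not part of the proposed package. The constants `AF_INET` and `SOCK_STREAM` use the Linux/macOS values of the corresponding C macros, `get_errno` is a hypothetical helper that relies on the Linux libc symbol `__errno_location` (macOS exposes `__error` instead), and the `close` call is an extra libc binding that the excerpt does not define.

```mojo
from sys.ffi import external_call
from memory import UnsafePointer

# Assumed constants: the Linux (and macOS) values of the C macros.
alias AF_INET: Int32 = 2
alias SOCK_STREAM: Int32 = 1


fn get_errno() -> Int32:
    """Hypothetical helper: reads `errno` through `__errno_location`.
    This symbol is Linux libc specific and is an assumption of this sketch."""
    return external_call["__errno_location", UnsafePointer[Int32]]()[]


fn main():
    # Same call shape as the `socket` wrapper in the excerpt.
    var fd = external_call["socket", Int32, Int32, Int32, Int32](
        AF_INET, SOCK_STREAM, Int32(0)
    )
    if fd < 0:
        # On failure `socket` returns -1 and sets `errno`.
        print("socket() failed, errno =", get_errno())
        return
    print("socket() returned fd", fd)
    # Release the descriptor with libc `close` (int close(int fd)).
    _ = external_call["close", Int32, Int32](fd)
```

With a dedicated `ffi` package, the raw `external_call` lines and the hand-written constants would presumably be replaced by imports of already-typed wrappers and aliases, which is the ease of use this request is after.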

What is your motivation for this change?

Mojo might be the best language for heterogeneous compute, but most infrastructure projects are written in C/C++ and the JVM family of languages. If we can offer a set of tools for easy interop, language adoption will be faster. We only need to look at Zig, where its cross-compilation capabilities are arguably one of its biggest strengths and a common entry point for its adoption. It is perhaps needless to point out, but the flexibility to act as the code logic layer is also the biggest use case for Python.

Any other details?

In my particular case, I'd love to see Mojo kernels usable in databases, and to finally have a plug-and-play way to use GPUs for analytical workloads with engines/query planners like Spark (written in Scala), without falling into the fanatical mentality of rewriting everything in X language because of Y reasons...

Disclaimer: ABI compatibility/stability guarantees, dynamic and static linking, and many other aspects of language interop are way beyond my area of knowledge. That is why I'm posting this as a Feature Request and not a proposal: I'm not even sure what exactly the end result would look like, as I'm not someone who would use this daily. I'm currently trying to implement a socket package, which is how I ran into this need.

martinvuyk, Sep 21 '24 19:09