language proposal: rework the implicit file-scope struct declaration

Open expikr opened this issue 1 year ago • 7 comments

Motivation

Files are implicitly structs, which lets you do things like:

//! Allocator.zig
ptr: *anyopaque,
vtable: *const VTable,
pub const VTable = ...

//! main.zig
const Allocator = @import("Allocator.zig");

const allo = Allocator {
    .ptr = &myAllocFn,
    .vtable = ...
};
...

which implicitly expands into

//! main.zig
const Allocator = struct {
    ptr: *anyopaque,
    vtable: *const VTable,
    pub const VTable = ...
};

const allo = Allocator {
    .ptr = &myAllocFn,
    .vtable = ...
};
...

This convenience technique is widely used in the stdlib for concrete types.
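
As a minimal, self-contained sketch of the status-quo behavior described above (the file name Point.zig and its contents are hypothetical, not taken from the stdlib), a file with fields at its top level can be instantiated through @This() like any other struct:

```zig
//! Point.zig (hypothetical file, for illustration only)
const std = @import("std");

// The file itself is the struct type; @This() names it from the inside.
const Point = @This();

// Fields declared directly at file scope.
x: i32 = 0,
y: i32 = 0,

// A method on the file-level struct.
pub fn sum(self: Point) i32 {
    return self.x + self.y;
}

test "file-scope fields make the file a struct" {
    const p = Point{ .x = 1, .y = 2 };
    try std.testing.expectEqual(@as(i32, 3), p.sum());
}
```

Running `zig test Point.zig` should exercise the test; another file would obtain the same type with @import("Point.zig").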

However, because @import essentially wraps an implicit struct{ ... } around the file content, it is not possible to declare packed structs or enums this way.

For example, imagine a cumbersome file that declares a u8 enum of keycodes. This is how it must be done at the moment:

//! keycodes.zig
pub const Keycode = enum(u8) {
    ...
    Key8 = '8',
    Key9 = '9',
    A = 'A',
    B = 'B',
    ...
    Num1 = 0x61,
    Num2 = 0x62,
    ...
    unmapped = 0xFF,
};
// nothing else in file-scope

Using it requires const Keycode = @import("keycodes.zig").Keycode;
rather than just const Keycode = @import("keycodes.zig");
even though the file defines nothing else.
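
The extra hop can be reproduced in a single file. This is only a sketch: the inline `keycodes` struct below is a stand-in for what @import("keycodes.zig") evaluates to today, the key list is shortened, and @intFromEnum assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Stand-in for the value of @import("keycodes.zig") under status quo:
// a struct used purely as a namespace around the one declaration.
const keycodes = struct {
    pub const Keycode = enum(u8) {
        Key8 = '8',
        Key9 = '9',
        A = 'A',
        B = 'B',
        unmapped = 0xFF,
    };
};

// The `.Keycode` access is the part the post would like to drop.
const Keycode = keycodes.Keycode;

test "the enum is reached through the namespace" {
    try std.testing.expectEqual(@as(u8, 'A'), @intFromEnum(Keycode.A));
    try std.testing.expectEqual(@as(u8, 0xFF), @intFromEnum(Keycode.unmapped));
}
```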

Proposal

What I would like to propose is to
(1) make @import scopes implicitly an expression rather than being the body of an implied struct{ ... }, and
(2) make status-quo bare declaration files implicitly wrapped inside a non-instantiable namespace construct.

To illustrate, status quo namespace files will still look exactly the same:

//! std.zig
pub const ArrayHashMap = array_hash_map.ArrayHashMap;
pub const ArrayHashMapUnmanaged = array_hash_map.ArrayHashMapUnmanaged;
...
//! main.zig
const std = @import("std");

pub fn main() void {
    std.debug.print("hello world", .{});
}

which implicitly expands into

//! main.zig
const std = opaque {
    pub const ArrayHashMap = array_hash_map.ArrayHashMap;
    pub const ArrayHashMapUnmanaged = array_hash_map.ArrayHashMapUnmanaged;
    ...
};

pub fn main() void {
    std.debug.print("hello world", .{});
}

But file-scope Types must now be prefixed with a type definition keyword, i.e.

//! Allocator.zig
struct {
    // helper imports and decls
    ...

    // The type erased pointer to the allocator implementation
    ptr: *anyopaque,
    vtable: *const VTable,
    pub const VTable = ...
}
// no semicolon, since this is an expression
// nothing else can be defined in file-scope

While this costs the code writer one extra level of indentation, in return the code reader knows up front to expect field definitions somewhere in the middle of the file, instead of relying only on filename capitalization, which may or may not be lying.

It also lets us more cleanly quarantine unwieldy extern struct definitions into individual files:

//! NOTIFYICONDATAA.zig
extern struct {
    // helper imports and decls
    const std = @import("std");
    const win = std.os.windows;

    // the actual struct definition
    cbSize: win.DWORD,
    hWnd: win.HWND,
    uID: win.UINT,
    uFlags: win.UINT,
    uCallbackMessage: win.UINT,
    hIcon: win.HICON,
    szTip: if(isLaterThanWin2k()) [127:0]win.CHAR else [63:0]win.CHAR,
    dwState: win.DWORD,
    dwStateMask: win.DWORD,
    szInfo: [255:0]win.CHAR,
    DUMMYUNIONNAME: extern union {
        uTimeout: win.UINT,
        uVersion: win.UINT,
    },
    szInfoTitle: [63:0]win.CHAR,
    dwInfoFlags: win.DWORD,
    guiItem: win.GUID,
    hBalloonIcon: win.HICON,

    // a bunch of ad-hoc helper functions for parsing/constructing this godforsaken abomination
    ...
}
// no semicolon, since this is an expression
// nothing else can be defined in file-scope

Additionally, because the file scope is now an expression in general, one can achieve comptime conditional declarations by making the file scope an execution block, which supersedes certain use cases of usingnamespace:

//! MyThing.zig
comptime do: {
    const MyThing_variant_A = struct {
        data1: u32, data2: u32,
        const operate = @import("variant_A").operate;
    };
    const MyThing_variant_B = struct {
        data1: u32, data2: u32, data3: u32,
        const operate = @import("variant_B").operate;
    };
    break :do if(compile_variant_A()) MyThing_variant_A else MyThing_variant_B;
}

Or directly return generic type functions:

//! MyGeneric.zig
(opaque {
    fn MyGeneric(comptime T: type) type {
        return struct {
            data1: T, data2: T, data3: T,
            pub fn ...
        };
    }
}).MyGeneric
//! main.zig
const Generic_f32 = @import("MyGeneric.zig")(f32);
const Generic_f64 = @import("MyGeneric.zig")(f64);
pub fn main() void {
    ...
}
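
For comparison, here is a rough sketch of how both patterns above are written with named declarations in Zig today. The `use_variant_a` constant is a hypothetical stand-in for compile_variant_A(), and the variant structs are trimmed to their fields; the importing file would then need the extra `.MyThing` or `.MyGeneric` access.

```zig
const std = @import("std");

// Hypothetical comptime switch standing in for compile_variant_A().
const use_variant_a = true;

const VariantA = struct { data1: u32 = 0, data2: u32 = 0 };
const VariantB = struct { data1: u32 = 0, data2: u32 = 0, data3: u32 = 0 };

// Conditional declaration: a named constant selects the type at comptime.
pub const MyThing = if (use_variant_a) VariantA else VariantB;

// Generic: a named function that returns a type.
pub fn MyGeneric(comptime T: type) type {
    return struct { data1: T, data2: T, data3: T };
}

test "status-quo equivalents" {
    try std.testing.expect(MyThing == VariantA);
    const g = MyGeneric(f32){ .data1 = 1, .data2 = 2, .data3 = 3 };
    try std.testing.expectEqual(@as(f32, 3), g.data3);
}
```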

Furthermore, it also allows us to directly import ZON files as valid zig code, since ZON itself is an expression as well.

This is somewhat comparable to Lua's require syntax being implicitly function returns, which lets you procedurally construct what you want to return:

-- Thing.lua
local ret = {}
local function doMyThing_general(arg)
    ...
end
local function doMyThing_win32(arg) 
    ...
end
local function doMyThing_linux(arg)
    ...
end
do
    ret.generalThing = doMyThing_general
    if isWin32() then
        ret.specificThing = doMyThing_win32
        ret.os = 'win32'
    elseif isLinux() then
        ret.specificThing = doMyThing_linux
        ret.os = 'linux'
    else
        ret.specificThing = ret.generalThing
    end
end

return ret
-- main.lua
local Thing = require('Thing')
local current_os = Thing.os
local specificResult_of_3 = Thing.specificThing(3)
...

expikr avatar Mar 27 '24 09:03 expikr

Related https://github.com/ziglang/zig/issues/7881#issuecomment-767069673

Vexu avatar Mar 27 '24 10:03 Vexu

It just occurred to me that yet another advantage of (2) is the use case of defining foreign opaque types in a standalone file right alongside their associated extern functions, e.g.

//! HBITMAP.zig
const HBITMAP = *@This();
const win = @import("std").os.windows;

extern "gdi32" fn SelectObject(HDC, HBITMAP) callconv(WINAPI) HBITMAP;
pub fn select(h: HBITMAP, hdc: HDC) HBITMAP {
    return @ptrCast(SelectObject(hdc, h));
}

pub extern "gdi32" fn CreateBitmap(c_int, c_int, UINT, UINT, *const anyopaque) callconv(WINAPI) HBITMAP;
pub extern "gdi32" fn CreateBitmapIndirect(*const BITMAP) callconv(WINAPI) HBITMAP;
pub extern "gdi32" fn CreateCompatibleBitmap(HDC, c_int, c_int) callconv(WINAPI) HBITMAP;
...

expikr avatar Mar 28 '24 03:03 expikr

Related #7881 (comment)

I was thinking about another option:

//! SomeStruct.zig
extern struct =>

field: u32,
.......

The => at this position means "everything till the end of the file is implicitly enclosed in { }s".

This can be generalized further to reduce indentation in files containing a single generic (compared to the proposal here to use comptime blocks, which adds 2 extra levels of indentation to the 2 already existing). E.g.

//! SomeGeneric.zig
pub fn SomeGeneric(comptime T: type) type =>
const size = @sizeOf(T);
return struct =>

data: [size]u8,
pub fn method(self: @This()) void {
   ......
}  

although there are obvious readability concerns. Besides, in the latter case one would really wish one also didn't have to explicitly write the generic name and use the filename instead, pretty much like in the first example.

Edit: the latter problem probably could be addressed as

//! SomeGeneric.zig
fn (comptime T: type) type =>
const size = @sizeOf(T);
return struct =>

data: [size]u8,
pub fn method(self: @This()) void {
   ......
}  

however that implicitly suggests the introduction of a second syntax for function definition in Zig:

const SomeGeneric = fn(comptime T: type) type { ....... };

vadim-za avatar Mar 29 '24 09:03 vadim-za

I think that if you really wanted to be able to declare enums, or anything else, at file scope, something special like the following could perhaps do the trick:

@This = enum(u8) {
    A = 'a',
    ...
};

// or
// @This = error { }
// etc

The idea is that the following is normally implied:

@This = struct {
// <file contents>
};

This would be considered very special and unique syntax, however, and I think the following rules would apply:

  • When this syntax is present, it must be the only expression at the file scope.
  • The syntax may only be present at the file scope.

The latter is crucial, as normally @This() is an expression which returns the enclosing struct. When used within an inner scope, the behaviour of @This() should still be as expected. I also suggest that the following then be forbidden (or zig-fmt'ed away):

@This = struct { ... };

Because normally it would be implied anyways.

Additionally, this could allow for some silly trickery like:

@This = @Type({ ... });

But I don't think it's a big deal. I still don't like that this is new syntax and that it implies something about the @This() builtin. But it also kind of makes sense, considering it is also the return value of @This(), which has always been kind of recursive in nature.

paperdev-code avatar Mar 29 '24 18:03 paperdev-code

Related #7881 (comment)

Realized I suggested nearly the exact same thing. Edit: upon further inspection, I definitely expanded upon it.

paperdev-code avatar Mar 29 '24 18:03 paperdev-code

I like this proposal. I'd like to see files truly become abstractions for all Zig containers, not just structs. I also prefer the proposed brace wrapping when a file contains fields; I don't think a new syntax is warranted just to keep the indent level at 0. As someone who browses the standard library source code more than the autodocs, explicit file typing would help me differentiate between module entrypoints and instantiable types that are just broken out as files.

eastmancr avatar May 15 '24 09:05 eastmancr

This proposal raises a minor grammar issue:

comptime {}

This form is a valid expression as well as a valid declaration list, so it is ambiguous how to treat a file containing this text. The same applies if you add statements into the {}.

FWIW, I dislike this proposal largely because it requires that extra level of indentation. When I write Zig code nowadays, at least 50% of my files are implicitly structs, so having to indent all of those would be pretty awkward. It also loses the guarantee we currently have that we know there's nothing "else" in the file other than the struct declaration; for instance:

struct {
    ...
} == void

Yeah, it's quite a silly example and you shouldn't do this, but it being possible at all feels like a regression in readability compared to today's file-level structs.

If you're going to require that level of indentation anyway, then I struggle to see the actual benefit of this proposal; having to write const Foo = @import("foo.zig").Foo; for files representing types other than auto-layout structs isn't actually a big deal.

mlugg avatar Dec 28 '24 23:12 mlugg