
Proposal: make environments a non-global resource

Open mlugg opened this issue 1 month ago • 6 comments

Background

Most operating systems have the concept of "the environment": a set of key-value string mappings ("environment variables") which can be modified by any process for itself and are inherited by child processes. In C, the environment is typically represented through a global char **environ, a more enlightening representation of which is the Zig type [*:null]?[*:0]u8. Each pointer in environ refers to a zero-terminated string of the form "key=value". C also provides the function getenv to query a name/key, and POSIX allows mutating the environment with setenv, unsetenv, putenv, and clearenv.
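
To make the layout concrete, here is a minimal sketch of walking such a block the way getenv conceptually does. The helper name `lookup` is hypothetical and is not part of the standard library or of this proposal; it only illustrates the "key=value" string format and the null-terminated pointer array.

```zig
const std = @import("std");

/// Hypothetical helper: linear scan over a C-style environment block.
/// Returns the value for `key`, or `null` if no entry has that key.
fn lookup(envp: [*:null]const ?[*:0]const u8, key: []const u8) ?[:0]const u8 {
    var i: usize = 0;
    // The block ends at the first `null` pointer.
    while (envp[i]) |entry_ptr| : (i += 1) {
        const entry = std.mem.span(entry_ptr); // "key=value"
        if (entry.len > key.len and entry[key.len] == '=' and
            std.mem.eql(u8, entry[0..key.len], key))
        {
            return entry[key.len + 1 ..]; // everything after '='
        }
    }
    return null;
}

test "lookup finds a value in an environ-style block" {
    const envp = [_:null]?[*:0]const u8{ "HOME=/home/alice", "TERM=xterm" };
    try std.testing.expectEqualStrings("xterm", lookup(&envp, "TERM").?);
    try std.testing.expect(lookup(&envp, "SHELL") == null);
}
```

Every lookup is a linear scan over the whole block, which is one reason large environments get slow in this representation.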

The environment being global state, while a very common abstraction, is actually quite problematic. In C, it is essentially impossible to ever call environment-modifying functions like setenv in a threaded context, because environ can be (and often is) directly accessed without any kind of lock. Additionally, the Zig standard library currently has a major footgun (tracked by #4524): std.os.environ is meant to be our parallel to C's environ, but it is impossible to populate it in some cases (such as a library which does not link libc). The solution given in #4524 solves some use cases, but in others will cause awkward link errors lest the user manually export an environ from somewhere; this is what we inflict upon users by treating the environment as global state.

Here's the interesting thing. While the environment is often considered persistent, global state, this is not generally the case at a low level, and is instead a lie told by a system's libc. For instance, on Linux, a process is passed a full set of initial environment variables when it spawns, and must pass a full set of environment variables to the kernel when spawning a child process (e.g. to the execve syscall); but during process execution, the kernel has no concept of the process' "current environment". Really, environ (and getenv, setenv, etc) is purely an abstraction provided by libc which amounts to a global key-value map of strings. Interestingly, it's usually not even an efficient structure like a hash map, because the POSIX APIs drive libc implementors into a design corner, resulting in large environments being very slow to access!

By now, you might be able to guess where I'm going with this...

Proposal

Modify the Zig standard library such that the environment is not considered global state. Instead, it is simply a key-value string map, which must be passed to any functions which need it. The initial environment on process startup is passed to root.main (related: #4524, #24510). Spawning a child process will require passing an environment.
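
As a rough illustration of what "passed to any functions which need it" means for calling code, here is a sketch under stated assumptions: `Env` is a hypothetical stand-in (a plain `std.StringHashMap`) for the proposed `std.process.Environment`, and `pickEditor` is an invented example function. Neither reflects the actual proposed API.

```zig
const std = @import("std");

/// Hypothetical stand-in for the proposed `std.process.Environment`.
const Env = std.StringHashMap([]const u8);

/// Chooses an editor command from the environment it is handed,
/// instead of consulting a process-wide global.
fn pickEditor(env: *const Env) []const u8 {
    if (env.get("VISUAL")) |v| return v;
    if (env.get("EDITOR")) |v| return v;
    return "vi"; // fallback when neither variable is set
}

test "the environment is an explicit parameter" {
    var env = Env.init(std.testing.allocator);
    defer env.deinit();

    try std.testing.expectEqualStrings("vi", pickEditor(&env));
    try env.put("EDITOR", "nano");
    try std.testing.expectEqualStrings("nano", pickEditor(&env));
}
```

Because the environment is just a value, a test or a caller can hand the function any environment it likes without touching process-wide state.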

The type of an environment would be std.process.Environment (similar in principle to today's std.process.EnvMap). This type can be implemented using a hash map, but with the key/value storage matching the format of C environ, because this is what OS APIs (such as Linux's execve syscall) are typically designed to consume. This design means that when spawning a child process, there is no overhead associated with converting the Environment to the form consumed by the operating system.

This proposal sidesteps the "undefined std.os.environ" footgun we have today by accepting that the environment is not global state, and so is not accessible to arbitrary code. In many cases, it is; an executable receives an environment in main, and libraries linking libc can access std.c.environ. But pure-Zig libraries and functions do need to source an environment from somewhere if they need to query environment variables or spawn child processes, and to hand-wave away this requirement with a global variable is, at best, a "local maximum" design.

I have written an (untested) prototype implementation of std.process.Environment which may help to clarify how the design avoids overhead on process spawn if it is unclear.

std/process/Environment.zig

This implementation uses IndexHashMap (#23872), but could also be written with the existing "hash map adapter" API---IndexHashMap just makes it simpler and cleaner.

//! An `Environment` is a hash map of environment variables. The key/value pairs are stored as
//! "environment strings", that is, null-terminated strings of the form "key=value". This design
//! allows the map to be trivially converted to a `[*:null]const ?[*:0]const u8` suitable for
//! passing to system APIs which expect an "envp" (see `slice`).
//!
//! `Environment` supports efficient lookup (`get`) and modification (`set`, `remove`, `clear`).
//!
//! The environment strings are stored in an arena, meaning that if an `Environment` is modified,
//! its memory usage will gradually increase. In the rare case that this is a problem, such as
//! long-term storage of an evolving `Environment`, consider calling `compact` as necessary to
//! minimize memory usage.

string_arena: std.heap.ArenaAllocator.State,
/// Length is never 0. The last item is always `null`. All other items are non-`null` pointers
/// to null-terminated strings in `string_arena`.
entries: std.ArrayList(?[*:0]u8),
map: std.IndexHashMap,

/// Returns an `Environment` containing no variables.
pub fn initEmpty(gpa: Allocator) Allocator.Error!Environment {
    var entries: std.ArrayList(?[*:0]u8) = try .initCapacity(gpa, 1);
    defer entries.deinit(gpa);
    entries.appendAssumeCapacity(null); // sentinel
    return .{
        .string_arena = .{},
        .entries = entries.move(),
        .map = .empty,
    };
}

/// Returns an `Environment` containing the variables in `c_env`.
///
/// Asserts that every string in `c_env` has the form "key=value".
/// Assumes that no two keys in `c_env` are equal.
pub fn initCEnviron(gpa: Allocator, c_env: [*:null]const ?[*:0]const u8) Allocator.Error!Environment {
    const c_env_slice = std.mem.span(c_env);

    var string_arena: std.heap.ArenaAllocator = .init(gpa);
    errdefer string_arena.deinit();

    var entries: std.ArrayList(?[*:0]u8) = .empty;
    defer entries.deinit(gpa);

    var map: std.IndexHashMap = .empty;
    defer map.deinit(gpa);

    try entries.ensureUnusedCapacity(gpa, c_env_slice.len);
    try map.ensureUnusedCapacity(gpa, c_env_slice.len);

    map.len = c_env_slice.len;
    for (
        c_env_slice,
        entries.addManyAsSliceAssumeCapacity(c_env_slice.len),
        map.hashes[0..map.len],
    ) |c_env_str_ptr, *env_str_ptr, *hash| {
        // Every element before the sentinel is non-null.
        const c_env_str = std.mem.span(c_env_str_ptr.?);
        const eq_idx = std.mem.findScalar(u8, c_env_str, '=').?;
        hash.* = hashKey(c_env_str[0..eq_idx]);
        const env_str = try string_arena.allocator().dupeZ(u8, c_env_str);
        env_str_ptr.* = env_str.ptr;
    }
    entries.appendAssumeCapacity(null);
    map.updateAllHashes();

    return .{
        .string_arena = string_arena.state,
        .entries = entries.move(),
        .map = map.move(),
    };
}

/// Releases all memory associated with `env`.
///
/// Invalidates slices previously returned by `slice` or `get`.
pub fn deinit(env: *Environment, gpa: Allocator) void {
    env.string_arena.promote(gpa).deinit();
    env.entries.deinit(gpa);
    env.map.deinit(gpa);
}

/// Returns a null-terminated slice of pointers to null-terminated strings of the form "key=value"
/// representing all variables in `env`. This is the form typically used to represent an environment
/// in C, and can be directly consumed by process spawning APIs such as POSIX's `execve`.
///
/// This operation is extremely efficient by design, so it is okay to call `slice` often.
///
/// The returned slice is valid until `env` is mutated (including by `compact`).
pub fn slice(env: *const Environment) [:null]const ?[*:0]const u8 {
    return env.entries.items[0..env.len() :null];
}

/// Returns the number of variables in `env`.
pub fn len(env: *const Environment) u32 {
    assert(env.map.len == env.entries.items.len - 1);
    return env.map.len;
}

/// Returns the value of the variable named `key`, or `null` if `env` does not contain that name.
///
/// The returned slice is valid, and its contents will not change, until `deinit`, `clear`, or
/// `compact` is called, even if the variable's value changes or the variable is removed entirely.
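pub fn get(env: *const Environment, key: []const u8) ?[:0]const u8 {
    const index = env.map.get(KeyContext, .{
        .key = key,
        .entries = env.entries.items,
    }) catch return null;
    const env_str = std.mem.span(env.entries.items[index].?);
    // The entry is "key=value"; skip the key and the '=' to return only the value.
    return env_str[key.len + 1 ..];
}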

/// Returns `true` if and only if `env` contains a variable named `key`.
pub fn contains(env: *const Environment, key: []const u8) bool {
    return env.map.contains(KeyContext, .{
        .key = key,
        .entries = env.entries.items,
    });
}

/// Set the value of the variable named `key` to `val`.
///
/// Invalidates slices previously returned by `slice`.
///
/// Asserts that `key` does not contain any null byte or '=' byte.
/// Asserts that `val` does not contain any null byte.
pub fn set(env: *Environment, gpa: Allocator, key: []const u8, val: []const u8) Allocator.Error!void {
    assert(std.mem.findScalar(u8, key, 0) == null);
    assert(std.mem.findScalar(u8, key, '=') == null);
    assert(std.mem.findScalar(u8, val, 0) == null);

    try env.entries.ensureUnusedCapacity(gpa, 1);
    try env.map.ensureUnusedCapacity(gpa, 1);

    const new_str = str: {
        var string_arena = env.string_arena.promote(gpa);
        defer env.string_arena = string_arena.state;
        const a = string_arena.allocator();
        break :str try std.fmt.allocPrintSentinel(a, "{s}={s}", .{ key, val }, 0);
    };

    const gop = env.map.getOrPut(KeyContext, .{
        .key = key,
        .entries = env.entries.items,
    });
    if (gop.found_existing) {
        env.entries.items[gop.index] = new_str.ptr;
    } else {
        assert(env.entries.pop().? == null); // temporarily remove sentinel
        assert(gop.index == env.entries.items.len);
        env.entries.appendSliceAssumeCapacity(&.{ new_str.ptr, null });
    }
}

/// If `env` contains a variable named `key`, removes it and returns `true`.
/// Otherwise, returns `false`.
///
/// Invalidates slices previously returned by `slice`.
pub fn remove(env: *Environment, key: []const u8) bool {
    const index = env.map.swapRemove(KeyContext, .{
        .key = key,
        .entries = env.entries.items,
    }) catch return false;
    assert(env.entries.pop().? == null); // temporarily remove sentinel
    _ = env.entries.swapRemove(index);
    env.entries.appendAssumeCapacity(null);
    return true;
}

/// Removes all variables from `env`.
///
/// Invalidates slices previously returned by `slice` or `get`.
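pub fn clear(env: *Environment, gpa: Allocator) void {
    assert(env.entries.pop().? == null);

    // `ArenaAllocator.State` cannot be reset directly, so promote it first.
    var string_arena = env.string_arena.promote(gpa);
    _ = string_arena.reset(.retain_capacity);
    env.string_arena = string_arena.state;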
    env.entries.clearRetainingCapacity();
    env.map.len = 0;

    env.entries.appendAssumeCapacity(null);
}

/// Compacts the environment strings of `env` in memory, hence minimizing memory usage.
///
/// Invalidates slices previously returned by `slice` or `get`.
///
/// It is rarely necessary to call this function, since typically an `Environment` is only mutated
/// when initially created, if at all.
pub fn compact(env: *Environment, gpa: Allocator) Allocator.Error!void {
    var string_arena: std.heap.ArenaAllocator = .init(gpa);
    errdefer string_arena.deinit();

    var entries: std.ArrayList(?[*:0]u8) = .empty;
    defer entries.deinit(gpa);

    const l = env.len();
    try entries.ensureUnusedCapacity(gpa, l + 1);
    for (
        env.entries.items[0..l],
        entries.addManyAsSliceAssumeCapacity(l),
    ) |old_str_ptr, *new_str_ptr| {
        const old_str = std.mem.span(old_str_ptr.?);
        const new_str = try string_arena.allocator().dupeZ(u8, old_str);
        new_str_ptr.* = new_str.ptr;
    }
    assert(env.entries.items[l] == null);
    entries.appendAssumeCapacity(null);

    errdefer comptime unreachable;

    // We can use the same map (which is why this doesn't just call `clone`).
    const map = env.map.move();
    env.deinit(gpa);
    env.* = .{
        .string_arena = string_arena.state,
        .entries = entries.move(),
        .map = map,
    };
}

/// Returns a new `Environment` which is equivalent to `env`. The new `Environment` is allocated
/// with `gpa`, which need not be the same allocator used for `env`.
pub fn clone(env: *const Environment, gpa: Allocator) Allocator.Error!Environment {
    var string_arena: std.heap.ArenaAllocator = .init(gpa);
    errdefer string_arena.deinit();

    var entries: std.ArrayList(?[*:0]u8) = .empty;
    defer entries.deinit(gpa);

    const l = env.len();
    try entries.ensureUnusedCapacity(gpa, l + 1);
    for (
        env.entries.items[0..l],
        entries.addManyAsSliceAssumeCapacity(l),
    ) |old_str_ptr, *new_str_ptr| {
        const old_str = std.mem.span(old_str_ptr.?);
        const new_str = try string_arena.allocator().dupeZ(u8, old_str);
        new_str_ptr.* = new_str.ptr;
    }
    assert(env.entries.items[l] == null);
    entries.appendAssumeCapacity(null);

    // The map can be cloned directly because the hashes are unchanged.
    var map = try env.map.clone(gpa);
    defer map.deinit(gpa);

    return .{
        .string_arena = string_arena.state,
        .entries = entries.move(),
        .map = map.move(),
    };
}

const KeyContext = struct {
    key: []const u8,
    entries: []const ?[*:0]const u8,
    pub fn hash(ctx: KeyContext) u32 {
        return hashKey(ctx.key);
    }
    pub fn eql(ctx: KeyContext, other: u32) bool {
        const other_str = std.mem.span(ctx.entries[other].?);
        const idx = std.mem.findScalar(u8, other_str, '=').?;
        const other_key = other_str[0..idx];
        return std.mem.eql(u8, ctx.key, other_key);
    }
};

fn hashKey(key: []const u8) u32 {
    // `Wyhash.hash` returns a `u64`; keep the low 32 bits.
    return @truncate(std.hash.Wyhash.hash(0, key));
}

const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

const Environment = @This();

mlugg avatar Nov 17 '25 19:11 mlugg

What you say is right, but regarding setenv, I'd like to add that I think that the setenv C function is a design flaw in the first place. I've developed several commercial programs which make heavy use of environment variables, process and thread structures, for MS Windows and Linux, and I never needed setenv.

hvbargen avatar Nov 18 '25 07:11 hvbargen

What you say is right, but regarding setenv, I'd like to add that I think that the setenv C function is a design flaw in the first place. I've developed several commercial programs which make heavy use of environment variables, process and thread structures, for MS Windows and Linux, and I never needed setenv.

To add to this: the existence of setenv poisons C libraries. If you use them in a threaded context without checking whether your dependencies call setenv, you are basically praying that your program will run fine, because getaddrinfo uses getenv.

KilianHanich avatar Nov 18 '25 15:11 KilianHanich

I’m not sure if my understanding is correct, so please correct me if I’m wrong.

So the future fix for this issue is that, before entering main, we will create a copy of the global environment, and our program execution will rely solely on that copy. During program execution, any additions or modifications to the keys or values in the global environment will no longer affect the running program, because we will no longer allow the program to arbitrarily access the global environment using functions like getenv or setenv. All environment-related operations inside the program must explicitly use the copied environment passed in.

Am I correct?

septemhill avatar Nov 19 '25 02:11 septemhill

@septemhill

any additions or modifications to the keys or values in the global environment will no longer affect the running program

there is no such thing as the "global environment"[^1] envvars are just key/value pairs that are passed in at program startup

if you're talking about the libc-provided global envvar map, which getenv and setenv interact with, yes, this proposal is to remove that abstraction from zig, and replace it with a locally-scoped environment map that is passed around to whatever part of the code needs it, similar to what zig does with Allocator and, more recently, Io

of course, if you're linking libc, there's nothing stopping you from directly calling getenv/setenv (or even setenv's evil twin, putenv), but that would be strongly discouraged except in the case of needing to set environment variables that are read by C libraries (but also, like, that's horrific. use a better library)

[^1]: I believe windows is weird here, as usual, in that there technically is a "system environment block" and "user environment block", which are global across all processes. you can get access to these, but you have to explicitly query it which basically no program does

silversquirl avatar Nov 19 '25 15:11 silversquirl

This proposal references juicy main, but makes a notable and significant deviation:

Users who want full control over everything can continue to use pub fn main() noreturn {} and have the start code do almost nothing (beyond setting up Thread Local Storage, signal handling, PIE, etc., as it already does).

If I'm reading this right and this is implemented purely on top of the juicy main proposal, after this change you would only have access to environ by taking an std.process.Init, which would initialize a significant amount more than environ.

OTOH: https://github.com/ziglang/zig/issues/4524 Seemingly already says this is already accepted.

scottredig avatar Nov 21 '25 07:11 scottredig

This proposal doesn't fully explain what is passed to the Zig main function. Is it an std.process.Environment instance? In the draft implementation of this type, an allocator is required to initialize it. How will this work for non-juicy main functions which only want the envp and would prefer std.start to do as little work as possible?

I'm not 100% confident that this is always the case in practice, but for POSIX targets that are compliant with the spec it appears that the envp argument passed to the main entry point is not affected by set/put/unsetenv() (unlike environ). It would be useful if an unmodified environment structure obtained from a minimal Zig main function could be forwarded to child processes without any overhead or unnecessary copying. Similarly for e.g. Windows which uses a nonstandard non-UTF-8 environment block, it would be useful if any processing could be deferred until actually needed.

castholm avatar Nov 22 '25 21:11 castholm