Bind Groups | FloraForge Concepts

The two-processor problem

When the engine wants to draw a chunk of terrain, the shader doing the drawing needs to know things only the engine knows: where the camera is, what time of day it is, which way the sun points, what colour the fog should be. None of that can be passed as ordinary function arguments, because nobody calls the shader directly — the GPU does, thousands of times, on its own schedule, in its own memory. Everything the shader needs must be placed in GPU-visible resources ahead of time and wired to named slots in the shader's interface. WebGPU makes that wiring fully explicit, and the unit of wiring is the bind group.

Three kinds of resources

The things you can plug in come in a few flavours, each tuned for a different access pattern:

Uniform buffers — small, fixed-size structs of constants, read-only and broadcast identically to every thread. The camera matrix and the clock live here. The GPU caches them aggressively precisely because they cannot change mid-draw.
Storage buffers — big raw arrays the shader can index freely and, if declared read_write, write back to. This is what the terrain compute shader uses for its height inputs and vertex output.
Textures and samplers — images plus the recipe for reading them (filtering, wrapping, comparison). They travel as a pair: FloraForge's terrain binds its biome colour atlas with an ordinary sampler, and its shadow map with a special comparison sampler that does depth tests in hardware.

Bundles, and the contracts behind them

You don't bind these resources one at a time. WebGPU asks you to define a bind group layout first — a contract that says "slot 0 is a uniform buffer visible to the vertex and fragment stages, slot 1 is a texture…" — and then create bind groups: immutable bundles of actual resources that satisfy that contract. The split looks bureaucratic but is the source of the speed: because the layout is known when the pipeline is built, and the group is validated once when it's created, binding a group at draw time is nearly free — the expensive checking already happened. Here is FloraForge creating its per-frame group, layout first, bundle second:

src/renderer_wgpu/material.rs — layout (the contract), then the bundle (trimmed)

let layout = device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
    label: Some("frame-bind-group-layout"),
    entries: &[wgpu::BindGroupLayoutEntry {
        binding: 0,
        visibility: wgpu::ShaderStages::VERTEX_FRAGMENT,
        ty: wgpu::BindingType::Buffer {
            ty: wgpu::BufferBindingType::Uniform,
            // …
        },
        count: None,
    }],
});

// …create the uniform buffer holding a FrameUniform…

let bind_group = device.create_bind_group(&wgpu::BindGroupDescriptor {
    label: Some("frame-bind-group"),
    layout: &layout,
    entries: &[wgpu::BindGroupEntry {
        binding: 0,
        resource: buffer.as_entire_binding(),
    }],
});
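The "validate once, bind cheaply" idea can be illustrated without a GPU. The sketch below is a toy model only: `Kind`, `Layout` and `BindGroup` are hypothetical stand-ins written for this page, not wgpu types, and the check is far simpler than real WebGPU validation. It shows all the checking happening in the constructor, so holding a `BindGroup` value is proof the contract was satisfied.

```rust
// Toy model of "validate once, bind cheaply". None of these types are wgpu's.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    UniformBuffer,
    Texture,
    Sampler,
}

// The contract: which kind of resource each binding slot expects.
struct Layout {
    slots: Vec<Kind>,
}

// An immutable bundle that has already been checked against a layout.
#[derive(Debug)]
struct BindGroup {
    resources: Vec<Kind>,
}

impl BindGroup {
    // All checking happens here, once, at creation time.
    fn new(layout: &Layout, resources: Vec<Kind>) -> Result<BindGroup, String> {
        if resources.len() != layout.slots.len() {
            return Err(format!(
                "expected {} entries, got {}",
                layout.slots.len(),
                resources.len()
            ));
        }
        for (i, (want, got)) in layout.slots.iter().zip(&resources).enumerate() {
            if want != got {
                return Err(format!("binding {}: expected {:?}, got {:?}", i, want, got));
            }
        }
        Ok(BindGroup { resources })
    }
}

fn main() {
    let layout = Layout {
        slots: vec![Kind::UniformBuffer, Kind::Texture],
    };
    let ok = BindGroup::new(&layout, vec![Kind::UniformBuffer, Kind::Texture]);
    let bad = BindGroup::new(&layout, vec![Kind::Sampler, Kind::Texture]);
    println!("valid bundle entries: {:?}", ok.map(|g| g.resources.len()));
    println!("invalid bundle error: {:?}", bad.err());
}
```

Because a mismatched bundle can never be constructed, nothing remains to check when a bundle is later plugged into a slot, which is the property the layout and bundle split is buying.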

The cost model: sort by how often it changes

A frame is a long sequence of draws, and between draws the engine swaps bind groups in and out of four numbered slots (WebGPU guarantees at least four). Each swap is cheap, but it isn't free, and the swaps add up across hundreds of draw calls. The classic answer is to sort your data by how often it changes and give each rate its own group. Things that are true for the whole frame — the camera, the clock — go in group 0, bound once and then left alone. Things that change per material — lighting and fog colours — go in group 1. Per-object textures sit higher still. The common case, "same camera, fifty different surfaces," then touches only the cheap, small, frequently-swapped groups while the stable ones stay plugged in.
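To make the frequency argument concrete, here is a small CPU-only simulation. It is a sketch under stated assumptions: `Draw` and `count_binds` are hypothetical names invented for this example, only three slots are modelled, and the "engine" simply skips a bind when the slot already holds the requested bundle. It is not FloraForge's render loop.

```rust
// Hypothetical draw record: the id of the bundle each slot should hold.
#[derive(Clone, Copy)]
struct Draw {
    frame: u32,
    material: u32,
    textures: u32,
}

// Replay the draws, issuing a bind only when a slot's contents change.
// Returns the number of binds issued for slots 0, 1 and 2.
fn count_binds(draws: &[Draw]) -> [usize; 3] {
    let mut bound: [Option<u32>; 3] = [None; 3];
    let mut binds = [0usize; 3];
    for d in draws {
        let wanted = [d.frame, d.material, d.textures];
        for slot in 0..3 {
            if bound[slot] != Some(wanted[slot]) {
                bound[slot] = Some(wanted[slot]);
                binds[slot] += 1;
            }
        }
    }
    binds
}

fn main() {
    // One camera, two materials used alternately, fifty distinct texture sets.
    let mut draws: Vec<Draw> = (0..50)
        .map(|i| Draw { frame: 0, material: i % 2, textures: i })
        .collect();

    let unsorted = count_binds(&draws);
    // Stable sort: group the draws that share a material next to each other.
    draws.sort_by_key(|d| d.material);
    let sorted = count_binds(&draws);

    println!("unsorted binds per slot: {:?}", unsorted); // [1, 50, 50]
    println!("sorted binds per slot:   {:?}", sorted); // [1, 2, 50]
}
```

The per-frame slot is bound once in both orders. Sorting by material drops the slot 1 binds from fifty to two, and only the slot that genuinely differs per draw is rebound every time.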

Figure: The terrain shader's four bind group slots. The bundles themselves are created once; what varies is how often the engine writes new contents into the buffers behind them.

FloraForge's split

Every material in the engine follows the same two-group convention for its uniform data. Group 0 is per-frame: a single FrameUniform struct holding the camera's view-projection matrices, the camera position, a packed time vector (elapsed seconds, hour of day, underwater factor) and the shadow parameters. Group 1 is per-material: a MaterialUniform with the light direction, ambient level, fog settings and the sun and sky colours that the day/night cycle recomputes as it goes. On the shader side the declarations mirror that design, with the texture and sampler bindings following in groups 2 and 3:

src/renderer_wgpu/shaders/terrain.wgsl — the shader's view of its bind groups

struct FrameUniform {
    view_proj: mat4x4<f32>,
    inv_view_proj_no_translation: mat4x4<f32>,
    light_view_proj: mat4x4<f32>,
    camera_position: vec4<f32>,
    time: vec4<f32>,
    shadow_params: vec4<f32>,
    view_proj_no_translation: mat4x4<f32>,
};

struct MaterialUniform {
    light_direction: vec4<f32>,
    ambient: vec4<f32>,
    fog_color: vec4<f32>,
    fog_params: vec4<f32>,
    sun_color: vec4<f32>,
    sky_zenith: vec4<f32>,
    sky_horizon: vec4<f32>,
};

@group(0) @binding(0) var<uniform> frame: FrameUniform;
@group(1) @binding(0) var<uniform> material: MaterialUniform;
@group(2) @binding(0) var terrain_atlas: texture_2d<f32>;
@group(2) @binding(1) var terrain_sampler: sampler;
@group(3) @binding(0) var shadow_map: texture_depth_2d;
@group(3) @binding(1) var shadow_sampler: sampler_comparison;

Those vec4 fields where a single float would do aren't waste; they're alignment padding. Uniform buffers follow strict layout rules (a vec3, a vec4 and a mat4x4<f32> must each start on a 16-byte boundary), and the engine's matching Rust structs in src/renderer_wgpu/material.rs use the same padded shapes so the bytes copied across line up exactly with what the shader expects. The spare lanes get used, too: time packs three unrelated values (elapsed seconds, hour of day and the underwater factor) into one slot.
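The layout rule can be checked on the CPU side with nothing but the standard library. The structs below are hypothetical mirrors written for this page from the WGSL declarations above; the real definitions in src/renderer_wgpu/material.rs are not shown here and may differ in detail. `Unpadded` is a deliberately wrong counter-example.

```rust
use std::mem::size_of;

// Hypothetical Rust mirror of the WGSL FrameUniform, fields in shader order.
// Every field is a multiple of 16 bytes, so #[repr(C)] inserts no padding and
// each field starts at the same offset the shader expects.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct FrameUniform {
    view_proj: [[f32; 4]; 4],                    // offset 0
    inv_view_proj_no_translation: [[f32; 4]; 4], // offset 64
    light_view_proj: [[f32; 4]; 4],              // offset 128
    camera_position: [f32; 4],                   // offset 192
    time: [f32; 4],                              // offset 208
    shadow_params: [f32; 4],                     // offset 224
    view_proj_no_translation: [[f32; 4]; 4],     // offset 240
}

// Hypothetical mirror of MaterialUniform: seven vec4 fields.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct MaterialUniform {
    light_direction: [f32; 4],
    ambient: [f32; 4],
    fog_color: [f32; 4],
    fog_params: [f32; 4],
    sun_color: [f32; 4],
    sky_zenith: [f32; 4],
    sky_horizon: [f32; 4],
}

// Counter-example: a lone f32 followed by a vec4-shaped field. Rust places
// fog_color at offset 4, but WGSL would place a vec4<f32> at offset 16.
#[allow(dead_code)]
#[repr(C)]
struct Unpadded {
    ambient: f32,
    fog_color: [f32; 4],
}

fn main() {
    println!("FrameUniform:    {} bytes", size_of::<FrameUniform>()); // 304
    println!("MaterialUniform: {} bytes", size_of::<MaterialUniform>()); // 112
    // 20 bytes in Rust; the equivalent WGSL struct would occupy 32.
    println!("Unpadded:        {} bytes", size_of::<Unpadded>());
}
```

Keeping every field vec4-shaped means the Rust and WGSL layouts agree without any manual padding fields, at the cost of a few unused lanes.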

The render loop in src/renderer_wgpu/world.rs then plays the frequency game exactly as advertised. At the top of the frame the engine writes a fresh FrameUniform into group 0's buffer; each pass binds frame_bg at slot 0 and the relevant material at slot 1, and the terrain, water and river passes all share the same material bundle — one set_bind_group apiece, and on to the draw calls.

In the engine

FloraForge never rebuilds a bind group at runtime. The frame and material bundles are created once at startup, and from then on the engine only rewrites the bytes inside the buffers they point at. One queue.write_buffer of 304 bytes (four 64-byte matrices plus three 16-byte vectors) refreshes the camera, clock and shadows for every shader in the engine at once, because they all share the same group 0 bundle.
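As a rough illustration of where that byte count comes from, the sketch below flattens a frame's worth of f32 fields into the byte payload that a buffer write would upload. `pack_f32s` is a hypothetical helper, not part of FloraForge or wgpu, the field values are made up, and the order of the lanes inside `time` is an assumption. It also assumes a little-endian host.

```rust
// Flatten f32 fields, in declaration order, into little-endian bytes.
// Hypothetical helper for illustration only.
fn pack_f32s(fields: &[&[f32]]) -> Vec<u8> {
    fields
        .iter()
        .flat_map(|field| field.iter())
        .flat_map(|value| value.to_le_bytes())
        .collect()
}

fn main() {
    // Made-up values; a real frame would carry actual camera matrices.
    let identity: [f32; 16] = [
        1.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0,
        0.0, 0.0, 1.0, 0.0,
        0.0, 0.0, 0.0, 1.0,
    ];
    let camera_position: [f32; 4] = [0.0, 10.0, 0.0, 1.0];
    // Assumed lane order: elapsed seconds, hour of day, underwater factor, unused.
    let time: [f32; 4] = [12.5, 14.0, 0.0, 0.0];
    let shadow_params: [f32; 4] = [0.0; 4];

    // Same field order as the WGSL FrameUniform declaration.
    let bytes = pack_f32s(&[
        &identity[..],
        &identity[..],
        &identity[..],
        &camera_position[..],
        &time[..],
        &shadow_params[..],
        &identity[..],
    ]);

    // 4 matrices * 64 bytes + 3 vectors * 16 bytes = 304 bytes.
    println!("payload: {} bytes", bytes.len());
}
```

A slice like `bytes` is the kind of payload the single per-frame write hands to the queue; since the bind group still points at the same buffer, no bundle has to be recreated.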