32. Raytracing

Unlike draw commands, which use rasterization, ray tracing is a rendering method that generates an image by tracing the paths of rays that share a single origin and using shaders to determine the final colour of the image plane.

Raytracing uses a separate rendering pipeline from both the graphics and compute pipelines (see Raytracing Pipeline). It has a unique set of programmable and fixed function stages.

Figure 33. Raytracing Pipeline

Caption: Interaction between the different shader stages in the raytracing pipeline.

32.1. Raytracing Properties

The VkPhysicalDeviceRaytracingPropertiesNVX structure is defined as:

typedef struct VkPhysicalDeviceRaytracingPropertiesNVX {
    VkStructureType    sType;
    void*              pNext;
    uint32_t           shaderHeaderSize;
    uint32_t           maxRecursionDepth;
    uint32_t           maxGeometryCount;
} VkPhysicalDeviceRaytracingPropertiesNVX;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • shaderHeaderSize is the size in bytes of the shader header.

  • maxRecursionDepth is the maximum number of levels of recursion allowed in a trace command.

  • maxGeometryCount is the maximum number of geometries in the bottom level acceleration structure.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_RAYTRACING_PROPERTIES_NVX

32.2. Raytracing Commands

Raytracing commands provoke work in the raytracing pipeline. Raytracing commands are recorded into a command buffer and, when executed by a queue, produce work which executes according to the currently bound raytracing pipeline. A raytracing pipeline must be bound to a command buffer before any raytracing commands are recorded in that command buffer.

Each raytracing call operates on a set of shader stages that are specific to the raytracing pipeline as well as a set of VkAccelerationStructureNVX objects which describe the scene geometry in an implementation-specific way. The relationship between the raytracing pipeline object and the acceleration structures is passed into the raytracing command in a VkBuffer object known as a shader binding table.

During execution, control alternates between scheduling and other operations. The scheduling functionality is implementation-specific and is responsible for workload execution. The shader stages are programmable. Traversal, which refers to the process of traversing acceleration structures to find potential intersections of rays with geometry, is fixed function.

The programmable portions of the pipeline are exposed in a single-ray programming model. Each GPU thread handles one ray at a time. Memory operations can be synchronized using standard memory barriers. However, communication and synchronization between threads is not allowed. In particular, the use of compute pipeline synchronization functions is not supported in the raytracing pipeline.

To dispatch a raytracing call use:

void vkCmdTraceRaysNVX(
    VkCommandBuffer                             cmdBuf,
    VkBuffer                                    raygenShaderBindingTableBuffer,
    VkDeviceSize                                raygenShaderBindingOffset,
    VkBuffer                                    missShaderBindingTableBuffer,
    VkDeviceSize                                missShaderBindingOffset,
    VkDeviceSize                                missShaderBindingStride,
    VkBuffer                                    hitShaderBindingTableBuffer,
    VkDeviceSize                                hitShaderBindingOffset,
    VkDeviceSize                                hitShaderBindingStride,
    uint32_t                                    width,
    uint32_t                                    height);
  • cmdBuf is the command buffer into which the command will be recorded.

  • raygenShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the ray generation shader stage.

  • raygenShaderBindingOffset is the offset (relative to raygenShaderBindingTableBuffer) of the ray generation shader being used for the trace.

  • missShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the miss shader stage.

  • missShaderBindingOffset is the offset (relative to missShaderBindingTableBuffer) of the miss shader being used for the trace.

  • missShaderBindingStride is the size of each shader binding table record in missShaderBindingTableBuffer.

  • hitShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the hit shader stages.

  • hitShaderBindingOffset is the offset (relative to hitShaderBindingTableBuffer) of the hit shader group being used for the trace.

  • hitShaderBindingStride is the size of each shader binding table record in hitShaderBindingTableBuffer.

  • width is the width of the ray trace query dimensions.

  • height is the height of the ray trace query dimensions.

When the command is executed, a ray query of width × height rays is assembled.
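As an illustration of the query dimensions, the following minimal sketch computes how many rays a trace command assembles. Both helper names are hypothetical and not part of the API. The row-major flattening in the second helper is an assumption about how an application might index per-ray storage, not something this chapter defines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Vulkan API): number of rays that a
 * trace command with the given query dimensions assembles. */
uint64_t trace_ray_count(uint32_t width, uint32_t height)
{
    /* Widen before multiplying so large dimensions do not wrap at 32 bits. */
    return (uint64_t)width * (uint64_t)height;
}

/* Hypothetical helper: flattens a 2D ray coordinate into a linear index.
 * ASSUMPTION: the application stores per-ray data in row-major order. */
uint64_t trace_ray_linear_index(uint32_t x, uint32_t y, uint32_t width)
{
    assert(x < width); /* the column must lie inside the query width */
    return (uint64_t)y * width + x;
}
```

Widening to 64 bits matters when the result is used to size a buffer, since the product of two 32-bit dimensions can exceed the 32-bit range.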

Valid Usage (Implicit)
  • cmdBuf must be a valid VkCommandBuffer handle

  • raygenShaderBindingTableBuffer must be a valid VkBuffer handle

  • missShaderBindingTableBuffer must be a valid VkBuffer handle

  • hitShaderBindingTableBuffer must be a valid VkBuffer handle

  • cmdBuf must be in the recording state

  • The VkCommandPool that cmdBuf was allocated from must support graphics or compute operations

  • Each of cmdBuf, hitShaderBindingTableBuffer, missShaderBindingTableBuffer, and raygenShaderBindingTableBuffer must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to the VkCommandPool that cmdBuf was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary, Secondary       Both                 Graphics, Compute

32.3. Shader Binding Table

A shader binding table is a resource which establishes the relationship between the raytracing pipeline and the acceleration structures that were built for the ray tracing query. It indicates the shaders that operate on each geometry in an acceleration structure. In addition, it contains the resources accessed by each shader, including textures and constants. The application allocates and manages shader binding tables as VkBuffer objects.

The shader binding tables to use in a ray tracing query are passed to vkCmdTraceRaysNVX. Shader binding tables are read-only in shaders that are executing on the ray tracing pipeline.

32.3.1. Indexing Rules

In order to execute the correct shaders and access the correct resources during a ray tracing dispatch, the implementation must be able to locate shader binding table entries at various stages of execution. This is accomplished by defining a set of indexing rules that compute shader binding table record positions relative to the buffer’s base address in memory. The application must organize the contents of the shader binding table’s memory in a way that application of the indexing rules will lead to correct records.
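To make the idea of a record position relative to the buffer concrete, here is a minimal sketch that converts a record index into a byte position. It rests on an assumption this chapter does not state outright: records are laid out at a fixed stride starting at the binding offset, as the offset and stride parameters of vkCmdTraceRaysNVX suggest. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte position of a shader binding table record,
 * measured from the start of the buffer.
 * ASSUMPTION: record i begins at bindingOffset + i * bindingStride. */
uint64_t sbt_record_byte_offset(uint64_t bindingOffset,
                                uint64_t bindingStride,
                                uint32_t recordIndex)
{
    assert(bindingStride != 0); /* a zero stride would alias every record */
    return bindingOffset + (uint64_t)recordIndex * bindingStride;
}
```

Under that assumption, an application filling the table on the host would write record 3 of a table with offset 256 and stride 64 at byte 448.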

Ray Generation Shaders

Only one ray generation shader is executed per ray tracing dispatch. Its location is passed into vkCmdTraceRaysNVX using the raygenShaderBindingTableBuffer and raygenShaderBindingOffset parameters.

The rule to compute a ray generation shader binding table record index is:

raygenShaderBindingTableIndex

Hit Shaders

The base for the computation of intersection, any hit and closest hit shader locations is the instanceShaderBindingTableRecordOffset value stored with each instance of a top-level acceleration structure (see VkInstanceNVX). This value determines the beginning of the shader binding table records for a given instance. Each geometry in the instance must have at least one hit program record.

In the following rule, geometryIndex refers to the location of the geometry within the instance.

The sbtRecordStride and sbtRecordOffset values are passed in as parameters to traceNVX() calls made in the shaders. See Section 8.19 (Raytracing Functions) of the OpenGL Shading Language Specification for more details.

The result of this computation is then added to hitProgramShaderBindingTableBaseIndex, a base index passed to vkCmdTraceRaysNVX.

The complete rule to compute a hit shader binding table record index is:

instanceShaderBindingTableRecordOffset + hitProgramShaderBindingTableBaseIndex + geometryIndex × sbtRecordStride + sbtRecordOffset
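The hit shader rule can be sketched directly in C. The field names below follow the terms of the rule; the struct that groups them and the function name are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inputs to the hit shader indexing rule. The grouping struct is a
 * hypothetical convenience, not an API type. */
typedef struct HitRecordQuery {
    uint32_t instanceShaderBindingTableRecordOffset; /* stored with the instance */
    uint32_t hitProgramShaderBindingTableBaseIndex;  /* from the trace command */
    uint32_t geometryIndex;                          /* geometry within the instance */
    uint32_t sbtRecordStride;                        /* from the shader's trace call */
    uint32_t sbtRecordOffset;                        /* from the shader's trace call */
} HitRecordQuery;

/* Applies the rule term by term and returns the record index. */
uint32_t hit_record_index(const HitRecordQuery *q)
{
    assert(q != NULL);
    return q->instanceShaderBindingTableRecordOffset
         + q->hitProgramShaderBindingTableBaseIndex
         + q->geometryIndex * q->sbtRecordStride
         + q->sbtRecordOffset;
}
```

For example, an instance offset of 4, a base index of 2, geometry 3 with a stride of 2, and a record offset of 1 select record 4 + 2 + 3 × 2 + 1 = 13.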

Miss Shaders

A miss shader is executed whenever a ray query fails to find an intersection for the given scene geometry. Multiple miss shaders can be executed throughout a ray tracing dispatch.

The base for the computation of miss shader locations is missProgramShaderBindingTableBaseIndex, a base index passed into vkCmdTraceRaysNVX.

The sbtRecordOffset value is passed in as a parameter to traceNVX() calls made in the shaders. See Section 8.19 (Raytracing Functions) of the OpenGL Shading Language Specification for more details.

The complete rule to compute a miss shader binding table record index is:

missProgramShaderBindingTableBaseIndex + sbtRecordOffset
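The miss shader rule is a single addition. The sketch below adds one thing the rule does not mention: a bounds check against missRecordCount, a hypothetical count of miss records that the application is assumed to track for its own table. The function name is also hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Applies the miss shader indexing rule.
 * ASSUMPTION: missRecordCount is the number of miss records the application
 * wrote into its table; the API itself does not pass such a count. */
uint32_t miss_record_index(uint32_t missProgramShaderBindingTableBaseIndex,
                           uint32_t sbtRecordOffset,
                           uint32_t missRecordCount)
{
    uint32_t index = missProgramShaderBindingTableBaseIndex + sbtRecordOffset;
    assert(index < missRecordCount); /* the record must exist in the table */
    return index;
}
```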

32.4. Acceleration Structures

Acceleration structures are data structures used by the implementation to efficiently manage the scene geometry as it is traversed during a ray tracing query. The application is responsible for managing acceleration structure objects (see Acceleration Structures), including allocation, destruction, executing builds or updates, and synchronizing resources used during ray tracing queries.

There are two types of acceleration structures: top level acceleration structures and bottom level acceleration structures.

Figure 34. Acceleration Structure

Caption: The diagram shows the relationship between top and bottom level acceleration structures.

32.4.1. Instances

Instances are found in top level acceleration structures and contain data that refer to a single bottom-level acceleration structure, a transform matrix, and shading information. Multiple instances may point to a single bottom level acceleration structure.

32.4.2. Geometry

Geometries refer to a triangle or axis-aligned bounding box.

32.4.3. Top Level Acceleration Structures

A top level acceleration structure is an opaque acceleration structure for an array of instances. The descriptor referencing it is the starting point for tracing.

32.4.4. Bottom Level Acceleration Structures

A bottom level acceleration structure is an opaque acceleration structure for an array of geometries.
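The relationship between the two levels can be pictured with a small host-side model. This is only a conceptual sketch: real acceleration structures are opaque, and every type, field, and function name below is hypothetical. The 3x4 transform layout in particular is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A geometry refers to a triangle or an axis-aligned bounding box. */
typedef enum GeometryKind { GEOMETRY_TRIANGLE, GEOMETRY_AABB } GeometryKind;
typedef struct Geometry { GeometryKind kind; } Geometry;

/* Conceptual bottom level structure: an array of geometries. */
typedef struct BottomLevel {
    const Geometry *geometries;
    uint32_t        geometryCount;
} BottomLevel;

/* Conceptual instance: one bottom level reference, a transform, and
 * shading information. */
typedef struct Instance {
    const BottomLevel *bottomLevel;
    float              transform[12];     /* ASSUMED 3x4 matrix layout */
    uint32_t           sbtRecordOffset;   /* shading information */
} Instance;

/* Conceptual top level structure: an array of instances. */
typedef struct TopLevel {
    const Instance *instances;
    uint32_t        instanceCount;
} TopLevel;

/* Counts how many instances share the given bottom level structure. */
uint32_t count_instances_of(const TopLevel *tlas, const BottomLevel *blas)
{
    assert(tlas != NULL && blas != NULL);
    uint32_t n = 0;
    for (uint32_t i = 0; i < tlas->instanceCount; ++i) {
        if (tlas->instances[i].bottomLevel == blas) {
            ++n;
        }
    }
    return n;
}
```

Sharing one bottom level structure among several instances means the geometry is stored once, while each instance carries its own transform and shading information.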

32.4.5. Building Acceleration Structures

To build an acceleration structure call:

void vkCmdBuildAccelerationStructureNVX(
    VkCommandBuffer                             cmdBuf,
    VkAccelerationStructureTypeNVX              type,
    uint32_t                                    instanceCount,
    VkBuffer                                    instanceData,
    VkDeviceSize                                instanceOffset,
    uint32_t                                    geometryCount,
    const VkGeometryNVX*                        pGeometries,
    VkBuildAccelerationStructureFlagsNVX        flags,
    VkBool32                                    update,
    VkAccelerationStructureNVX                  dst,
    VkAccelerationStructureNVX                  src,
    VkBuffer                                    scratch,
    VkDeviceSize                                scratchOffset);
  • cmdBuf is the command buffer into which the command will be recorded.

  • type is the type of acceleration structure that is being built.

  • instanceCount is the number of instances in the acceleration structure. This parameter must be 0 for bottom level acceleration structures.

  • instanceData is the buffer containing instance data that will be used to build the acceleration structure. This parameter must be VK_NULL_HANDLE for bottom level acceleration structures.

  • instanceOffset is the offset (relative to the start of instanceData) at which the instance data is located.

  • geometryCount is the number of geometries in the acceleration structure. This parameter must be 0 for top level acceleration structures.

  • pGeometries is a pointer to an array of geometries used by bottom level acceleration structures. This parameter must be NULL for top level acceleration structures.

  • flags is a bitmask of VkBuildAccelerationStructureFlagBitsNVX values that specifies additional parameters for the acceleration structure build.

  • update specifies whether to update the dst acceleration structure with the data in src.

  • dst points to the target acceleration structure for the build.

  • src points to an existing acceleration structure that can be used to update the dst acceleration structure.

  • scratch is the VkBuffer that will be used as scratch memory for the build.

  • scratchOffset is the offset relative to the start of scratch that will be used as scratch memory.

Valid Usage
  • geometryCount must be less than or equal to VkPhysicalDeviceRaytracingPropertiesNVX::maxGeometryCount

Valid Usage (Implicit)
  • cmdBuf must be a valid VkCommandBuffer handle

  • type must be a valid VkAccelerationStructureTypeNVX value

  • If instanceData is not VK_NULL_HANDLE, instanceData must be a valid VkBuffer handle

  • If geometryCount is not 0, pGeometries must be a valid pointer to an array of geometryCount valid VkGeometryNVX structures

  • flags must be a valid combination of VkBuildAccelerationStructureFlagBitsNVX values

  • dst must be a valid VkAccelerationStructureNVX handle

  • If src is not VK_NULL_HANDLE, src must be a valid VkAccelerationStructureNVX handle

  • scratch must be a valid VkBuffer handle

  • cmdBuf must be in the recording state

  • The VkCommandPool that cmdBuf was allocated from must support graphics or compute operations

  • Each of cmdBuf, dst, instanceData, scratch, and src that are valid handles must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to the VkCommandPool that cmdBuf was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary, Secondary       Both                 Graphics, Compute
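The parameter rules for vkCmdBuildAccelerationStructureNVX (instance parameters unused for bottom level builds, geometry parameters unused for top level builds, and the maxGeometryCount limit) can be sketched as a host-side check. The enum, struct, and function below are hypothetical stand-ins that avoid depending on the Vulkan headers; the booleans model whether a handle or pointer is non-null.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VkAccelerationStructureTypeNVX. */
typedef enum AccelType { ACCEL_TYPE_TOP_LEVEL, ACCEL_TYPE_BOTTOM_LEVEL } AccelType;

/* Hypothetical summary of the build parameters relevant to the rules. */
typedef struct BuildParams {
    AccelType type;
    uint32_t  instanceCount;
    bool      hasInstanceData; /* models instanceData != VK_NULL_HANDLE */
    uint32_t  geometryCount;
    bool      hasGeometries;   /* models pGeometries != NULL */
} BuildParams;

/* Returns true when the parameters satisfy the rules listed above. */
bool build_params_valid(const BuildParams *p, uint32_t maxGeometryCount)
{
    assert(p != NULL);
    if (p->geometryCount > maxGeometryCount) {
        return false; /* exceeds the device limit */
    }
    if (p->type == ACCEL_TYPE_BOTTOM_LEVEL) {
        /* Bottom level builds take geometries, not instances. */
        return p->instanceCount == 0 && !p->hasInstanceData;
    }
    /* Top level builds take instances, not geometries. */
    return p->geometryCount == 0 && !p->hasGeometries;
}
```

A check of this kind is best run before recording the command, with maxGeometryCount read from VkPhysicalDeviceRaytracingPropertiesNVX.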

Bits which can be set in vkCmdBuildAccelerationStructureNVX::flags, specifying additional parameters for acceleration structure builds, are:

typedef enum VkBuildAccelerationStructureFlagBitsNVX {
    VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_NVX = 0x00000001,
    VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_NVX = 0x00000002,
    VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_NVX = 0x00000004,
    VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_NVX = 0x00000008,
    VK_BUILD_ACCELERATION_STRUCTURE_LOW_MEMORY_BIT_NVX = 0x00000010,
} VkBuildAccelerationStructureFlagBitsNVX;

32.4.6. Copying Acceleration Structures

An additional command exists for copying acceleration structures without updating their contents. The acceleration structure object may be compacted in order to improve performance. Before copying, an application must query the size of the resulting acceleration structure.

To query acceleration structure size parameters call:

void vkCmdWriteAccelerationStructurePropertiesNVX(
    VkCommandBuffer                             cmdBuf,
    VkAccelerationStructureNVX                  accelerationStructure,
    VkQueryType                                 queryType,
    VkQueryPool                                 queryPool,
    uint32_t                                    query);
  • cmdBuf is the command buffer into which the command will be recorded.

  • accelerationStructure points to an existing acceleration structure which has been built.

  • queryType is a VkQueryType value specifying the type of queries managed by the pool.

  • queryPool is the query pool that will manage the results of the query.

  • query is the query index within the query pool that will contain the results.

Valid Usage
  • queryType must be VK_QUERY_TYPE_COMPACTED_SIZE_NVX

Valid Usage (Implicit)
  • cmdBuf must be a valid VkCommandBuffer handle

  • accelerationStructure must be a valid VkAccelerationStructureNVX handle

  • queryType must be a valid VkQueryType value

  • queryPool must be a valid VkQueryPool handle

  • cmdBuf must be in the recording state

  • The VkCommandPool that cmdBuf was allocated from must support graphics or compute operations

  • Each of accelerationStructure, cmdBuf, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to the VkCommandPool that cmdBuf was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary, Secondary       Both                 Graphics, Compute

To copy an acceleration structure call:

void vkCmdCopyAccelerationStructureNVX(
    VkCommandBuffer                             cmdBuf,
    VkAccelerationStructureNVX                  dst,
    VkAccelerationStructureNVX                  src,
    VkCopyAccelerationStructureModeNVX          mode);
  • cmdBuf is the command buffer into which the command will be recorded.

  • dst points to the target acceleration structure for the copy.

  • src points to the existing acceleration structure that is the source of the copy.

  • mode is a VkCopyAccelerationStructureModeNVX value that specifies additional operations to perform during the copy.

Valid Usage (Implicit)
  • cmdBuf must be a valid VkCommandBuffer handle

  • dst must be a valid VkAccelerationStructureNVX handle

  • src must be a valid VkAccelerationStructureNVX handle

  • mode must be a valid VkCopyAccelerationStructureModeNVX value

  • cmdBuf must be in the recording state

  • The VkCommandPool that cmdBuf was allocated from must support graphics or compute operations

  • Each of cmdBuf, dst, and src must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to the VkCommandPool that cmdBuf was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary, Secondary       Both                 Graphics, Compute

Possible values of vkCmdCopyAccelerationStructureNVX::mode, specifying additional operations to perform during the copy, are:

typedef enum VkCopyAccelerationStructureModeNVX {
    VK_COPY_ACCELERATION_STRUCTURE_MODE_CLONE_NVX = 0,
    VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NVX = 1,
} VkCopyAccelerationStructureModeNVX;
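As a rough sketch of how an application might choose between the two copy modes, the helper below selects the compacting mode only when the source was built with the allow-compaction bit. That pairing is an assumption inferred from the flag and mode names; this chapter does not state it as a requirement. The constants mirror the enum values listed in this chapter, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirrored from the enums in this chapter; names are shortened
 * hypothetical stand-ins so the sketch needs no Vulkan headers. */
enum {
    BUILD_ALLOW_COMPACTION_BIT = 0x00000002,
    BUILD_ALL_KNOWN_BITS       = 0x0000001F
};
typedef enum CopyMode { COPY_MODE_CLONE = 0, COPY_MODE_COMPACT = 1 } CopyMode;

/* ASSUMPTION: compaction is only requested for a source structure that was
 * built with the allow-compaction bit; otherwise fall back to a clone. */
CopyMode choose_copy_mode(uint32_t srcBuildFlags, bool wantCompaction)
{
    /* Reject bits that are not among the five listed build flags. */
    assert((srcBuildFlags & ~(uint32_t)BUILD_ALL_KNOWN_BITS) == 0);
    if (wantCompaction && (srcBuildFlags & BUILD_ALLOW_COMPACTION_BIT) != 0) {
        return COPY_MODE_COMPACT;
    }
    return COPY_MODE_CLONE;
}
```

When the compacting mode is chosen, the size of the resulting structure has to be queried with vkCmdWriteAccelerationStructurePropertiesNVX before the copy, as described at the start of this section.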