6. Synchronization and Cache Control

Synchronization of access to resources is primarily the responsibility of the application in Vulkan. The order of execution of commands with respect to the host and other commands on the device has few implicit guarantees, and needs to be explicitly specified. Memory caches and other optimizations are also explicitly managed, requiring that the flow of data through the system is largely under application control.

Whilst some implicit guarantees exist between commands, five explicit synchronization mechanisms are exposed by Vulkan:

Fences

Fences can be used to communicate to the host that execution of some task on the device has completed.

Semaphores

Semaphores can be used to control resource access across multiple queues.

Events

Events provide a fine-grained synchronization primitive which can be signaled either within a command buffer or by the host, and can be waited upon within a command buffer or queried on the host.

Pipeline Barriers

Pipeline barriers also provide synchronization control within a command buffer, but at a single point, rather than with separate signal and wait operations.

Render Passes

Render passes provide a useful synchronization framework for most rendering tasks, built upon the concepts in this chapter. Many cases that would otherwise need an application to use other synchronization primitives can be expressed more efficiently as part of a render pass.

6.1. Execution and Memory Dependencies

An operation is an arbitrary amount of work to be executed on the host, a device, or an external entity such as a presentation engine. Synchronization commands introduce explicit execution dependencies, and memory dependencies between two sets of operations defined by the command’s two synchronization scopes.

The synchronization scopes define which other operations a synchronization command is able to create execution dependencies with. Any type of operation that is not in a synchronization command’s synchronization scopes will not be included in the resulting dependency. For example, for many synchronization commands, the synchronization scopes can be limited to just operations executing in specific pipeline stages, which allows other pipeline stages to be excluded from a dependency. Other scoping options are possible, depending on the particular command.

An execution dependency is a guarantee that for two sets of operations, the first set must happen-before the second set. If an operation happens-before another operation, then the first operation must complete before the second operation is initiated. More precisely:

  • Let A and B be separate sets of operations.

  • Let S be a synchronization command.

  • Let AS and BS be the synchronization scopes of S.

  • Let A' be the intersection of sets A and AS.

  • Let B' be the intersection of sets B and BS.

  • Submitting A, S and B for execution, in that order, will result in execution dependency E between A' and B'.

  • Execution dependency E guarantees that A' happens-before B'.

An execution dependency chain is a sequence of execution dependencies that form a happens-before relation between the first dependency’s A' and the final dependency’s B'. For each consecutive pair of execution dependencies, a chain exists if the intersection of BS in the first dependency and AS in the second dependency is not an empty set. The formation of a single execution dependency from an execution dependency chain can be described by substituting the following in the description of execution dependencies:

  • Let S be a set of synchronization commands that generate an execution dependency chain.

  • Let AS be the first synchronization scope of the first command in S.

  • Let BS be the second synchronization scope of the last command in S.

Note

An execution dependency is inherently also multiple execution dependencies - a dependency exists between each subset of A' and each subset of B', and the same is true for execution dependency chains. For example, a synchronization command with multiple pipeline stages in its stage masks effectively generates one dependency between each source stage and each destination stage. This can be useful to think about when considering how execution chains are formed if they do not involve all parts of a synchronization command’s dependency. Similarly, any set of adjacent dependencies in an execution dependency chain can be considered an execution dependency chain in its own right.

Execution dependencies alone are not sufficient to guarantee that values resulting from writes in one set of operations can be read from another set of operations.

Three additional types of operation are used to control memory access. Availability operations cause the values generated by specified memory write accesses to become available to a memory domain for future access. Any available value remains available until a subsequent write to the same memory location occurs (whether it is made available or not) or the memory is freed. Memory domain operations cause writes that are available to a source memory domain to become available to a destination memory domain (an example of this is making writes available to the host domain available to the device domain). Visibility operations cause values available to a memory domain to become visible to specified memory accesses.

Availability, visibility, memory domains, and memory domain operations are formally defined in the Availability and Visibility section of the Memory Model chapter. Which API operations perform each of these operations is defined in Availability, Visibility, and Domain Operations.

A memory dependency is an execution dependency which includes availability and visibility operations such that:

  • The first set of operations happens-before the availability operation.

  • The availability operation happens-before the visibility operation.

  • The visibility operation happens-before the second set of operations.

Once written values are made visible to a particular type of memory access, they can be read or written by that type of memory access. Most synchronization commands in Vulkan define a memory dependency.

The specific memory accesses that are made available and visible are defined by the access scopes of a memory dependency. Any type of access that is in a memory dependency’s first access scope and occurs in A' is made available. Any type of access that is in a memory dependency’s second access scope and occurs in B' has any available writes made visible to it. Any type of operation that is not in a synchronization command’s access scopes will not be included in the resulting dependency.

A memory dependency enforces availability and visibility of memory accesses and execution order between two sets of operations. Adding to the description of execution dependency chains:

  • Let a be the set of memory accesses performed by A'.

  • Let b be the set of memory accesses performed by B'.

  • Let aS be the first access scope of the first command in S.

  • Let bS be the second access scope of the last command in S.

  • Let a' be the intersection of sets a and aS.

  • Let b' be the intersection of sets b and bS.

  • Submitting A, S and B for execution, in that order, will result in a memory dependency m between A' and B'.

  • Memory dependency m guarantees that:

    • Memory writes in a' are made available.

    • Available memory writes, including those from a', are made visible to b'.

Note

Execution and memory dependencies are used to solve data hazards, i.e. to ensure that read and write operations occur in a well-defined order. Write-after-read hazards can be solved with just an execution dependency, but read-after-write and write-after-write hazards need appropriate memory dependencies to be included between them. If an application does not include dependencies to solve these hazards, the results and execution orders of memory accesses are undefined.

6.1.1. Image Layout Transitions

Image subresources can be transitioned from one layout to another as part of a memory dependency (e.g. by using an image memory barrier). When a layout transition is specified in a memory dependency, it happens-after the availability operations in the memory dependency, and happens-before the visibility operations. Image layout transitions may perform read and write accesses on all memory bound to the image subresource range, so applications must ensure that all memory writes have been made available before a layout transition is executed. Available memory is automatically made visible to a layout transition, and writes performed by a layout transition are automatically made available.

Layout transitions always apply to a particular image subresource range, and specify both an old layout and new layout. If the old layout does not match the new layout, a transition occurs. The old layout must match the current layout of the image subresource range, with one exception. The old layout can always be specified as VK_IMAGE_LAYOUT_UNDEFINED, though doing so invalidates the contents of the image subresource range.

As image layout transitions may perform read and write accesses on the memory bound to the image, if the image subresource affected by the layout transition is bound to peer memory for any device in the current device mask then the memory heap the bound memory comes from must support the VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT and VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT capabilities as returned by vkGetDeviceGroupPeerMemoryFeatures.

Note

Setting the old layout to VK_IMAGE_LAYOUT_UNDEFINED implies that the contents of the image subresource need not be preserved. Implementations may use this information to avoid performing expensive data transition operations.

Note

Applications must ensure that layout transitions happen-after all operations accessing the image with the old layout, and happen-before any operations that will access the image with the new layout. Layout transitions are potentially read/write operations, so not defining appropriate memory dependencies to guarantee this will result in a data race.

Image layout transitions interact with memory aliasing.

6.1.2. Pipeline Stages

The work performed by an action or synchronization command consists of multiple operations, which are performed as a sequence of logically independent steps known as pipeline stages. The exact pipeline stages executed depend on the particular command that is used, and current command buffer state when the command was recorded. Drawing commands, dispatching commands, copy commands, clear commands, and synchronization commands all execute in different sets of pipeline stages. Synchronization commands do not execute in a defined pipeline, but do execute VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT and VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT.

Note

Operations performed by synchronization commands (e.g. availability and visibility operations) are not executed by a defined pipeline stage. However other commands can still synchronize with them via the VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT and VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT pipeline stages.

Execution of operations across pipeline stages must adhere to implicit ordering guarantees, particularly including pipeline stage order. Otherwise, execution across pipeline stages may overlap or execute out of order with regards to other stages, unless otherwise enforced by an execution dependency.

Several of the synchronization commands include pipeline stage parameters, restricting the synchronization scopes for that command to just those stages. This allows fine grained control over the exact execution dependencies and accesses performed by action commands. Implementations should use these pipeline stages to avoid unnecessary stalls or cache flushing.

Bits which can be set, specifying pipeline stages, are:

typedef enum VkPipelineStageFlagBits {
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT = 0x00000001,
    VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT = 0x00000002,
    VK_PIPELINE_STAGE_VERTEX_INPUT_BIT = 0x00000004,
    VK_PIPELINE_STAGE_VERTEX_SHADER_BIT = 0x00000008,
    VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT = 0x00000010,
    VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT = 0x00000020,
    VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT = 0x00000040,
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT = 0x00000080,
    VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT = 0x00000100,
    VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT = 0x00000200,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT = 0x00000400,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT = 0x00000800,
    VK_PIPELINE_STAGE_TRANSFER_BIT = 0x00001000,
    VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT = 0x00002000,
    VK_PIPELINE_STAGE_HOST_BIT = 0x00004000,
    VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT = 0x00008000,
    VK_PIPELINE_STAGE_ALL_COMMANDS_BIT = 0x00010000,
    VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT = 0x01000000,
    VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT = 0x00040000,
    VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX = 0x00020000,
    VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV = 0x00400000,
    VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV = 0x00200000,
    VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV = 0x02000000,
    VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV = 0x00080000,
    VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV = 0x00100000,
} VkPipelineStageFlagBits;
  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT specifies the stage of the pipeline where any commands are initially received by the queue.

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT specifies the stage of the pipeline where Draw/DispatchIndirect data structures are consumed. This stage also includes reading commands written by vkCmdProcessCommandsNVX.

  • VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV specifies the task shader stage.

  • VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV specifies the mesh shader stage.

  • VK_PIPELINE_STAGE_VERTEX_INPUT_BIT specifies the stage of the pipeline where vertex and index buffers are consumed.

  • VK_PIPELINE_STAGE_VERTEX_SHADER_BIT specifies the vertex shader stage.

  • VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT specifies the tessellation control shader stage.

  • VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT specifies the tessellation evaluation shader stage.

  • VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT specifies the geometry shader stage.

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT specifies the fragment shader stage.

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT specifies the stage of the pipeline where early fragment tests (depth and stencil tests before fragment shading) are performed. This stage also includes subpass load operations for framebuffer attachments with a depth/stencil format.

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT specifies the stage of the pipeline where late fragment tests (depth and stencil tests after fragment shading) are performed. This stage also includes subpass store operations for framebuffer attachments with a depth/stencil format.

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT specifies the stage of the pipeline after blending where the final color values are output from the pipeline. This stage also includes subpass load and store operations and multisample resolve operations for framebuffer attachments with a color format.

  • VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT specifies the execution of a compute shader.

  • VK_PIPELINE_STAGE_TRANSFER_BIT specifies the execution of copy commands. This includes the operations resulting from all copy commands, clear commands (with the exception of vkCmdClearAttachments), and vkCmdCopyQueryPoolResults.

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT specifies the final stage in the pipeline where operations generated by all commands complete execution.

  • VK_PIPELINE_STAGE_HOST_BIT specifies a pseudo-stage indicating execution on the host of reads/writes of device memory. This stage is not invoked by any commands recorded in a command buffer.

  • VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV specifies the execution of the ray tracing shader stages.

  • VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV specifies the execution of vkCmdBuildAccelerationStructureNV, vkCmdCopyAccelerationStructureNV, and vkCmdWriteAccelerationStructuresPropertiesNV.

  • VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT specifies the execution of all graphics pipeline stages, and is equivalent to the logical OR of:

    • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

    • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

    • VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

    • VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

    • VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

    • VK_PIPELINE_STAGE_VERTEX_SHADER_BIT

    • VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT

    • VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

    • VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

    • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

    • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

    • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

    • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

    • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

    • VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT

    • VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT

    • VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV

  • VK_PIPELINE_STAGE_ALL_COMMANDS_BIT is equivalent to the logical OR of every other pipeline stage flag that is supported on the queue it is used with.

  • VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT specifies the stage of the pipeline where the predicate of conditional rendering is consumed.

  • VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT specifies the stage of the pipeline where vertex attribute output values are written to the transform feedback buffers.

  • VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX specifies the stage of the pipeline where device-side generation of commands via vkCmdProcessCommandsNVX is handled.

  • VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV specifies the stage of the pipeline where the shading rate image is read to determine the shading rate for portions of a rasterized primitive.

Note

An execution dependency with only VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT in the destination stage mask will only prevent that stage from executing in subsequently submitted commands. As this stage does not perform any actual execution, this is not observable - in effect, it does not delay processing of subsequent commands. Similarly an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT in the source stage mask will effectively not wait for any prior commands to complete.

When defining a memory dependency, using only VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT or VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT would never make any accesses available and/or visible because these stages do not access memory.

VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT and VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT are useful for accomplishing layout transitions and queue ownership operations when the required execution dependency is satisfied by other means - for example, semaphore operations between queues.

typedef VkFlags VkPipelineStageFlags;

VkPipelineStageFlags is a bitmask type for setting a mask of zero or more VkPipelineStageFlagBits.

If a synchronization command includes a source stage mask, its first synchronization scope only includes execution of the pipeline stages specified in that mask, and its first access scope only includes memory access performed by pipeline stages specified in that mask. If a synchronization command includes a destination stage mask, its second synchronization scope only includes execution of the pipeline stages specified in that mask, and its second access scope only includes memory access performed by pipeline stages specified in that mask.

Note

Including a particular pipeline stage in the first synchronization scope of a command implicitly includes logically earlier pipeline stages in the synchronization scope. Similarly, the second synchronization scope includes logically later pipeline stages.

However, note that access scopes are not affected in this way - only the precise stages specified are considered part of each access scope.

Certain pipeline stages are only available on queues that support a particular set of operations. The following table lists, for each pipeline stage flag, which queue capability flag must be supported by the queue. When multiple flags are enumerated in the second column of the table, it means that the pipeline stage is supported on the queue if it supports any of the listed capability flags. For further details on queue capabilities see Physical Device Enumeration and Queues.

Table 3. Supported pipeline stage flags
Pipeline stage flag Required queue capability flag

VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

None required

VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT

VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_VERTEX_SHADER_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT

VK_QUEUE_COMPUTE_BIT

VK_PIPELINE_STAGE_TRANSFER_BIT

VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT or VK_QUEUE_TRANSFER_BIT

VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

None required

VK_PIPELINE_STAGE_HOST_BIT

None required

VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_ALL_COMMANDS_BIT

None required

VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT

VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT

VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX

VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT

VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

VK_QUEUE_GRAPHICS_BIT

VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV

VK_QUEUE_COMPUTE_BIT

VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV

VK_QUEUE_COMPUTE_BIT

Pipeline stages that execute as a result of a command logically complete execution in a specific order, such that completion of a logically later pipeline stage must not happen-before completion of a logically earlier stage. This means that including any stage in the source stage mask for a particular synchronization command also implies that any logically earlier stages are included in AS for that command.

Similarly, initiation of a logically earlier pipeline stage must not happen-after initiation of a logically later pipeline stage. Including any given stage in the destination stage mask for a particular synchronization command also implies that any logically later stages are included in BS for that command.

Note

Implementations may not support synchronization at every pipeline stage for every synchronization operation. If a pipeline stage that an implementation does not support synchronization for appears in a source stage mask, it may substitute any logically later stage in its place for the first synchronization scope. If a pipeline stage that an implementation does not support synchronization for appears in a destination stage mask, it may substitute any logically earlier stage in its place for the second synchronization scope.

For example, if an implementation is unable to signal an event immediately after vertex shader execution is complete, it may instead signal the event after color attachment output has completed.

If an implementation makes such a substitution, it must not affect the semantics of execution or memory dependencies or image and buffer memory barriers.

The order and set of pipeline stages executed by a given command is determined by the command’s pipeline type, as described below:

For the graphics primitive shading pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

  • VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

  • VK_PIPELINE_STAGE_VERTEX_SHADER_BIT

  • VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT

  • VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT

  • VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For the graphics mesh shading pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

  • VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

  • VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For the compute pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

  • VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

The conditional rendering stage is formally part of both the graphics, and the compute pipeline. The pipeline stage where the predicate read happens has unspecified order relative to other stages of these pipelines:

  • VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT

For the transfer pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_TRANSFER_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For host operations, only one pipeline stage occurs, so no order is guaranteed:

  • VK_PIPELINE_STAGE_HOST_BIT

For the command processing pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For the ray tracing shader pipeline, only one pipeline stage occurs, so no order is guaranteed:

  • VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV

For ray tracing acceleration structure operations, only one pipeline stage occurs, so no order is guaranteed:

  • VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV

6.1.3. Access Types

Memory in Vulkan can be accessed from within shader invocations and via some fixed-function stages of the pipeline. The access type is a function of the descriptor type used, or how a fixed-function stage accesses memory. Each access type corresponds to a bit flag in VkAccessFlagBits.

Some synchronization commands take sets of access types as parameters to define the access scopes of a memory dependency. If a synchronization command includes a source access mask, its first access scope only includes accesses via the access types specified in that mask. Similarly, if a synchronization command includes a destination access mask, its second access scope only includes accesses via the access types specified in that mask.

Access types that can be set in an access mask include:

typedef enum VkAccessFlagBits {
    VK_ACCESS_INDIRECT_COMMAND_READ_BIT = 0x00000001,
    VK_ACCESS_INDEX_READ_BIT = 0x00000002,
    VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT = 0x00000004,
    VK_ACCESS_UNIFORM_READ_BIT = 0x00000008,
    VK_ACCESS_INPUT_ATTACHMENT_READ_BIT = 0x00000010,
    VK_ACCESS_SHADER_READ_BIT = 0x00000020,
    VK_ACCESS_SHADER_WRITE_BIT = 0x00000040,
    VK_ACCESS_COLOR_ATTACHMENT_READ_BIT = 0x00000080,
    VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100,
    VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT = 0x00000200,
    VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT = 0x00000400,
    VK_ACCESS_TRANSFER_READ_BIT = 0x00000800,
    VK_ACCESS_TRANSFER_WRITE_BIT = 0x00001000,
    VK_ACCESS_HOST_READ_BIT = 0x00002000,
    VK_ACCESS_HOST_WRITE_BIT = 0x00004000,
    VK_ACCESS_MEMORY_READ_BIT = 0x00008000,
    VK_ACCESS_MEMORY_WRITE_BIT = 0x00010000,
    VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT = 0x02000000,
    VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT = 0x04000000,
    VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT = 0x08000000,
    VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT = 0x00100000,
    VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX = 0x00020000,
    VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX = 0x00040000,
    VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT = 0x00080000,
    VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV = 0x00800000,
    VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_NV = 0x00200000,
    VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_NV = 0x00400000,
} VkAccessFlagBits;
  • VK_ACCESS_INDIRECT_COMMAND_READ_BIT specifies read access to indirect command data read as part of an indirect drawing or dispatch command.

  • VK_ACCESS_INDEX_READ_BIT specifies read access to an index buffer as part of an indexed drawing command, bound by vkCmdBindIndexBuffer.

  • VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT specifies read access to a vertex buffer as part of a drawing command, bound by vkCmdBindVertexBuffers.

  • VK_ACCESS_UNIFORM_READ_BIT specifies read access to a uniform buffer.

  • VK_ACCESS_INPUT_ATTACHMENT_READ_BIT specifies read access to an input attachment within a render pass during fragment shading.

  • VK_ACCESS_SHADER_READ_BIT specifies read access to a storage buffer, uniform texel buffer, storage texel buffer, sampled image, or storage image.

  • VK_ACCESS_SHADER_WRITE_BIT specifies write access to a storage buffer, storage texel buffer, or storage image.

  • VK_ACCESS_COLOR_ATTACHMENT_READ_BIT specifies read access to a color attachment, such as via blending, logic operations, or via certain subpass load operations. It does not include advanced blend operations.

  • VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT specifies write access to a color or resolve attachment during a render pass or via certain subpass load and store operations.

  • VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT specifies read access to a depth/stencil attachment, via depth or stencil operations or via certain subpass load operations.

  • VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT specifies write access to a depth/stencil attachment, via depth or stencil operations or via certain subpass load and store operations.

  • VK_ACCESS_TRANSFER_READ_BIT specifies read access to an image or buffer in a copy operation.

  • VK_ACCESS_TRANSFER_WRITE_BIT specifies write access to an image or buffer in a clear or copy operation.

  • VK_ACCESS_HOST_READ_BIT specifies read access by a host operation. Accesses of this type are not performed through a resource, but directly on memory.

  • VK_ACCESS_HOST_WRITE_BIT specifies write access by a host operation. Accesses of this type are not performed through a resource, but directly on memory.

  • VK_ACCESS_MEMORY_READ_BIT specifies read access via non-specific entities. These entities include the Vulkan device and host, but may also include entities external to the Vulkan device or otherwise not part of the core Vulkan pipeline. When included in a destination access mask, makes all available writes visible to all future read accesses on entities known to the Vulkan device.

  • VK_ACCESS_MEMORY_WRITE_BIT specifies write access via non-specific entities. These entities include the Vulkan device and host, but may also include entities external to the Vulkan device or otherwise not part of the core Vulkan pipeline. When included in a source access mask, all writes that are performed by entities known to the Vulkan device are made available. When included in a destination access mask, makes all available writes visible to all future write accesses on entities known to the Vulkan device.

  • VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT specifies read access to a predicate as part of conditional rendering.

  • VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT specifies write access to a transform feedback buffer made when transform feedback is active.

  • VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT specifies read access to a transform feedback counter buffer which is read when vkCmdBeginTransformFeedbackEXT executes.

  • VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT specifies write access to a transform feedback counter buffer which is written when vkCmdEndTransformFeedbackEXT executes.

  • VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX specifies reads from VkBuffer inputs to vkCmdProcessCommandsNVX.

  • VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX specifies writes to the target command buffer in vkCmdProcessCommandsNVX.

  • VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT is similar to VK_ACCESS_COLOR_ATTACHMENT_READ_BIT, but also includes advanced blend operations.

  • VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV specifies read access to a shading rate image as part of a drawing command, as bound by vkCmdBindShadingRateImageNV.

  • VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_NV specifies read access to an acceleration structure as part of a trace or build command.

  • VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_NV specifies write access to an acceleration structure as part of a build command.

Certain access types are only performed by a subset of pipeline stages. Any synchronization command that takes both stage masks and access masks uses both to define the access scopes - only the specified access types performed by the specified stages are included in the access scope. An application must not specify an access flag in a synchronization command if it does not include a pipeline stage in the corresponding stage mask that is able to perform accesses of that type. The following table lists, for each access flag, which pipeline stages can perform that type of access.

Table 4. Supported access types
Access flag Supported pipeline stages

VK_ACCESS_INDIRECT_COMMAND_READ_BIT

VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

VK_ACCESS_INDEX_READ_BIT

VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT

VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

VK_ACCESS_UNIFORM_READ_BIT

VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV, VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV, VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT

VK_ACCESS_SHADER_READ_BIT

VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV, VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV, VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT

VK_ACCESS_SHADER_WRITE_BIT

VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV, VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV, VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT

VK_ACCESS_INPUT_ATTACHMENT_READ_BIT

VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

VK_ACCESS_COLOR_ATTACHMENT_READ_BIT

VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT

VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT

VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, or VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT

VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, or VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

VK_ACCESS_TRANSFER_READ_BIT

VK_PIPELINE_STAGE_TRANSFER_BIT

VK_ACCESS_TRANSFER_WRITE_BIT

VK_PIPELINE_STAGE_TRANSFER_BIT

VK_ACCESS_HOST_READ_BIT

VK_PIPELINE_STAGE_HOST_BIT

VK_ACCESS_HOST_WRITE_BIT

VK_PIPELINE_STAGE_HOST_BIT

VK_ACCESS_MEMORY_READ_BIT

N/A

VK_ACCESS_MEMORY_WRITE_BIT

N/A

VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT

VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX

VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX

VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX

VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX

VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT

VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT

VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV

VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV

VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT

VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT

VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT

VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT

VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT

VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_NV

VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV, or VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV

VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_NV

VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV

If a memory object does not have the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT property, then vkFlushMappedMemoryRanges must be called in order to guarantee that writes to the memory object from the host are made available to the host domain, where they can be further made available to the device domain via a domain operation. Similarly, vkInvalidateMappedMemoryRanges must be called to guarantee that writes which are available to the host domain are made visible to host operations.

If the memory object does have the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT property flag, writes to the memory object from the host are automatically made available to the host domain. Similarly, writes made available to the host domain are automatically made visible to the host.

Note

The vkQueueSubmit command automatically performs a domain operation from host to device for all writes performed before the command executes, so in most cases an explicit memory barrier is not needed for this case. In the few circumstances where a submit does not occur between the host write and the device read access, writes can be made available by using an explicit memory barrier.

typedef VkFlags VkAccessFlags;

VkAccessFlags is a bitmask type for setting a mask of zero or more VkAccessFlagBits.

6.1.4. Framebuffer Region Dependencies

Pipeline stages that operate on, or with respect to, the framebuffer are collectively the framebuffer-space pipeline stages. These stages are:

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

For these pipeline stages, an execution or memory dependency from the first set of operations to the second set can either be a single framebuffer-global dependency, or split into multiple framebuffer-local dependencies. A dependency with non-framebuffer-space pipeline stages is neither framebuffer-global nor framebuffer-local.

A framebuffer region is a set of sample (x, y, layer, sample) coordinates that is a subset of the entire framebuffer.

Both synchronization scopes of a framebuffer-local dependency include only the operations performed within corresponding framebuffer regions (as defined below). No ordering guarantees are made between different framebuffer regions for a framebuffer-local dependency.

Both synchronization scopes of a framebuffer-global dependency include operations on all framebuffer-regions.

If the first synchronization scope includes operations on pixels/fragments with N samples and the second synchronization scope includes operations on pixels/fragments with M samples, where N does not equal M, then a framebuffer region containing all samples at a given (x, y, layer) coordinate in the first synchronization scope corresponds to a region containing all samples at the same coordinate in the second synchronization scope. In other words, it is a pixel granularity dependency. If N equals M, then a framebuffer region containing a single (x, y, layer, sample) coordinate in the first synchronization scope corresponds to a region containing the same sample at the same coordinate in the second synchronization scope. In other words, it is a sample granularity dependency.

Note

Since fragment invocations are not specified to run in any particular groupings, the size of a framebuffer region is implementation-dependent, not known to the application, and must be assumed to be no larger than specified above.

Note

Practically, the pixel vs sample granularity dependency means that if an input attachment has a different number of samples than the pipeline’s rasterizationSamples, then a fragment can access any sample in the input attachment’s pixel even if it only uses framebuffer-local dependencies. If the input attachment has the same number of samples, then the fragment can only access the covered samples in its input SampleMask (i.e. the fragment operations happen-after a framebuffer-local dependency for each sample the fragment covers). To access samples that are not covered, a framebuffer-global dependency is required.

If a synchronization command includes a dependencyFlags parameter, and specifies the VK_DEPENDENCY_BY_REGION_BIT flag, then it defines framebuffer-local dependencies for the framebuffer-space pipeline stages in that synchronization command, for all framebuffer regions. If no dependencyFlags parameter is included, or the VK_DEPENDENCY_BY_REGION_BIT flag is not specified, then a framebuffer-global dependency is specified for those stages. The VK_DEPENDENCY_BY_REGION_BIT flag does not affect the dependencies between non-framebuffer-space pipeline stages, nor does it affect the dependencies between framebuffer-space and non-framebuffer-space pipeline stages.

Note

Framebuffer-local dependencies are more optimal for most architectures; particularly tile-based architectures - which can keep framebuffer-regions entirely in on-chip registers and thus avoid external bandwidth across such a dependency. Including a framebuffer-global dependency in your rendering will usually force all implementations to flush data to memory, or to a higher level cache, breaking any potential locality optimizations.

6.1.5. View-Local Dependencies

In a render pass instance that has multiview enabled, dependencies can be either view-local or view-global.

A view-local dependency only includes operations from a single source view from the source subpass in the first synchronization scope, and only includes operations from a single destination view from the destination subpass in the second synchronization scope. A view-global dependency includes all views in the view mask of the source and destination subpasses in the corresponding synchronization scopes.

If a synchronization command includes a dependencyFlags parameter and specifies the VK_DEPENDENCY_VIEW_LOCAL_BIT flag, then it defines view-local dependencies for that synchronization command, for all views. If no dependencyFlags parameter is included or the VK_DEPENDENCY_VIEW_LOCAL_BIT flag is not specified, then a view-global dependency is specified.

6.1.6. Device-Local Dependencies

Dependencies can be either device-local or non-device-local. A device-local dependency acts as multiple separate dependencies, one for each physical device that executes the synchronization command, where each dependency only includes operations from that physical device in both synchronization scopes. A non-device-local dependency is a single dependency where both synchronization scopes include operations from all physical devices that participate in the synchronization command. For subpass dependencies, all physical devices in the VkDeviceGroupRenderPassBeginInfo::deviceMask participate in the dependency, and for pipeline barriers all physical devices that are set in the command buffer’s current device mask participate in the dependency.

If a synchronization command includes a dependencyFlags parameter and specifies the VK_DEPENDENCY_DEVICE_GROUP_BIT flag, then it defines a non-device-local dependency for that synchronization command. If no dependencyFlags parameter is included or the VK_DEPENDENCY_DEVICE_GROUP_BIT flag is not specified, then it defines device-local dependencies for that synchronization command, for all participating physical devices.

Semaphore and event dependencies are device-local and only execute on the one physical device that performs the dependency.

6.2. Implicit Synchronization Guarantees

A small number of implicit ordering guarantees are provided by Vulkan, ensuring that the order in which commands are submitted is meaningful, and avoiding unnecessary complexity in common operations.

Submission order is a fundamental ordering in Vulkan, giving meaning to the order in which action and synchronization commands are recorded and submitted to a single queue. Explicit and implicit ordering guarantees between commands in Vulkan all work on the premise that this ordering is meaningful. This order does not itself define any execution or memory dependencies; synchronization commands and other orderings within the API use this ordering to define their scopes.

Submission order for any given set of commands is based on the order in which they were recorded to command buffers and then submitted. This order is determined as follows:

  1. The initial order is determined by the order in which vkQueueSubmit commands are executed on the host, for a single queue, from first to last.

  2. The order in which VkSubmitInfo structures are specified in the pSubmits parameter of vkQueueSubmit, from lowest index to highest.

  3. The order in which command buffers are specified in the pCommandBuffers member of VkSubmitInfo, from lowest index to highest.

  4. The order in which commands were recorded to a command buffer on the host, from first to last:

    • For commands recorded outside a render pass, this includes all other commands recorded outside a render pass, including vkCmdBeginRenderPass and vkCmdEndRenderPass commands; it does not directly include commands inside a render pass.

    • For commands recorded inside a render pass, this includes all other commands recorded inside the same subpass, including the vkCmdBeginRenderPass and vkCmdEndRenderPass commands that delimit the same render pass instance; it does not include commands recorded to other subpasses.

Action and synchronization commands recorded to a command buffer execute the VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT pipeline stage in submission order - forming an implicit execution dependency between this stage in each command.

State commands do not execute any operations on the device, instead they set the state of the command buffer when they execute on the host, in the order that they are recorded. Action commands consume the current state of the command buffer when they are recorded, and will execute state changes on the device as required to match the recorded state.

Execution of pipeline stages within a given command also has a loose ordering, dependent only on a single command.

6.3. Fences

Fences are a synchronization primitive that can be used to insert a dependency from a queue to the host. Fences have two states - signaled and unsignaled. A fence can be signaled as part of the execution of a queue submission command. Fences can be unsignaled on the host with vkResetFences. Fences can be waited on by the host with the vkWaitForFences command, and the current state can be queried with vkGetFenceStatus.

As with most objects in Vulkan, fences are an interface to internal data which is typically opaque to applications. This internal data is referred to as a fence’s payload.

However, in order to enable communication with agents outside of the current device, it is necessary to be able to export that payload to a commonly understood format, and subsequently import from that format as well.

The internal data of a fence may include a reference to any resources and pending work associated with signal or unsignal operations performed on that fence object. Mechanisms to import and export that internal data to and from fences are provided below. These mechanisms indirectly enable applications to share fence state between two or more fences and other synchronization primitives across process and API boundaries.

Fences are represented by VkFence handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFence)

To create a fence, call:

VkResult vkCreateFence(
    VkDevice                                    device,
    const VkFenceCreateInfo*                    pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkFence*                                    pFence);
  • device is the logical device that creates the fence.

  • pCreateInfo is a pointer to an instance of the VkFenceCreateInfo structure which contains information about how the fence is to be created.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pFence points to a handle in which the resulting fence object is returned.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a valid pointer to a valid VkFenceCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • pFence must be a valid pointer to a VkFence handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkFenceCreateInfo structure is defined as:

typedef struct VkFenceCreateInfo {
    VkStructureType       sType;
    const void*           pNext;
    VkFenceCreateFlags    flags;
} VkFenceCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is a bitmask of VkFenceCreateFlagBits specifying the initial state and behavior of the fence.

Valid Usage (Implicit)
typedef enum VkFenceCreateFlagBits {
    VK_FENCE_CREATE_SIGNALED_BIT = 0x00000001,
} VkFenceCreateFlagBits;
  • VK_FENCE_CREATE_SIGNALED_BIT specifies that the fence object is created in the signaled state. Otherwise, it is created in the unsignaled state.

typedef VkFlags VkFenceCreateFlags;

VkFenceCreateFlags is a bitmask type for setting a mask of zero or more VkFenceCreateFlagBits.

To create a fence whose payload can be exported to external handles, add the VkExportFenceCreateInfo structure to the pNext chain of the VkFenceCreateInfo structure. The VkExportFenceCreateInfo structure is defined as:

typedef struct VkExportFenceCreateInfo {
    VkStructureType                   sType;
    const void*                       pNext;
    VkExternalFenceHandleTypeFlags    handleTypes;
} VkExportFenceCreateInfo;

or the equivalent

typedef VkExportFenceCreateInfo VkExportFenceCreateInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • handleTypes is a bitmask of VkExternalFenceHandleTypeFlagBits specifying one or more fence handle types the application can export from the resulting fence. The application can request multiple handle types for the same fence.

Valid Usage
Valid Usage (Implicit)

To specify additional attributes of NT handles exported from a fence, add the VkExportFenceWin32HandleInfoKHR structure to the pNext chain of the VkFenceCreateInfo structure. The VkExportFenceWin32HandleInfoKHR structure is defined as:

typedef struct VkExportFenceWin32HandleInfoKHR {
    VkStructureType               sType;
    const void*                   pNext;
    const SECURITY_ATTRIBUTES*    pAttributes;
    DWORD                         dwAccess;
    LPCWSTR                       name;
} VkExportFenceWin32HandleInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • pAttributes is a pointer to a Windows SECURITY_ATTRIBUTES structure specifying security attributes of the handle.

  • dwAccess is a DWORD specifying access rights of the handle.

  • name is a NULL-terminated UTF-16 string to associate with the underlying synchronization primitive referenced by NT handles exported from the created fence.

If this structure is not present, or if pAttributes is set to NULL, default security descriptor values will be used, and child processes created by the application will not inherit the handle, as described in the MSDN documentation for “Synchronization Object Security and Access Rights”1. Further, if the structure is not present, the access rights will be

DXGI_SHARED_RESOURCE_READ | DXGI_SHARED_RESOURCE_WRITE

for handles of the following types:

VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT

Valid Usage
Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_EXPORT_FENCE_WIN32_HANDLE_INFO_KHR

  • If pAttributes is not NULL, pAttributes must be a valid pointer to a valid SECURITY_ATTRIBUTES value

To export a Windows handle representing the state of a fence, call:

VkResult vkGetFenceWin32HandleKHR(
    VkDevice                                    device,
    const VkFenceGetWin32HandleInfoKHR*         pGetWin32HandleInfo,
    HANDLE*                                     pHandle);
  • device is the logical device that created the fence being exported.

  • pGetWin32HandleInfo is a pointer to an instance of the VkFenceGetWin32HandleInfoKHR structure containing parameters of the export operation.

  • pHandle will return the Windows handle representing the fence state.

For handle types defined as NT handles, the handles returned by vkGetFenceWin32HandleKHR are owned by the application. To avoid leaking resources, the application must release ownership of them using the CloseHandle system call when they are no longer needed.

Exporting a Windows handle from a fence may have side effects depending on the transference of the specified handle type, as described in Importing Fence Payloads.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pGetWin32HandleInfo must be a valid pointer to a valid VkFenceGetWin32HandleInfoKHR structure

  • pHandle must be a valid pointer to a HANDLE value

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_TOO_MANY_OBJECTS

  • VK_ERROR_OUT_OF_HOST_MEMORY

The VkFenceGetWin32HandleInfoKHR structure is defined as:

typedef struct VkFenceGetWin32HandleInfoKHR {
    VkStructureType                      sType;
    const void*                          pNext;
    VkFence                              fence;
    VkExternalFenceHandleTypeFlagBits    handleType;
} VkFenceGetWin32HandleInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • fence is the fence from which state will be exported.

  • handleType is the type of handle requested.

The properties of the handle returned depend on the value of handleType. See VkExternalFenceHandleTypeFlagBits for a description of the properties of the defined external fence handle types.

Valid Usage
  • handleType must have been included in VkExportFenceCreateInfo::handleTypes when the fence’s current payload was created.

  • If handleType is defined as an NT handle, vkGetFenceWin32HandleKHR must be called no more than once for each valid unique combination of fence and handleType.

  • fence must not currently have its payload replaced by an imported payload as described below in Importing Fence Payloads unless that imported payload’s handle type was included in VkExternalFenceProperties::exportFromImportedHandleTypes for handleType.

  • If handleType refers to a handle type with copy payload transference semantics, fence must be signaled, or have an associated fence signal operation pending execution.

  • handleType must be defined as an NT handle or a global share handle.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_FENCE_GET_WIN32_HANDLE_INFO_KHR

  • pNext must be NULL

  • fence must be a valid VkFence handle

  • handleType must be a valid VkExternalFenceHandleTypeFlagBits value

To export a POSIX file descriptor representing the payload of a fence, call:

VkResult vkGetFenceFdKHR(
    VkDevice                                    device,
    const VkFenceGetFdInfoKHR*                  pGetFdInfo,
    int*                                        pFd);
  • device is the logical device that created the fence being exported.

  • pGetFdInfo is a pointer to an instance of the VkFenceGetFdInfoKHR structure containing parameters of the export operation.

  • pFd will return the file descriptor representing the fence payload.

Each call to vkGetFenceFdKHR must create a new file descriptor and transfer ownership of it to the application. To avoid leaking resources, the application must release ownership of the file descriptor when it is no longer needed.

Note

Ownership can be released in many ways. For example, the application can call close() on the file descriptor, or transfer ownership back to Vulkan by using the file descriptor to import a fence payload.

If pGetFdInfo->handleType is VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT and the fence is signaled at the time vkGetFenceFdKHR is called, pFd may return the value -1 instead of a valid file descriptor.

Where supported by the operating system, the implementation must set the file descriptor to be closed automatically when an execve system call is made.

Exporting a file descriptor from a fence may have side effects depending on the transference of the specified handle type, as described in Importing Fence State.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pGetFdInfo must be a valid pointer to a valid VkFenceGetFdInfoKHR structure

  • pFd must be a valid pointer to a int value

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_TOO_MANY_OBJECTS

  • VK_ERROR_OUT_OF_HOST_MEMORY

The VkFenceGetFdInfoKHR structure is defined as:

typedef struct VkFenceGetFdInfoKHR {
    VkStructureType                      sType;
    const void*                          pNext;
    VkFence                              fence;
    VkExternalFenceHandleTypeFlagBits    handleType;
} VkFenceGetFdInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • fence is the fence from which state will be exported.

  • handleType is the type of handle requested.

The properties of the file descriptor returned depend on the value of handleType. See VkExternalFenceHandleTypeFlagBits for a description of the properties of the defined external fence handle types.

Valid Usage
  • handleType must have been included in VkExportFenceCreateInfo::handleTypes when fence’s current payload was created.

  • If handleType refers to a handle type with copy payload transference semantics, fence must be signaled, or have an associated fence signal operation pending execution.

  • fence must not currently have its payload replaced by an imported payload as described below in Importing Fence Payloads unless that imported payload’s handle type was included in VkExternalFenceProperties::exportFromImportedHandleTypes for handleType.

  • handleType must be defined as a POSIX file descriptor handle.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_FENCE_GET_FD_INFO_KHR

  • pNext must be NULL

  • fence must be a valid VkFence handle

  • handleType must be a valid VkExternalFenceHandleTypeFlagBits value

To destroy a fence, call:

void vkDestroyFence(
    VkDevice                                    device,
    VkFence                                     fence,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the fence.

  • fence is the handle of the fence to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All queue submission commands that refer to fence must have completed execution

  • If VkAllocationCallbacks were provided when fence was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when fence was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If fence is not VK_NULL_HANDLE, fence must be a valid VkFence handle

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • If fence is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to fence must be externally synchronized

To query the status of a fence from the host, call:

VkResult vkGetFenceStatus(
    VkDevice                                    device,
    VkFence                                     fence);
  • device is the logical device that owns the fence.

  • fence is the handle of the fence to query.

Upon success, vkGetFenceStatus returns the status of the fence object, with the following return codes:

Table 5. Fence Object Status Codes
Status Meaning

VK_SUCCESS

The fence specified by fence is signaled.

VK_NOT_READY

The fence specified by fence is unsignaled.

VK_ERROR_DEVICE_LOST

The device has been lost. See Lost Device.

If a queue submission command is pending execution, then the value returned by this command may immediately be out of date.

If the device has been lost (see Lost Device), vkGetFenceStatus may return any of the above status codes. If the device has been lost and vkGetFenceStatus is called repeatedly, it will eventually return either VK_SUCCESS or VK_ERROR_DEVICE_LOST.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • fence must be a valid VkFence handle

  • fence must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

  • VK_NOT_READY

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

To set the state of fences to unsignaled from the host, call:

VkResult vkResetFences(
    VkDevice                                    device,
    uint32_t                                    fenceCount,
    const VkFence*                              pFences);
  • device is the logical device that owns the fences.

  • fenceCount is the number of fences to reset.

  • pFences is a pointer to an array of fence handles to reset.

If any member of pFences currently has its payload imported with temporary permanence, that fence’s prior permanent payload is first restored. The remaining operations described therefore operate on the restored payload.

When vkResetFences is executed on the host, it defines a fence unsignal operation for each fence, which resets the fence to the unsignaled state.

If any member of pFences is already in the unsignaled state when vkResetFences is executed, then vkResetFences has no effect on that fence.

Valid Usage
  • Each element of pFences must not be currently associated with any queue command that has not yet completed execution on that queue

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pFences must be a valid pointer to an array of fenceCount valid VkFence handles

  • fenceCount must be greater than 0

  • Each element of pFences must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to each member of pFences must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

When a fence is submitted to a queue as part of a queue submission command, it defines a memory dependency on the batches that were submitted as part of that command, and defines a fence signal operation which sets the fence to the signaled state.

The first synchronization scope includes every batch submitted in the same queue submission command. Fence signal operations that are defined by vkQueueSubmit additionally include in the first synchronization scope all commands that occur earlier in submission order.

The second synchronization scope only includes the fence signal operation.

The first access scope includes all memory access performed by the device.

The second access scope is empty.

To wait for one or more fences to enter the signaled state on the host, call:

VkResult vkWaitForFences(
    VkDevice                                    device,
    uint32_t                                    fenceCount,
    const VkFence*                              pFences,
    VkBool32                                    waitAll,
    uint64_t                                    timeout);
  • device is the logical device that owns the fences.

  • fenceCount is the number of fences to wait on.

  • pFences is a pointer to an array of fenceCount fence handles.

  • waitAll is the condition that must be satisfied to successfully unblock the wait. If waitAll is VK_TRUE, then the condition is that all fences in pFences are signaled. Otherwise, the condition is that at least one fence in pFences is signaled.

  • timeout is the timeout period in units of nanoseconds. timeout is adjusted to the closest value allowed by the implementation-dependent timeout accuracy, which may be substantially longer than one nanosecond, and may be longer than the requested period.

If the condition is satisfied when vkWaitForFences is called, then vkWaitForFences returns immediately. If the condition is not satisfied at the time vkWaitForFences is called, then vkWaitForFences will block and wait up to timeout nanoseconds for the condition to become satisfied.

If timeout is zero, then vkWaitForFences does not wait, but simply returns the current state of the fences. VK_TIMEOUT will be returned in this case if the condition is not satisfied, even though no actual wait was performed.

If the specified timeout period expires before the condition is satisfied, vkWaitForFences returns VK_TIMEOUT. If the condition is satisfied before timeout nanoseconds has expired, vkWaitForFences returns VK_SUCCESS.

If device loss occurs (see Lost Device) before the timeout has expired, vkWaitForFences must return in finite time with either VK_SUCCESS or VK_ERROR_DEVICE_LOST.

Note

While we guarantee that vkWaitForFences must return in finite time, no guarantees are made that it returns immediately upon device loss. However, the client can reasonably expect that the delay will be on the order of seconds and that calling vkWaitForFences will not result in a permanently (or seemingly permanently) dead process.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pFences must be a valid pointer to an array of fenceCount valid VkFence handles

  • fenceCount must be greater than 0

  • Each element of pFences must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

  • VK_TIMEOUT

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

An execution dependency is defined by waiting for a fence to become signaled, either via vkWaitForFences or by polling on vkGetFenceStatus.

The first synchronization scope includes only the fence signal operation.

The second synchronization scope includes the host operations of vkWaitForFences or vkGetFenceStatus indicating that the fence has become signaled.

Note

Signaling a fence and waiting on the host does not guarantee that the results of memory accesses will be visible to the host, as the access scope of a memory dependency defined by a fence only includes device access. A memory barrier or other memory dependency must be used to guarantee this. See the description of host access types for more information.

6.3.1. Alternate Methods to Signal Fences

Besides submitting a fence to a queue as part of a queue submission command, a fence may also be signaled when a particular event occurs on a device or display.

To create a fence that will be signaled when an event occurs on a device, call:

VkResult vkRegisterDeviceEventEXT(
    VkDevice                                    device,
    const VkDeviceEventInfoEXT*                 pDeviceEventInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkFence*                                    pFence);
  • device is a logical device on which the event may occur.

  • pDeviceEventInfo is a pointer to an instance of the VkDeviceEventInfoEXT structure describing the event of interest to the application.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pFence points to a handle in which the resulting fence object is returned.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pDeviceEventInfo must be a valid pointer to a valid VkDeviceEventInfoEXT structure

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • pFence must be a valid pointer to a VkFence handle

Return Codes
Success
  • VK_SUCCESS

The VkDeviceEventInfoEXT structure is defined as:

typedef struct VkDeviceEventInfoEXT {
    VkStructureType         sType;
    const void*             pNext;
    VkDeviceEventTypeEXT    deviceEvent;
} VkDeviceEventInfoEXT;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • device is a VkDeviceEventTypeEXT value specifying when the fence will be signaled.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_DEVICE_EVENT_INFO_EXT

  • pNext must be NULL

  • deviceEvent must be a valid VkDeviceEventTypeEXT value

Possible values of VkDeviceEventInfoEXT::device, specifying when a fence will be signaled, are:

typedef enum VkDeviceEventTypeEXT {
    VK_DEVICE_EVENT_TYPE_DISPLAY_HOTPLUG_EXT = 0,
} VkDeviceEventTypeEXT;
  • VK_DEVICE_EVENT_TYPE_DISPLAY_HOTPLUG_EXT specifies that the fence is signaled when a display is plugged into or unplugged from the specified device. Applications can use this notification to determine when they need to re-enumerate the available displays on a device.

To create a fence that will be signaled when an event occurs on a VkDisplayKHR object, call:

VkResult vkRegisterDisplayEventEXT(
    VkDevice                                    device,
    VkDisplayKHR                                display,
    const VkDisplayEventInfoEXT*                pDisplayEventInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkFence*                                    pFence);
  • device is a logical device associated with display

  • display is the display on which the event may occur.

  • pDisplayEventInfo is a pointer to an instance of the VkDisplayEventInfoEXT structure describing the event of interest to the application.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pFence points to a handle in which the resulting fence object is returned.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • display must be a valid VkDisplayKHR handle

  • pDisplayEventInfo must be a valid pointer to a valid VkDisplayEventInfoEXT structure

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • pFence must be a valid pointer to a VkFence handle

Return Codes
Success
  • VK_SUCCESS

The VkDisplayEventInfoEXT structure is defined as:

typedef struct VkDisplayEventInfoEXT {
    VkStructureType          sType;
    const void*              pNext;
    VkDisplayEventTypeEXT    displayEvent;
} VkDisplayEventInfoEXT;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • displayEvent is a VkDisplayEventTypeEXT specifying when the fence will be signaled.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_DISPLAY_EVENT_INFO_EXT

  • pNext must be NULL

  • displayEvent must be a valid VkDisplayEventTypeEXT value

Possible values of VkDisplayEventInfoEXT::displayEvent, specifying when a fence will be signaled, are:

typedef enum VkDisplayEventTypeEXT {
    VK_DISPLAY_EVENT_TYPE_FIRST_PIXEL_OUT_EXT = 0,
} VkDisplayEventTypeEXT;
  • VK_DISPLAY_EVENT_TYPE_FIRST_PIXEL_OUT_EXT specifies that the fence is signaled when the first pixel of the next display refresh cycle leaves the display engine for the display.

6.3.2. Importing Fence Payloads

Applications can import a fence payload into an existing fence using an external fence handle. The effects of the import operation will be either temporary or permanent, as specified by the application. If the import is temporary, the fence will be restored to its permanent state the next time that fence is passed to vkResetFences.

Note

Restoring a fence to its prior permanent payload is a distinct operation from resetting a fence payload. See vkResetFences for more detail.

Performing a subsequent temporary import on a fence before resetting it has no effect on this requirement; the next unsignal of the fence must still restore its last permanent state. A permanent payload import behaves as if the target fence was destroyed, and a new fence was created with the same handle but the imported payload. Because importing a fence payload temporarily or permanently detaches the existing payload from a fence, similar usage restrictions to those applied to vkDestroyFence are applied to any command that imports a fence payload. Which of these import types is used is referred to as the import operation’s permanence. Each handle type supports either one or both types of permanence.

The implementation must perform the import operation by either referencing or copying the payload referred to by the specified external fence handle, depending on the handle’s type. The import method used is referred to as the handle type’s transference. When using handle types with reference transference, importing a payload to a fence adds the fence to the set of all fences sharing that payload. This set includes the fence from which the payload was exported. Fence signaling, waiting, and resetting operations performed on any fence in the set must behave as if the set were a single fence. Importing a payload using handle types with copy transference creates a duplicate copy of the payload at the time of import, but makes no further reference to it. Fence signaling, waiting, and resetting operations performed on the target of copy imports must not affect any other fence or payload.

Export operations have the same transference as the specified handle type’s import operations. Additionally, exporting a fence payload to a handle with copy transference has the same side effects on the source fence’s payload as executing a fence reset operation. If the fence was using a temporarily imported payload, the fence’s prior permanent payload will be restored.

Note

The tables Handle Types Supported by VkImportFenceWin32HandleInfoKHR and Handle Types Supported by VkImportFenceFdInfoKHR define the permanence and transference of each handle type.

External synchronization allows implementations to modify an object’s internal state, i.e. payload, without internal synchronization. However, for fences sharing a payload across processes, satisfying the external synchronization requirements of VkFence parameters as if all fences in the set were the same object is sometimes infeasible. Satisfying valid usage constraints on the state of a fence would similarly require impractical coordination or levels of trust between processes. Therefore, these constraints only apply to a specific fence handle, not to its payload. For distinct fence objects which share a payload:

  • If multiple commands which queue a signal operation, or which unsignal a fence, are called concurrently, behavior will be as if the commands were called in an arbitrary sequential order.

  • If a queue submission command is called with a fence that is sharing a payload, and the payload is already associated with another queue command that has not yet completed execution, either one or both of the commands will cause the fence to become signaled when they complete execution.

  • If a fence payload is reset while it is associated with a queue command that has not yet completed execution, the payload will become unsignaled, but may become signaled again when the command completes execution.

  • In the preceding cases, any of the devices associated with the fences sharing the payload may be lost, or any of the queue submission or fence reset commands may return VK_ERROR_INITIALIZATION_FAILED.

Other than these non-deterministic results, behavior is well defined. In particular:

  • The implementation must not crash or enter an internally inconsistent state where future valid Vulkan commands might cause undefined results,

  • Timeouts on future wait commands on fences sharing the payload must be effective.

Note

These rules allow processes to synchronize access to shared memory without trusting each other. However, such processes must still be cautious not to use the shared fence for more than synchronizing access to the shared memory. For example, a process should not use a fence with shared payload to tell when commands it submitted to a queue have completed and objects used by those commands may be destroyed, since the other process can accidentally or maliciously cause the fence to signal before the commands actually complete.

When a fence is using an imported payload, its VkExportFenceCreateInfo::handleTypes value is that specified when creating the fence from which the payload was exported, rather than that specified when creating the fence. Additionally, VkExternalFenceProperties::exportFromImportedHandleTypes restricts which handle types can be exported from such a fence based on the specific handle type used to import the current payload. Passing a fence to vkAcquireNextImageKHR is equivalent to temporarily importing a fence payload to that fence.

Note

Because the exportable handle types of an imported fence correspond to its current imported payload, and vkAcquireNextImageKHR behaves the same as a temporary import operation for which the source fence is opaque to the application, applications have no way of determining whether any external handle types can be exported from a fence in this state. Therefore, applications must not attempt to export handles from fences using a temporarily imported payload from vkAcquireNextImageKHR.

When importing a fence payload, it is the responsibility of the application to ensure the external handles meet all valid usage requirements. However, implementations must perform sufficient validation of external handles to ensure that the operation results in a valid fence which will not cause program termination, device loss, queue stalls, host thread stalls, or corruption of other resources when used as allowed according to its import parameters. If the external handle provided does not meet these requirements, the implementation must fail the fence payload import operation with the error code VK_ERROR_INVALID_EXTERNAL_HANDLE.

To import a fence payload from a Windows handle, call:

VkResult vkImportFenceWin32HandleKHR(
    VkDevice                                    device,
    const VkImportFenceWin32HandleInfoKHR*      pImportFenceWin32HandleInfo);
  • device is the logical device that created the fence.

  • pImportFenceWin32HandleInfo points to a VkImportFenceWin32HandleInfoKHR structure specifying the fence and import parameters.

Importing a fence payload from Windows handles does not transfer ownership of the handle to the Vulkan implementation. For handle types defined as NT handles, the application must release ownership using the CloseHandle system call when the handle is no longer needed.

Applications can import the same fence payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pImportFenceWin32HandleInfo must be a valid pointer to a valid VkImportFenceWin32HandleInfoKHR structure

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_INVALID_EXTERNAL_HANDLE

The VkImportFenceWin32HandleInfoKHR structure is defined as:

typedef struct VkImportFenceWin32HandleInfoKHR {
    VkStructureType                      sType;
    const void*                          pNext;
    VkFence                              fence;
    VkFenceImportFlags                   flags;
    VkExternalFenceHandleTypeFlagBits    handleType;
    HANDLE                               handle;
    LPCWSTR                              name;
} VkImportFenceWin32HandleInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • fence is the fence into which the state will be imported.

  • flags is a bitmask of VkFenceImportFlagBits specifying additional parameters for the fence payload import operation.

  • handleType specifies the type of handle.

  • handle is the external handle to import, or NULL.

  • name is the NULL-terminated UTF-16 string naming the underlying synchronization primitive to import, or NULL.

The handle types supported by handleType are:

Table 6. Handle Types Supported by VkImportFenceWin32HandleInfoKHR
Handle Type Transference Permanence Supported

VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT

Reference

Temporary,Permanent

VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT

Reference

Temporary,Permanent

Valid Usage
  • handleType must be a value included in the Handle Types Supported by VkImportFenceWin32HandleInfoKHR table.

  • If handleType is not VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT, name must be NULL.

  • If handleType is not 0 and handle is NULL, name must name a valid synchronization primitive of the type specified by handleType.

  • If handleType is not 0 and name is NULL, handle must be a valid handle of the type specified by handleType.

  • If handle is not NULL, name must be NULL.

  • If handle is not NULL, it must obey any requirements listed for handleType in external fence handle types compatibility.

  • If name is not NULL, it must obey any requirements listed for handleType in external fence handle types compatibility.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_IMPORT_FENCE_WIN32_HANDLE_INFO_KHR

  • pNext must be NULL

  • fence must be a valid VkFence handle

  • flags must be a valid combination of VkFenceImportFlagBits values

  • If handleType is not 0, handleType must be a valid VkExternalFenceHandleTypeFlagBits value

Host Synchronization
  • Host access to fence must be externally synchronized

To import a fence payload from a POSIX file descriptor, call:

VkResult vkImportFenceFdKHR(
    VkDevice                                    device,
    const VkImportFenceFdInfoKHR*               pImportFenceFdInfo);
  • device is the logical device that created the fence.

  • pImportFenceFdInfo points to a VkImportFenceFdInfoKHR structure specifying the fence and import parameters.

Importing a fence payload from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.

Applications can import the same fence payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.

Valid Usage
  • fence must not be associated with any queue command that has not yet completed execution on that queue

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pImportFenceFdInfo must be a valid pointer to a valid VkImportFenceFdInfoKHR structure

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_INVALID_EXTERNAL_HANDLE

The VkImportFenceFdInfoKHR structure is defined as:

typedef struct VkImportFenceFdInfoKHR {
    VkStructureType                      sType;
    const void*                          pNext;
    VkFence                              fence;
    VkFenceImportFlags                   flags;
    VkExternalFenceHandleTypeFlagBits    handleType;
    int                                  fd;
} VkImportFenceFdInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • fence is the fence into which the payload will be imported.

  • flags is a bitmask of VkFenceImportFlagBits specifying additional parameters for the fence payload import operation.

  • handleType specifies the type of fd.

  • fd is the external handle to import.

The handle types supported by handleType are:

Table 7. Handle Types Supported by VkImportFenceFdInfoKHR
Handle Type Transference Permanence Supported

VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT

Reference

Temporary,Permanent

VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT

Copy

Temporary

Valid Usage

If handleType is VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT, the special value -1 for fd is treated like a valid sync file descriptor referring to an object that has already signaled. The import operation will succeed and the VkFence will have a temporarily imported payload as if a valid file descriptor had been provided.

Note

This special behavior for importing an invalid sync file descriptor allows easier interoperability with other system APIs which use the convention that an invalid sync file descriptor represents work that has already completed and does not need to be waited for. It is consistent with the option for implementations to return a -1 file descriptor when exporting a VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT from a VkFence which is signaled.

Valid Usage (Implicit)
Host Synchronization
  • Host access to fence must be externally synchronized

Bits which can be set in VkImportFenceWin32HandleInfoKHR::flags and VkImportFenceFdInfoKHR::flags specifying additional parameters of a fence import operation are:

typedef enum VkFenceImportFlagBits {
    VK_FENCE_IMPORT_TEMPORARY_BIT = 0x00000001,
    VK_FENCE_IMPORT_TEMPORARY_BIT_KHR = VK_FENCE_IMPORT_TEMPORARY_BIT,
} VkFenceImportFlagBits;

or the equivalent

typedef VkFenceImportFlagBits VkFenceImportFlagBitsKHR;
  • VK_FENCE_IMPORT_TEMPORARY_BIT specifies that the fence payload will be imported only temporarily, as described in Importing Fence Payloads, regardless of the permanence of handleType.

typedef VkFlags VkFenceImportFlags;

or the equivalent

typedef VkFenceImportFlags VkFenceImportFlagsKHR;

VkFenceImportFlags is a bitmask type for setting a mask of zero or more VkFenceImportFlagBits.

6.4. Semaphores

Semaphores are a synchronization primitive that can be used to insert a dependency between batches submitted to queues. Semaphores have two states - signaled and unsignaled. The state of a semaphore can be signaled after execution of a batch of commands is completed. A batch can wait for a semaphore to become signaled before it begins execution, and the semaphore is also unsignaled before the batch begins execution.

As with most objects in Vulkan, semaphores are an interface to internal data which is typically opaque to applications. This internal data is referred to as a semaphore’s payload.

However, in order to enable communication with agents outside of the current device, it is necessary to be able to export that payload to a commonly understood format, and subsequently import from that format as well.

The internal data of a semaphore may include a reference to any resources and pending work associated with signal or unsignal operations performed on that semaphore object. Mechanisms to import and export that internal data to and from semaphores are provided below. These mechanisms indirectly enable applications to share semaphore state between two or more semaphores and other synchronization primitives across process and API boundaries.

Semaphores are represented by VkSemaphore handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSemaphore)

To create a semaphore, call:

VkResult vkCreateSemaphore(
    VkDevice                                    device,
    const VkSemaphoreCreateInfo*                pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkSemaphore*                                pSemaphore);
  • device is the logical device that creates the semaphore.

  • pCreateInfo is a pointer to an instance of the VkSemaphoreCreateInfo structure which contains information about how the semaphore is to be created.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pSemaphore points to a handle in which the resulting semaphore object is returned.

When created, the semaphore is in the unsignaled state.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a valid pointer to a valid VkSemaphoreCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • pSemaphore must be a valid pointer to a VkSemaphore handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkSemaphoreCreateInfo structure is defined as:

typedef struct VkSemaphoreCreateInfo {
    VkStructureType           sType;
    const void*               pNext;
    VkSemaphoreCreateFlags    flags;
} VkSemaphoreCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO

  • Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkExportSemaphoreCreateInfo or VkExportSemaphoreWin32HandleInfoKHR

  • Each sType member in the pNext chain must be unique

  • flags must be 0

typedef VkFlags VkSemaphoreCreateFlags;

VkSemaphoreCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.

To create a semaphore whose payload can be exported to external handles, add the VkExportSemaphoreCreateInfo structure to the pNext chain of the VkSemaphoreCreateInfo structure. The VkExportSemaphoreCreateInfo structure is defined as:

typedef struct VkExportSemaphoreCreateInfo {
    VkStructureType                       sType;
    const void*                           pNext;
    VkExternalSemaphoreHandleTypeFlags    handleTypes;
} VkExportSemaphoreCreateInfo;

or the equivalent

typedef VkExportSemaphoreCreateInfo VkExportSemaphoreCreateInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • handleTypes is a bitmask of VkExternalSemaphoreHandleTypeFlagBits specifying one or more semaphore handle types the application can export from the resulting semaphore. The application can request multiple handle types for the same semaphore.

Valid Usage
Valid Usage (Implicit)

To specify additional attributes of NT handles exported from a semaphore, add the VkExportSemaphoreWin32HandleInfoKHR structure to the pNext chain of the VkSemaphoreCreateInfo structure. The VkExportSemaphoreWin32HandleInfoKHR structure is defined as:

typedef struct VkExportSemaphoreWin32HandleInfoKHR {
    VkStructureType               sType;
    const void*                   pNext;
    const SECURITY_ATTRIBUTES*    pAttributes;
    DWORD                         dwAccess;
    LPCWSTR                       name;
} VkExportSemaphoreWin32HandleInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • pAttributes is a pointer to a Windows SECURITY_ATTRIBUTES structure specifying security attributes of the handle.

  • dwAccess is a DWORD specifying access rights of the handle.

  • name is a NULL-terminated UTF-16 string to associate with the underlying synchronization primitive referenced by NT handles exported from the created semaphore.

If this structure is not present, or if pAttributes is set to NULL, default security descriptor values will be used, and child processes created by the application will not inherit the handle, as described in the MSDN documentation for “Synchronization Object Security and Access Rights”1. Further, if the structure is not present, the access rights will be

DXGI_SHARED_RESOURCE_READ | DXGI_SHARED_RESOURCE_WRITE

for handles of the following types:

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT

And

GENERIC_ALL

for handles of the following types:

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT

Valid Usage
  • If VkExportSemaphoreCreateInfo::handleTypes does not include VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT or VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT, VkExportSemaphoreWin32HandleInfoKHR must not be in the pNext chain of VkSemaphoreCreateInfo.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_WIN32_HANDLE_INFO_KHR

  • If pAttributes is not NULL, pAttributes must be a valid pointer to a valid SECURITY_ATTRIBUTES value

To export a Windows handle representing the payload of a semaphore, call:

VkResult vkGetSemaphoreWin32HandleKHR(
    VkDevice                                    device,
    const VkSemaphoreGetWin32HandleInfoKHR*     pGetWin32HandleInfo,
    HANDLE*                                     pHandle);
  • device is the logical device that created the semaphore being exported.

  • pGetWin32HandleInfo is a pointer to an instance of the VkSemaphoreGetWin32HandleInfoKHR structure containing parameters of the export operation.

  • pHandle will return the Windows handle representing the semaphore state.

For handle types defined as NT handles, the handles returned by vkGetSemaphoreWin32HandleKHR are owned by the application. To avoid leaking resources, the application must release ownership of them using the CloseHandle system call when they are no longer needed.

Exporting a Windows handle from a semaphore may have side effects depending on the transference of the specified handle type, as described in Importing Semaphore Payloads.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pGetWin32HandleInfo must be a valid pointer to a valid VkSemaphoreGetWin32HandleInfoKHR structure

  • pHandle must be a valid pointer to a HANDLE value

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_TOO_MANY_OBJECTS

  • VK_ERROR_OUT_OF_HOST_MEMORY

The VkSemaphoreGetWin32HandleInfoKHR structure is defined as:

typedef struct VkSemaphoreGetWin32HandleInfoKHR {
    VkStructureType                          sType;
    const void*                              pNext;
    VkSemaphore                              semaphore;
    VkExternalSemaphoreHandleTypeFlagBits    handleType;
} VkSemaphoreGetWin32HandleInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • semaphore is the semaphore from which state will be exported.

  • handleType is the type of handle requested.

The properties of the handle returned depend on the value of handleType. See VkExternalSemaphoreHandleTypeFlagBits for a description of the properties of the defined external semaphore handle types.

Valid Usage
  • handleType must have been included in VkExportSemaphoreCreateInfo::handleTypes when the semaphore’s current payload was created.

  • If handleType is defined as an NT handle, vkGetSemaphoreWin32HandleKHR must be called no more than once for each valid unique combination of semaphore and handleType.

  • semaphore must not currently have its payload replaced by an imported payload as described below in Importing Semaphore Payloads unless that imported payload’s handle type was included in VkExternalSemaphoreProperties::exportFromImportedHandleTypes for handleType.

  • If handleType refers to a handle type with copy payload transference semantics, as defined below in Importing Semaphore Payloads, there must be no queue waiting on semaphore.

  • If handleType refers to a handle type with copy payload transference semantics, semaphore must be signaled, or have an associated semaphore signal operation pending execution.

  • handleType must be defined as an NT handle or a global share handle.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_SEMAPHORE_GET_WIN32_HANDLE_INFO_KHR

  • pNext must be NULL

  • semaphore must be a valid VkSemaphore handle

  • handleType must be a valid VkExternalSemaphoreHandleTypeFlagBits value

To export a POSIX file descriptor representing the payload of a semaphore, call:

VkResult vkGetSemaphoreFdKHR(
    VkDevice                                    device,
    const VkSemaphoreGetFdInfoKHR*              pGetFdInfo,
    int*                                        pFd);
  • device is the logical device that created the semaphore being exported.

  • pGetFdInfo is a pointer to an instance of the VkSemaphoreGetFdInfoKHR structure containing parameters of the export operation.

  • pFd will return the file descriptor representing the semaphore payload.

Each call to vkGetSemaphoreFdKHR must create a new file descriptor and transfer ownership of it to the application. To avoid leaking resources, the application must release ownership of the file descriptor when it is no longer needed.

Note

Ownership can be released in many ways. For example, the application can call close() on the file descriptor, or transfer ownership back to Vulkan by using the file descriptor to import a semaphore payload.

Where supported by the operating system, the implementation must set the file descriptor to be closed automatically when an execve system call is made.

Exporting a file descriptor from a semaphore may have side effects depending on the transference of the specified handle type, as described in Importing Semaphore State.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pGetFdInfo must be a valid pointer to a valid VkSemaphoreGetFdInfoKHR structure

  • pFd must be a valid pointer to a int value

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_TOO_MANY_OBJECTS

  • VK_ERROR_OUT_OF_HOST_MEMORY

The VkSemaphoreGetFdInfoKHR structure is defined as:

typedef struct VkSemaphoreGetFdInfoKHR {
    VkStructureType                          sType;
    const void*                              pNext;
    VkSemaphore                              semaphore;
    VkExternalSemaphoreHandleTypeFlagBits    handleType;
} VkSemaphoreGetFdInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • semaphore is the semaphore from which state will be exported.

  • handleType is the type of handle requested.

The properties of the file descriptor returned depend on the value of handleType. See VkExternalSemaphoreHandleTypeFlagBits for a description of the properties of the defined external semaphore handle types.

Valid Usage
  • handleType must have been included in VkExportSemaphoreCreateInfo::handleTypes when semaphore’s current payload was created.

  • semaphore must not currently have its payload replaced by an imported payload as described below in Importing Semaphore Payloads unless that imported payload’s handle type was included in VkExternalSemaphoreProperties::exportFromImportedHandleTypes for handleType.

  • If handleType refers to a handle type with copy payload transference semantics, as defined below in Importing Semaphore Payloads, there must be no queue waiting on semaphore.

  • If handleType refers to a handle type with copy payload transference semantics, semaphore must be signaled, or have an associated semaphore signal operation pending execution.

  • handleType must be defined as a POSIX file descriptor handle.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR

  • pNext must be NULL

  • semaphore must be a valid VkSemaphore handle

  • handleType must be a valid VkExternalSemaphoreHandleTypeFlagBits value

To destroy a semaphore, call:

void vkDestroySemaphore(
    VkDevice                                    device,
    VkSemaphore                                 semaphore,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the semaphore.

  • semaphore is the handle of the semaphore to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted batches that refer to semaphore must have completed execution

  • If VkAllocationCallbacks were provided when semaphore was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when semaphore was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If semaphore is not VK_NULL_HANDLE, semaphore must be a valid VkSemaphore handle

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • If semaphore is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to semaphore must be externally synchronized

6.4.1. Semaphore Signaling

When a batch is submitted to a queue via a queue submission, and it includes semaphores to be signaled, it defines a memory dependency on the batch, and defines semaphore signal operations which set the semaphores to the signaled state.

The first synchronization scope includes every command submitted in the same batch. Semaphore signal operations that are defined by vkQueueSubmit additionally include all commands that occur earlier in submission order.

The second synchronization scope includes only the semaphore signal operation.

The first access scope includes all memory access performed by the device.

The second access scope is empty.

6.4.2. Semaphore Waiting & Unsignaling

When a batch is submitted to a queue via a queue submission, and it includes semaphores to be waited on, it defines a memory dependency between prior semaphore signal operations and the batch, and defines semaphore unsignal operations which set the semaphores to the unsignaled state.

The first synchronization scope includes all semaphore signal operations that operate on semaphores waited on in the same batch, and that happen-before the wait completes.

The second synchronization scope includes every command submitted in the same batch. In the case of vkQueueSubmit, the second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by the corresponding element of pWaitDstStageMask. Also, in the case of vkQueueSubmit, the second synchronization scope additionally includes all commands that occur later in submission order.

The first access scope is empty.

The second access scope includes all memory access performed by the device.

The semaphore unsignal operation happens-after the first set of operations in the execution dependency, and happens-before the second set of operations in the execution dependency.

Note

Unlike fences or events, the act of waiting for a semaphore also unsignals that semaphore. If two operations are separately specified to wait for the same semaphore, and there are no other execution dependencies between those operations, behaviour is undefined. An execution dependency must be present that guarantees that the semaphore unsignal operation for the first of those waits, happens-before the semaphore is signalled again, and before the second unsignal operation. Semaphore waits and signals should thus occur in discrete 1:1 pairs.

Note

A common scenario for using pWaitDstStageMask with values other than VK_PIPELINE_STAGE_ALL_COMMANDS_BIT is when synchronizing a window system presentation operation against subsequent command buffers which render the next frame. In this case, a presentation image must not be overwritten until the presentation operation completes, but other pipeline stages can execute without waiting. A mask of VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT prevents subsequent color attachment writes from executing until the semaphore signals. Some implementations may be able to execute transfer operations and/or vertex processing work before the semaphore is signaled.

If an image layout transition needs to be performed on a presentable image before it is used in a framebuffer, that can be performed as the first operation submitted to the queue after acquiring the image, and should not prevent other work from overlapping with the presentation operation. For example, a VkImageMemoryBarrier could use:

  • srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

  • srcAccessMask = 0

  • dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

  • dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

  • oldLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR

  • newLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL

Alternatively, oldLayout can be VK_IMAGE_LAYOUT_UNDEFINED, if the image’s contents need not be preserved.

This barrier accomplishes a dependency chain between previous presentation operations and subsequent color attachment output operations, with the layout transition performed in between, and does not introduce a dependency between previous work and any vertex processing stages. More precisely, the semaphore signals after the presentation operation completes, the semaphore wait stalls the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT stage, and there is a dependency from that same stage to itself with the layout transition performed in between.

6.4.3. Semaphore State Requirements For Wait Operations

Before waiting on a semaphore, the application must ensure the semaphore is in a valid state for a wait operation. Specifically, when a semaphore wait and unsignal operation is submitted to a queue:

  • The semaphore must be signaled, or have an associated semaphore signal operation that is pending execution.

  • There must be no other queue waiting on the same semaphore when the operation executes.

6.4.4. Importing Semaphore Payloads

Applications can import a semaphore payload into an existing semaphore using an external semaphore handle. The effects of the import operation will be either temporary or permanent, as specified by the application. If the import is temporary, the implementation must restore the semaphore to its prior permanent state after submitting the next semaphore wait operation. Performing a subsequent temporary import on a semaphore before performing a semaphore wait has no effect on this requirement; the next wait submitted on the semaphore must still restore its last permanent state. A permanent payload import behaves as if the target semaphore was destroyed, and a new semaphore was created with the same handle but the imported payload. Because importing a semaphore payload temporarily or permanently detaches the existing payload from a semaphore, similar usage restrictions to those applied to vkDestroySemaphore are applied to any command that imports a semaphore payload. Which of these import types is used is referred to as the import operation’s permanence. Each handle type supports either one or both types of permanence.

The implementation must perform the import operation by either referencing or copying the payload referred to by the specified external semaphore handle, depending on the handle’s type. The import method used is referred to as the handle type’s transference. When using handle types with reference transference, importing a payload to a semaphore adds the semaphore to the set of all semaphores sharing that payload. This set includes the semaphore from which the payload was exported. Semaphore signaling and waiting operations performed on any semaphore in the set must behave as if the set were a single semaphore. Importing a payload using handle types with copy transference creates a duplicate copy of the payload at the time of import, but makes no further reference to it. Semaphore signaling and waiting operations performed on the target of copy imports must not affect any other semaphore or payload.

Export operations have the same transference as the specified handle type’s import operations. Additionally, exporting a semaphore payload to a handle with copy transference has the same side effects on the source semaphore’s payload as executing a semaphore wait operation. If the semaphore was using a temporarily imported payload, the semaphore’s prior permanent payload will be restored.

Note

The tables Handle Types Supported by VkImportSemaphoreWin32HandleInfoKHR and Handle Types Supported by VkImportSemaphoreFdInfoKHR define the permanence and transference of each handle type.

External synchronization allows implementations to modify an object’s internal state, i.e. payload, without internal synchronization. However, for semaphores sharing a payload across processes, satisfying the external synchronization requirements of VkSemaphore parameters as if all semaphores in the set were the same object is sometimes infeasible. Satisfying the wait operation state requirements would similarly require impractical coordination or levels of trust between processes. Therefore, these constraints only apply to a specific semaphore handle, not to its payload. For distinct semaphore objects which share a payload, if the semaphores are passed to separate queue submission commands concurrently, behavior will be as if the commands were called in an arbitrary sequential order. If the wait operation state requirements are violated for the shared payload by a queue submission command, or if a signal operation is queued for a shared payload that is already signaled or has a pending signal operation, effects must be limited to one or more of the following:

  • Returning VK_ERROR_INITIALIZATION_FAILED from the command which resulted in the violation.

  • Losing the logical device on which the violation occurred immediately or at a future time, resulting in a VK_ERROR_DEVICE_LOST error from subsequent commands, including the one causing the violation.

  • Continuing execution of the violating command or operation as if the semaphore wait completed successfully after an implementation-dependent timeout. In this case, the state of the payload becomes undefined, and future operations on semaphores sharing the payload will be subject to these same rules. The semaphore must be destroyed or have its payload replaced by an import operation to again have a well-defined state.

Note

These rules allow processes to synchronize access to shared memory without trusting each other. However, such processes must still be cautious not to use the shared semaphore for more than synchronizing access to the shared memory. For example, a process should not use a shared semaphore as part of an execution dependency chain that, when complete, leads to objects being destroyed, if it does not trust other processes sharing the semaphore payload.

When a semaphore is using an imported payload, its VkExportSemaphoreCreateInfo::handleTypes value is that specified when creating the semaphore from which the payload was exported, rather than that specified when creating the semaphore. Additionally, VkExternalSemaphoreProperties::exportFromImportedHandleTypes restricts which handle types can be exported from such a semaphore based on the specific handle type used to import the current payload. Passing a semaphore to vkAcquireNextImageKHR is equivalent to temporarily importing a semaphore payload to that semaphore.

Note

Because the exportable handle types of an imported semaphore correspond to its current imported payload, and vkAcquireNextImageKHR behaves the same as a temporary import operation for which the source semaphore is opaque to the application, applications have no way of determining whether any external handle types can be exported from a semaphore in this state. Therefore, applications must not attempt to export external handles from semaphores using a temporarily imported payload from vkAcquireNextImageKHR.

When importing a semaphore payload, it is the responsibility of the application to ensure the external handles meet all valid usage requirements. However, implementations must perform sufficient validation of external handles to ensure that the operation results in a valid semaphore which will not cause program termination, device loss, queue stalls, or corruption of other resources when used as allowed according to its import parameters, and excepting those side effects allowed for violations of the valid semaphore state for wait operations rules. If the external handle provided does not meet these requirements, the implementation must fail the semaphore payload import operation with the error code VK_ERROR_INVALID_EXTERNAL_HANDLE.

To import a semaphore payload from a Windows handle, call:

VkResult vkImportSemaphoreWin32HandleKHR(
    VkDevice                                    device,
    const VkImportSemaphoreWin32HandleInfoKHR*  pImportSemaphoreWin32HandleInfo);
  • device is the logical device that created the semaphore.

  • pImportSemaphoreWin32HandleInfo points to a VkImportSemaphoreWin32HandleInfoKHR structure specifying the semaphore and import parameters.

Importing a semaphore payload from Windows handles does not transfer ownership of the handle to the Vulkan implementation. For handle types defined as NT handles, the application must release ownership using the CloseHandle system call when the handle is no longer needed.

Applications can import the same semaphore payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pImportSemaphoreWin32HandleInfo must be a valid pointer to a valid VkImportSemaphoreWin32HandleInfoKHR structure

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_INVALID_EXTERNAL_HANDLE

The VkImportSemaphoreWin32HandleInfoKHR structure is defined as:

typedef struct VkImportSemaphoreWin32HandleInfoKHR {
    VkStructureType                          sType;
    const void*                              pNext;
    VkSemaphore                              semaphore;
    VkSemaphoreImportFlags                   flags;
    VkExternalSemaphoreHandleTypeFlagBits    handleType;
    HANDLE                                   handle;
    LPCWSTR                                  name;
} VkImportSemaphoreWin32HandleInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • semaphore is the semaphore into which the payload will be imported.

  • flags is a bitmask of VkSemaphoreImportFlagBits specifying additional parameters for the semaphore payload import operation.

  • handleType specifies the type of handle.

  • handle is the external handle to import, or NULL.

  • name is a NULL-terminated UTF-16 string naming the underlying synchronization primitive to import, or NULL.

The handle types supported by handleType are:

Table 8. Handle Types Supported by VkImportSemaphoreWin32HandleInfoKHR
Handle Type Transference Permanence Supported

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT

Reference

Temporary,Permanent

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT

Reference

Temporary,Permanent

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT

Reference

Temporary,Permanent

Valid Usage
  • handleType must be a value included in the Handle Types Supported by VkImportSemaphoreWin32HandleInfoKHR table.

  • If handleType is not VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT or VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT, name must be NULL.

  • If handleType is not 0 and handle is NULL, name must name a valid synchronization primitive of the type specified by handleType.

  • If handleType is not 0 and name is NULL, handle must be a valid handle of the type specified by handleType.

  • If handle is not NULL, name must be NULL.

  • If handle is not NULL, it must obey any requirements listed for handleType in external semaphore handle types compatibility.

  • If name is not NULL, it must obey any requirements listed for handleType in external semaphore handle types compatibility.

Valid Usage (Implicit)
Host Synchronization
  • Host access to semaphore must be externally synchronized

To import a semaphore payload from a POSIX file descriptor, call:

VkResult vkImportSemaphoreFdKHR(
    VkDevice                                    device,
    const VkImportSemaphoreFdInfoKHR*           pImportSemaphoreFdInfo);
  • device is the logical device that created the semaphore.

  • pImportSemaphoreFdInfo points to a VkImportSemaphoreFdInfoKHR structure specifying the semaphore and import parameters.

Importing a semaphore payload from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.

Applications can import the same semaphore payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.

Valid Usage
  • semaphore must not be associated with any queue command that has not yet completed execution on that queue

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pImportSemaphoreFdInfo must be a valid pointer to a valid VkImportSemaphoreFdInfoKHR structure

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_INVALID_EXTERNAL_HANDLE

The VkImportSemaphoreFdInfoKHR structure is defined as:

typedef struct VkImportSemaphoreFdInfoKHR {
    VkStructureType                          sType;
    const void*                              pNext;
    VkSemaphore                              semaphore;
    VkSemaphoreImportFlags                   flags;
    VkExternalSemaphoreHandleTypeFlagBits    handleType;
    int                                      fd;
} VkImportSemaphoreFdInfoKHR;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • semaphore is the semaphore into which the payload will be imported.

  • flags is a bitmask of VkSemaphoreImportFlagBits specifying additional parameters for the semaphore payload import operation.

  • handleType specifies the type of fd.

  • fd is the external handle to import.

The handle types supported by handleType are:

Table 9. Handle Types Supported by VkImportSemaphoreFdInfoKHR
Handle Type Transference Permanence Supported

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT

Reference

Temporary,Permanent

VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT

Copy

Temporary

Valid Usage
Valid Usage (Implicit)
Host Synchronization
  • Host access to semaphore must be externally synchronized

Additional parameters of a semaphore import operation are specified by VkImportSemaphoreWin32HandleInfoKHR::flags or VkImportSemaphoreFdInfoKHR::flags . Bits which can be set include:

typedef enum VkSemaphoreImportFlagBits {
    VK_SEMAPHORE_IMPORT_TEMPORARY_BIT = 0x00000001,
    VK_SEMAPHORE_IMPORT_TEMPORARY_BIT_KHR = VK_SEMAPHORE_IMPORT_TEMPORARY_BIT,
} VkSemaphoreImportFlagBits;

or the equivalent

typedef VkSemaphoreImportFlagBits VkSemaphoreImportFlagBitsKHR;

These bits have the following meanings:

  • VK_SEMAPHORE_IMPORT_TEMPORARY_BIT specifies that the semaphore payload will be imported only temporarily, as described in Importing Semaphore Payloads, regardless of the permanence of handleType.

typedef VkFlags VkSemaphoreImportFlags;

or the equivalent

typedef VkSemaphoreImportFlags VkSemaphoreImportFlagsKHR;

VkSemaphoreImportFlags is a bitmask type for setting a mask of zero or more VkSemaphoreImportFlagBits.

6.5. Events

Events are a synchronization primitive that can be used to insert a fine-grained dependency between commands submitted to the same queue, or between the host and a queue. Events must not be used to insert a dependency between commands submitted to different queues. Events have two states - signaled and unsignaled. An application can signal an event, or unsignal it, on either the host or the device. A device can wait for an event to become signaled before executing further operations. No command exists to wait for an event to become signaled on the host, but the current state of an event can be queried.

Events are represented by VkEvent handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkEvent)

To create an event, call:

VkResult vkCreateEvent(
    VkDevice                                    device,
    const VkEventCreateInfo*                    pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkEvent*                                    pEvent);
  • device is the logical device that creates the event.

  • pCreateInfo is a pointer to an instance of the VkEventCreateInfo structure which contains information about how the event is to be created.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pEvent points to a handle in which the resulting event object is returned.

When created, the event object is in the unsignaled state.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a valid pointer to a valid VkEventCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • pEvent must be a valid pointer to a VkEvent handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkEventCreateInfo structure is defined as:

typedef struct VkEventCreateInfo {
    VkStructureType       sType;
    const void*           pNext;
    VkEventCreateFlags    flags;
} VkEventCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_EVENT_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

typedef VkFlags VkEventCreateFlags;

VkEventCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.

To destroy an event, call:

void vkDestroyEvent(
    VkDevice                                    device,
    VkEvent                                     event,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the event.

  • event is the handle of the event to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted commands that refer to event must have completed execution

  • If VkAllocationCallbacks were provided when event was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when event was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If event is not VK_NULL_HANDLE, event must be a valid VkEvent handle

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • If event is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to event must be externally synchronized

To query the state of an event from the host, call:

VkResult vkGetEventStatus(
    VkDevice                                    device,
    VkEvent                                     event);
  • device is the logical device that owns the event.

  • event is the handle of the event to query.

Upon success, vkGetEventStatus returns the state of the event object with the following return codes:

Table 10. Event Object Status Codes
Status Meaning

VK_EVENT_SET

The event specified by event is signaled.

VK_EVENT_RESET

The event specified by event is unsignaled.

If a vkCmdSetEvent or vkCmdResetEvent command is in a command buffer that is in the pending state, then the value returned by this command may immediately be out of date.

The state of an event can be updated by the host. The state of the event is immediately changed, and subsequent calls to vkGetEventStatus will return the new state. If an event is already in the requested state, then updating it to the same state has no effect.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • event must be a valid VkEvent handle

  • event must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_EVENT_SET

  • VK_EVENT_RESET

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

To set the state of an event to signaled from the host, call:

VkResult vkSetEvent(
    VkDevice                                    device,
    VkEvent                                     event);
  • device is the logical device that owns the event.

  • event is the event to set.

When vkSetEvent is executed on the host, it defines an event signal operation which sets the event to the signaled state.

If event is already in the signaled state when vkSetEvent is executed, then vkSetEvent has no effect, and no event signal operation occurs.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • event must be a valid VkEvent handle

  • event must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to event must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

To set the state of an event to unsignaled from the host, call:

VkResult vkResetEvent(
    VkDevice                                    device,
    VkEvent                                     event);
  • device is the logical device that owns the event.

  • event is the event to reset.

When vkResetEvent is executed on the host, it defines an event unsignal operation which resets the event to the unsignaled state.

If event is already in the unsignaled state when vkResetEvent is executed, then vkResetEvent has no effect, and no event unsignal operation occurs.

Valid Usage
  • event must not be waited on by a vkCmdWaitEvents command that is currently executing

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • event must be a valid VkEvent handle

  • event must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to event must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The state of an event can also be updated on the device by commands inserted in command buffers.

To set the state of an event to signaled from a device, call:

void vkCmdSetEvent(
    VkCommandBuffer                             commandBuffer,
    VkEvent                                     event,
    VkPipelineStageFlags                        stageMask);
  • commandBuffer is the command buffer into which the command is recorded.

  • event is the event that will be signaled.

  • stageMask specifies the source stage mask used to determine when the event is signaled.

When vkCmdSetEvent is submitted to a queue, it defines an execution dependency on commands that were submitted before it, and defines an event signal operation which sets the event to the signaled state.

The first synchronization scope includes all commands that occur earlier in submission order. The synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by stageMask.

The second synchronization scope includes only the event signal operation.

If event is already in the signaled state when vkCmdSetEvent is executed on the device, then vkCmdSetEvent has no effect, no event signal operation occurs, and no execution dependency is generated.

Valid Usage
  • stageMask must not include VK_PIPELINE_STAGE_HOST_BIT

  • If the geometry shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • commandBuffer’s current device mask must include exactly one physical device.

  • If the mesh shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • If the task shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • event must be a valid VkEvent handle

  • stageMask must be a valid combination of VkPipelineStageFlagBits values

  • stageMask must not be 0

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

  • Both of commandBuffer, and event must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Graphics
Compute

To set the state of an event to unsignaled from a device, call:

void vkCmdResetEvent(
    VkCommandBuffer                             commandBuffer,
    VkEvent                                     event,
    VkPipelineStageFlags                        stageMask);
  • commandBuffer is the command buffer into which the command is recorded.

  • event is the event that will be unsignaled.

  • stageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask used to determine when the event is unsignaled.

When vkCmdResetEvent is submitted to a queue, it defines an execution dependency on commands that were submitted before it, and defines an event unsignal operation which resets the event to the unsignaled state.

The first synchronization scope includes all commands that occur earlier in submission order. The synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by stageMask.

The second synchronization scope includes only the event unsignal operation.

If event is already in the unsignaled state when vkCmdResetEvent is executed on the device, then vkCmdResetEvent has no effect, no event unsignal operation occurs, and no execution dependency is generated.

Valid Usage
  • stageMask must not include VK_PIPELINE_STAGE_HOST_BIT

  • If the geometry shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • When this command executes, event must not be waited on by a vkCmdWaitEvents command that is currently executing

  • commandBuffer’s current device mask must include exactly one physical device.

  • If the mesh shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • If the task shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • event must be a valid VkEvent handle

  • stageMask must be a valid combination of VkPipelineStageFlagBits values

  • stageMask must not be 0

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

  • Both of commandBuffer, and event must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Graphics
Compute

To wait for one or more events to enter the signaled state on a device, call:

void vkCmdWaitEvents(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    eventCount,
    const VkEvent*                              pEvents,
    VkPipelineStageFlags                        srcStageMask,
    VkPipelineStageFlags                        dstStageMask,
    uint32_t                                    memoryBarrierCount,
    const VkMemoryBarrier*                      pMemoryBarriers,
    uint32_t                                    bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier*                pBufferMemoryBarriers,
    uint32_t                                    imageMemoryBarrierCount,
    const VkImageMemoryBarrier*                 pImageMemoryBarriers);
  • commandBuffer is the command buffer into which the command is recorded.

  • eventCount is the length of the pEvents array.

  • pEvents is an array of event object handles to wait on.

  • srcStageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask.

  • dstStageMask is a bitmask of VkPipelineStageFlagBits specifying the destination stage mask.

  • memoryBarrierCount is the length of the pMemoryBarriers array.

  • pMemoryBarriers is a pointer to an array of VkMemoryBarrier structures.

  • bufferMemoryBarrierCount is the length of the pBufferMemoryBarriers array.

  • pBufferMemoryBarriers is a pointer to an array of VkBufferMemoryBarrier structures.

  • imageMemoryBarrierCount is the length of the pImageMemoryBarriers array.

  • pImageMemoryBarriers is a pointer to an array of VkImageMemoryBarrier structures.

When vkCmdWaitEvents is submitted to a queue, it defines a memory dependency between prior event signal operations on the same queue or the host, and subsequent commands. vkCmdWaitEvents must not be used to wait on event signal operations occurring on other queues.

The first synchronization scope only includes event signal operations that operate on members of pEvents, and the operations that happened-before the event signal operations. Event signal operations performed by vkCmdSetEvent that occur earlier in submission order are included in the first synchronization scope, if the logically latest pipeline stage in their stageMask parameter is logically earlier than or equal to the logically latest pipeline stage in srcStageMask. Event signal operations performed by vkSetEvent are only included in the first synchronization scope if VK_PIPELINE_STAGE_HOST_BIT is included in srcStageMask.

The second synchronization scope includes all commands that occur later in submission order. The second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by dstStageMask.

The first access scope is limited to access in the pipeline stages determined by the source stage mask specified by srcStageMask. Within that, the first access scope only includes the first access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the first access scope includes no accesses.

The second access scope is limited to access in the pipeline stages determined by the destination stage mask specified by dstStageMask. Within that, the second access scope only includes the second access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the second access scope includes no accesses.

Note

vkCmdWaitEvents is used with vkCmdSetEvent to define a memory dependency between two sets of action commands, roughly in the same way as pipeline barriers, but split into two commands such that work between the two may execute unhindered.

Note

Applications should be careful to avoid race conditions when using events. There is no direct ordering guarantee between a vkCmdResetEvent command and a vkCmdWaitEvents command submitted after it, so some other execution dependency must be included between these commands (e.g. a semaphore).

Valid Usage
  • srcStageMask must be the bitwise OR of the stageMask parameter used in previous calls to vkCmdSetEvent with any of the members of pEvents and VK_PIPELINE_STAGE_HOST_BIT if any of the members of pEvents was set using vkSetEvent

  • If the geometry shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the geometry shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If the tessellation shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If pEvents includes one or more events that will be signaled by vkSetEvent after commandBuffer has been submitted to a queue, then vkCmdWaitEvents must not be called inside a render pass instance

  • Any pipeline stage included in srcStageMask or dstStageMask must be supported by the capabilities of the queue family specified by the queueFamilyIndex member of the VkCommandPoolCreateInfo structure that was used to create the VkCommandPool that commandBuffer was allocated from, as specified in the table of supported pipeline stages.

  • Each element of pMemoryBarriers, pBufferMemoryBarriers or pImageMemoryBarriers must not have any access flag included in its srcAccessMask member if that bit is not supported by any of the pipeline stages in srcStageMask, as specified in the table of supported access types.

  • Each element of pMemoryBarriers, pBufferMemoryBarriers or pImageMemoryBarriers must not have any access flag included in its dstAccessMask member if that bit is not supported by any of the pipeline stages in dstStageMask, as specified in the table of supported access types.

  • commandBuffer’s current device mask must include exactly one physical device.

  • If the mesh shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • If the task shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

  • If the mesh shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • If the task shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pEvents must be a valid pointer to an array of eventCount valid VkEvent handles

  • srcStageMask must be a valid combination of VkPipelineStageFlagBits values

  • srcStageMask must not be 0

  • dstStageMask must be a valid combination of VkPipelineStageFlagBits values

  • dstStageMask must not be 0

  • If memoryBarrierCount is not 0, pMemoryBarriers must be a valid pointer to an array of memoryBarrierCount valid VkMemoryBarrier structures

  • If bufferMemoryBarrierCount is not 0, pBufferMemoryBarriers must be a valid pointer to an array of bufferMemoryBarrierCount valid VkBufferMemoryBarrier structures

  • If imageMemoryBarrierCount is not 0, pImageMemoryBarriers must be a valid pointer to an array of imageMemoryBarrierCount valid VkImageMemoryBarrier structures

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • eventCount must be greater than 0

  • Both of commandBuffer, and the elements of pEvents must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Graphics
Compute

6.6. Pipeline Barriers

vkCmdPipelineBarrier is a synchronization command that inserts a dependency between commands submitted to the same queue, or between commands in the same subpass.

To record a pipeline barrier, call:

void vkCmdPipelineBarrier(
    VkCommandBuffer                             commandBuffer,
    VkPipelineStageFlags                        srcStageMask,
    VkPipelineStageFlags                        dstStageMask,
    VkDependencyFlags                           dependencyFlags,
    uint32_t                                    memoryBarrierCount,
    const VkMemoryBarrier*                      pMemoryBarriers,
    uint32_t                                    bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier*                pBufferMemoryBarriers,
    uint32_t                                    imageMemoryBarrierCount,
    const VkImageMemoryBarrier*                 pImageMemoryBarriers);
  • commandBuffer is the command buffer into which the command is recorded.

  • srcStageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask.

  • dstStageMask is a bitmask of VkPipelineStageFlagBits specifying the destination stage mask.

  • dependencyFlags is a bitmask of VkDependencyFlagBits specifying how execution and memory dependencies are formed.

  • memoryBarrierCount is the length of the pMemoryBarriers array.

  • pMemoryBarriers is a pointer to an array of VkMemoryBarrier structures.

  • bufferMemoryBarrierCount is the length of the pBufferMemoryBarriers array.

  • pBufferMemoryBarriers is a pointer to an array of VkBufferMemoryBarrier structures.

  • imageMemoryBarrierCount is the length of the pImageMemoryBarriers array.

  • pImageMemoryBarriers is a pointer to an array of VkImageMemoryBarrier structures.

When vkCmdPipelineBarrier is submitted to a queue, it defines a memory dependency between commands that were submitted before it, and those submitted after it.

If vkCmdPipelineBarrier was recorded outside a render pass instance, the first synchronization scope includes all commands that occur earlier in submission order. If vkCmdPipelineBarrier was recorded inside a render pass instance, the first synchronization scope includes only commands that occur earlier in submission order within the same subpass. In either case, the first synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by srcStageMask.

If vkCmdPipelineBarrier was recorded outside a render pass instance, the second synchronization scope includes all commands that occur later in submission order. If vkCmdPipelineBarrier was recorded inside a render pass instance, the second synchronization scope includes only commands that occur later in submission order within the same subpass. In either case, the second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by dstStageMask.

The first access scope is limited to access in the pipeline stages determined by the source stage mask specified by srcStageMask. Within that, the first access scope only includes the first access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the first access scope includes no accesses.

The second access scope is limited to access in the pipeline stages determined by the destination stage mask specified by dstStageMask. Within that, the second access scope only includes the second access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the second access scope includes no accesses.

If dependencyFlags includes VK_DEPENDENCY_BY_REGION_BIT, then any dependency between framebuffer-space pipeline stages is framebuffer-local - otherwise it is framebuffer-global.

Valid Usage
  • If the geometry shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the geometry shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If the tessellation shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If vkCmdPipelineBarrier is called within a render pass instance, the render pass must have been created with at least one VkSubpassDependency instance in VkRenderPassCreateInfo::pDependencies that expresses a dependency from the current subpass to itself, and for which srcStageMask contains a subset of the bit values in VkSubpassDependency::srcStageMask, dstStageMask contains a subset of the bit values in VkSubpassDependency::dstStageMask, dependencyFlags is equal to VkSubpassDependency::dependencyFlags, srcAccessMask member of each element of pMemoryBarriers and pImageMemoryBarriers contains a subset of the bit values in VkSubpassDependency::srcAccessMask, and dstAccessMask member of each element of pMemoryBarriers and pImageMemoryBarriers contains a subset of the bit values in VkSubpassDependency::dstAccessMask

  • If vkCmdPipelineBarrier is called within a render pass instance, bufferMemoryBarrierCount must be 0

  • If vkCmdPipelineBarrier is called within a render pass instance, the image member of any element of pImageMemoryBarriers must be equal to one of the elements of pAttachments that the current framebuffer was created with, that is also referred to by one of the elements of the pColorAttachments, pResolveAttachments or pDepthStencilAttachment members of the VkSubpassDescription instance that the current subpass was created with

  • If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of any element of pImageMemoryBarriers must be equal to the layout member of an element of the pColorAttachments, pResolveAttachments or pDepthStencilAttachment members of the VkSubpassDescription instance that the current subpass was created with, that refers to the same image

  • If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of an element of pImageMemoryBarriers must be equal

  • If vkCmdPipelineBarrier is called within a render pass instance, the srcQueueFamilyIndex and dstQueueFamilyIndex members of any element of pImageMemoryBarriers must be VK_QUEUE_FAMILY_IGNORED

  • Any pipeline stage included in srcStageMask or dstStageMask must be supported by the capabilities of the queue family specified by the queueFamilyIndex member of the VkCommandPoolCreateInfo structure that was used to create the VkCommandPool that commandBuffer was allocated from, as specified in the table of supported pipeline stages.

  • Each element of pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers must not have any access flag included in its srcAccessMask member if that bit is not supported by any of the pipeline stages in srcStageMask, as specified in the table of supported access types.

  • Each element of pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers must not have any access flag included in its dstAccessMask member if that bit is not supported by any of the pipeline stages in dstStageMask, as specified in the table of supported access types.

  • If vkCmdPipelineBarrier is called outside of a render pass instance, dependencyFlags must not include VK_DEPENDENCY_VIEW_LOCAL_BIT

  • If the mesh shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • If the task shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

  • If the mesh shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV

  • If the task shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • srcStageMask must be a valid combination of VkPipelineStageFlagBits values

  • srcStageMask must not be 0

  • dstStageMask must be a valid combination of VkPipelineStageFlagBits values

  • dstStageMask must not be 0

  • dependencyFlags must be a valid combination of VkDependencyFlagBits values

  • If memoryBarrierCount is not 0, pMemoryBarriers must be a valid pointer to an array of memoryBarrierCount valid VkMemoryBarrier structures

  • If bufferMemoryBarrierCount is not 0, pBufferMemoryBarriers must be a valid pointer to an array of bufferMemoryBarrierCount valid VkBufferMemoryBarrier structures

  • If imageMemoryBarrierCount is not 0, pImageMemoryBarriers must be a valid pointer to an array of imageMemoryBarrierCount valid VkImageMemoryBarrier structures

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support transfer, graphics, or compute operations

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Transfer
Graphics
Compute

Bits which can be set in vkCmdPipelineBarrier::dependencyFlags, specifying how execution and memory dependencies are formed, are:

typedef enum VkDependencyFlagBits {
    VK_DEPENDENCY_BY_REGION_BIT = 0x00000001,
    VK_DEPENDENCY_DEVICE_GROUP_BIT = 0x00000004,
    VK_DEPENDENCY_VIEW_LOCAL_BIT = 0x00000002,
    VK_DEPENDENCY_VIEW_LOCAL_BIT_KHR = VK_DEPENDENCY_VIEW_LOCAL_BIT,
    VK_DEPENDENCY_DEVICE_GROUP_BIT_KHR = VK_DEPENDENCY_DEVICE_GROUP_BIT,
} VkDependencyFlagBits;
typedef VkFlags VkDependencyFlags;

VkDependencyFlags is a bitmask type for setting a mask of zero or more VkDependencyFlagBits.

6.6.1. Subpass Self-dependency

If vkCmdPipelineBarrier is called inside a render pass instance, the following restrictions apply. For a given subpass to allow a pipeline barrier, the render pass must declare a self-dependency from that subpass to itself. That is, there must exist a VkSubpassDependency in the subpass dependency list for the render pass with srcSubpass and dstSubpass equal to that subpass index. More than one self-dependency can be declared for each subpass. Self-dependencies must only include pipeline stage bits that are graphics stages. Self-dependencies must not have any earlier pipeline stages depend on any later pipeline stages (according to the order of graphics pipeline stages), unless all of the stages are framebuffer-space stages. If the source and destination stage masks both include framebuffer-space stages, then dependencyFlags must include VK_DEPENDENCY_BY_REGION_BIT. If the subpass has more than one view, then dependencyFlags must include VK_DEPENDENCY_VIEW_LOCAL_BIT.

A vkCmdPipelineBarrier command inside a render pass instance must be a subset of one of the self-dependencies of the subpass it is used in, meaning that the stage masks and access masks must each include only a subset of the bits of the corresponding mask in that self-dependency. If the self-dependency has VK_DEPENDENCY_BY_REGION_BIT or VK_DEPENDENCY_VIEW_LOCAL_BIT set, then so must the pipeline barrier. Pipeline barriers within a render pass instance can only be types VkMemoryBarrier or VkImageMemoryBarrier. If a VkImageMemoryBarrier is used, the image and image subresource range specified in the barrier must be a subset of one of the image views used by the framebuffer in the current subpass. Additionally, oldLayout must be equal to newLayout, and both the srcQueueFamilyIndex and dstQueueFamilyIndex must be VK_QUEUE_FAMILY_IGNORED.

6.7. Memory Barriers

Memory barriers are used to explicitly control access to buffer and image subresource ranges. Memory barriers are used to transfer ownership between queue families, change image layouts, and define availability and visibility operations. They explicitly define the access types and buffer and image subresource ranges that are included in the access scopes of a memory dependency that is created by a synchronization command that includes them.

6.7.1. Global Memory Barriers

Global memory barriers apply to memory accesses involving all memory objects that exist at the time of its execution.

The VkMemoryBarrier structure is defined as:

typedef struct VkMemoryBarrier {
    VkStructureType    sType;
    const void*        pNext;
    VkAccessFlags      srcAccessMask;
    VkAccessFlags      dstAccessMask;
} VkMemoryBarrier;

The first access scope is limited to access types in the source access mask specified by srcAccessMask.

The second access scope is limited to access types in the destination access mask specified by dstAccessMask.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_MEMORY_BARRIER

  • pNext must be NULL

  • srcAccessMask must be a valid combination of VkAccessFlagBits values

  • dstAccessMask must be a valid combination of VkAccessFlagBits values

6.7.2. Buffer Memory Barriers

Buffer memory barriers only apply to memory accesses involving a specific buffer range. That is, a memory dependency formed from a buffer memory barrier is scoped to access via the specified buffer range. Buffer memory barriers can also be used to define a queue family ownership transfer for the specified buffer range.

The VkBufferMemoryBarrier structure is defined as:

typedef struct VkBufferMemoryBarrier {
    VkStructureType    sType;
    const void*        pNext;
    VkAccessFlags      srcAccessMask;
    VkAccessFlags      dstAccessMask;
    uint32_t           srcQueueFamilyIndex;
    uint32_t           dstQueueFamilyIndex;
    VkBuffer           buffer;
    VkDeviceSize       offset;
    VkDeviceSize       size;
} VkBufferMemoryBarrier;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • srcAccessMask is a bitmask of VkAccessFlagBits specifying a source access mask.

  • dstAccessMask is a bitmask of VkAccessFlagBits specifying a destination access mask.

  • srcQueueFamilyIndex is the source queue family for a queue family ownership transfer.

  • dstQueueFamilyIndex is the destination queue family for a queue family ownership transfer.

  • buffer is a handle to the buffer whose backing memory is affected by the barrier.

  • offset is an offset in bytes into the backing memory for buffer; this is relative to the base offset as bound to the buffer (see vkBindBufferMemory).

  • size is a size in bytes of the affected area of backing memory for buffer, or VK_WHOLE_SIZE to use the range from offset to the end of the buffer.

The first access scope is limited to access to memory through the specified buffer range, via access types in the source access mask specified by srcAccessMask. If srcAccessMask includes VK_ACCESS_HOST_WRITE_BIT, memory writes performed by that access type are also made visible, as that access type is not performed through a resource.

The second access scope is limited to access to memory through the specified buffer range, via access types in the destination access mask. specified by dstAccessMask. If dstAccessMask includes VK_ACCESS_HOST_WRITE_BIT or VK_ACCESS_HOST_READ_BIT, available memory writes are also made visible to accesses of those types, as those access types are not performed through a resource.

If srcQueueFamilyIndex is not equal to dstQueueFamilyIndex, and srcQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family release operation for the specified buffer range, and the second access scope includes no access, as if dstAccessMask was 0.

If dstQueueFamilyIndex is not equal to srcQueueFamilyIndex, and dstQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family acquire operation for the specified buffer range, and the first access scope includes no access, as if srcAccessMask was 0.

Valid Usage
  • offset must be less than the size of buffer

  • If size is not equal to VK_WHOLE_SIZE, size must be greater than 0

  • If size is not equal to VK_WHOLE_SIZE, size must be less than or equal to than the size of buffer minus offset

  • If buffer was created with a sharing mode of VK_SHARING_MODE_CONCURRENT, at least one of srcQueueFamilyIndex and dstQueueFamilyIndex must be VK_QUEUE_FAMILY_IGNORED

  • If buffer was created with a sharing mode of VK_SHARING_MODE_CONCURRENT, and one of srcQueueFamilyIndex and dstQueueFamilyIndex is VK_QUEUE_FAMILY_IGNORED, the other must be VK_QUEUE_FAMILY_IGNORED or a special queue family reserved for external memory ownership transfers, as described in Queue Family Ownership Transfer.

  • If buffer was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE and srcQueueFamilyIndex is VK_QUEUE_FAMILY_IGNORED, dstQueueFamilyIndex must also be VK_QUEUE_FAMILY_IGNORED

  • If buffer was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE and srcQueueFamilyIndex is not VK_QUEUE_FAMILY_IGNORED, it must be a valid queue family or a special queue family reserved for external memory transfers, as described in Queue Family Ownership Transfer.

  • If buffer was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE and dstQueueFamilyIndex is not VK_QUEUE_FAMILY_IGNORED, it must be a valid queue family or a special queue family reserved for external memory transfers, as described in Queue Family Ownership Transfer.

  • If buffer was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE, and srcQueueFamilyIndex and dstQueueFamilyIndex are not VK_QUEUE_FAMILY_IGNORED, at least one of them must be the same as the family of the queue that will execute this barrier

  • If buffer is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER

  • pNext must be NULL

  • srcAccessMask must be a valid combination of VkAccessFlagBits values

  • dstAccessMask must be a valid combination of VkAccessFlagBits values

  • buffer must be a valid VkBuffer handle

6.7.3. Image Memory Barriers

Image memory barriers only apply to memory accesses involving a specific image subresource range. That is, a memory dependency formed from an image memory barrier is scoped to access via the specified image subresource range. Image memory barriers can also be used to define image layout transitions or a queue family ownership transfer for the specified image subresource range.

The VkImageMemoryBarrier structure is defined as:

typedef struct VkImageMemoryBarrier {
    VkStructureType            sType;
    const void*                pNext;
    VkAccessFlags              srcAccessMask;
    VkAccessFlags              dstAccessMask;
    VkImageLayout              oldLayout;
    VkImageLayout              newLayout;
    uint32_t                   srcQueueFamilyIndex;
    uint32_t                   dstQueueFamilyIndex;
    VkImage                    image;
    VkImageSubresourceRange    subresourceRange;
} VkImageMemoryBarrier;

The first access scope is limited to access to memory through the specified image subresource range, via access types in the source access mask specified by srcAccessMask. If srcAccessMask includes VK_ACCESS_HOST_WRITE_BIT, memory writes performed by that access type are also made visible, as that access type is not performed through a resource.

The second access scope is limited to access to memory through the specified image subresource range, via access types in the destination access mask specified by dstAccessMask. If dstAccessMask includes VK_ACCESS_HOST_WRITE_BIT or VK_ACCESS_HOST_READ_BIT, available memory writes are also made visible to accesses of those types, as those access types are not performed through a resource.

If srcQueueFamilyIndex is not equal to dstQueueFamilyIndex, and srcQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family release operation for the specified image subresource range, and the second access scope includes no access, as if dstAccessMask was 0.

If dstQueueFamilyIndex is not equal to srcQueueFamilyIndex, and dstQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family acquire operation for the specified image subresource range, and the first access scope includes no access, as if srcAccessMask was 0.

If oldLayout is not equal to newLayout, then the memory barrier defines an image layout transition for the specified image subresource range.

Layout transitions that are performed via image memory barriers execute in their entirety in submission order, relative to other image layout transitions submitted to the same queue, including those performed by render passes. In effect there is an implicit execution dependency from each such layout transition to all layout transitions previously submitted to the same queue.

The image layout of each image subresource of a depth/stencil image created with VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT is dependent on the last sample locations used to render to the image subresource as a depth/stencil attachment, thus when the image member of an VkImageMemoryBarrier is an image created with this flag the application can chain a VkSampleLocationsInfoEXT structure to the pNext chain of VkImageMemoryBarrier to specify the sample locations to use during the image layout transition.

If the VkSampleLocationsInfoEXT structure in the pNext chain of VkImageMemoryBarrier does not match the sample location state last used to render to the image subresource range specified by subresourceRange or if no VkSampleLocationsInfoEXT structure is in the pNext chain of VkImageMemoryBarrier then the contents of the given image subresource range becomes undefined as if oldLayout would equal VK_IMAGE_LAYOUT_UNDEFINED.

If image has a multi-planar format and the image is disjoint, then including VK_IMAGE_ASPECT_COLOR_BIT in the aspectMask member of subresourceRange is equivalent to including VK_IMAGE_ASPECT_PLANE_0_BIT, VK_IMAGE_ASPECT_PLANE_1_BIT, and (for three-plane formats only) VK_IMAGE_ASPECT_PLANE_2_BIT.

Valid Usage
  • oldLayout must be VK_IMAGE_LAYOUT_UNDEFINED or the current layout of the image subresources affected by the barrier

  • newLayout must not be VK_IMAGE_LAYOUT_UNDEFINED or VK_IMAGE_LAYOUT_PREINITIALIZED

  • If image was created with a sharing mode of VK_SHARING_MODE_CONCURRENT, at least one of srcQueueFamilyIndex and dstQueueFamilyIndex must be VK_QUEUE_FAMILY_IGNORED

  • If image was created with a sharing mode of VK_SHARING_MODE_CONCURRENT, and one of srcQueueFamilyIndex and dstQueueFamilyIndex is VK_QUEUE_FAMILY_IGNORED, the other must be VK_QUEUE_FAMILY_IGNORED or a special queue family reserved for external memory transfers, as described in Queue Family Ownership Transfer.

  • If image was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE and srcQueueFamilyIndex is VK_QUEUE_FAMILY_IGNORED, dstQueueFamilyIndex must also be VK_QUEUE_FAMILY_IGNORED.

  • If image was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE and srcQueueFamilyIndex is not VK_QUEUE_FAMILY_IGNORED, it must be a valid queue family or a special queue family reserved for external memory transfers, as described in Queue Family Ownership Transfer.

  • If image was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE and dstQueueFamilyIndex is not VK_QUEUE_FAMILY_IGNORED, it must be a valid queue family or a special queue family reserved for external memory transfers, as described in Queue Family Ownership Transfer.

  • If image was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE, and srcQueueFamilyIndex and dstQueueFamilyIndex are not VK_QUEUE_FAMILY_IGNORED, at least one of them must be the same as the family of the queue that will execute this barrier

  • subresourceRange.baseMipLevel must be less than the mipLevels specified in VkImageCreateInfo when image was created

  • If subresourceRange.levelCount is not VK_REMAINING_MIP_LEVELS, subresourceRange.baseMipLevel + subresourceRange.levelCount must be less than or equal to the mipLevels specified in VkImageCreateInfo when image was created

  • subresourceRange.baseArrayLayer must be less than the arrayLayers specified in VkImageCreateInfo when image was created

  • If subresourceRange.layerCount is not VK_REMAINING_ARRAY_LAYERS, subresourceRange.baseArrayLayer + subresourceRange.layerCount must be less than or equal to the arrayLayers specified in VkImageCreateInfo when image was created

  • If image has a depth/stencil format with both depth and stencil components, then the aspectMask member of subresourceRange must include both VK_IMAGE_ASPECT_DEPTH_BIT and VK_IMAGE_ASPECT_STENCIL_BIT

  • If image has a single-plane color format or is not disjoint, then the aspectMask member of subresourceRange must be VK_IMAGE_ASPECT_COLOR_BIT

  • If image has a multi-planar format and the image is disjoint, then the aspectMask member of subresourceRange must include either at least one of VK_IMAGE_ASPECT_PLANE_0_BIT, VK_IMAGE_ASPECT_PLANE_1_BIT, and VK_IMAGE_ASPECT_PLANE_2_BIT; or must include VK_IMAGE_ASPECT_COLOR_BIT

  • If image has a multi-planar format with only two planes, then the aspectMask member of subresourceRange must not include VK_IMAGE_ASPECT_PLANE_2_BIT

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL then image must have been created with VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL then image must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL then image must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL then image must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL then image must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL then image must have been created with VK_IMAGE_USAGE_SAMPLED_BIT or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL then image must have been created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL then image must have been created with VK_IMAGE_USAGE_TRANSFER_DST_BIT set

  • If image is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV then image must have been created with VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV set

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER

  • pNext must be NULL or a pointer to a valid instance of VkSampleLocationsInfoEXT

  • srcAccessMask must be a valid combination of VkAccessFlagBits values

  • dstAccessMask must be a valid combination of VkAccessFlagBits values

  • oldLayout must be a valid VkImageLayout value

  • newLayout must be a valid VkImageLayout value

  • image must be a valid VkImage handle

  • subresourceRange must be a valid VkImageSubresourceRange structure

6.7.4. Queue Family Ownership Transfer

Resources created with a VkSharingMode of VK_SHARING_MODE_EXCLUSIVE must have their ownership explicitly transferred from one queue family to another in order to access their content in a well-defined manner on a queue in a different queue family. Resources shared with external APIs or instances using external memory must also explicitly manage ownership transfers between local and external queues (or equivalent constructs in external APIs) regardless of the VkSharingMode specified when creating them. The special queue family index VK_QUEUE_FAMILY_EXTERNAL represents any queue external to the resource’s current Vulkan instance, as long as the queue uses the same underlying physical device or device group and uses the same driver version as the resource’s VkDevice, as indicated by VkPhysicalDeviceIDProperties::deviceUUID and VkPhysicalDeviceIDProperties::driverUUID. The special queue family index VK_QUEUE_FAMILY_FOREIGN_EXT represents any queue external to the resource’s current Vulkan instance, regardless of the queue’s underlying physical device or driver version. This includes, for example, queues for fixed-function image processing devices, media codec devices, and display devices, as well as all queues that use the same underlying physical device (or device group) and driver version as the resource’s VkDevice. If memory dependencies are correctly expressed between uses of such a resource between two queues in different families, but no ownership transfer is defined, the contents of that resource are undefined for any read accesses performed by the second queue family.

Note

If an application does not need the contents of a resource to remain valid when transferring from one queue family to another, then the ownership transfer should be skipped.

Note

Applications should expect transfers to/from VK_QUEUE_FAMILY_FOREIGN_EXT to be more expensive than transfers to/from VK_QUEUE_FAMILY_EXTERNAL_KHR.

A queue family ownership transfer consists of two distinct parts:

  1. Release exclusive ownership from the source queue family

  2. Acquire exclusive ownership for the destination queue family

An application must ensure that these operations occur in the correct order by defining an execution dependency between them, e.g. using a semaphore.

A release operation is used to release exclusive ownership of a range of a buffer or image subresource range. A release operation is defined by executing a buffer memory barrier (for a buffer range) or an image memory barrier (for an image subresource range), on a queue from the source queue family. The srcQueueFamilyIndex parameter of the barrier must be set to the source queue family index, and the dstQueueFamilyIndex parameter to the destination queue family index. dstStageMask is ignored for such a barrier, such that no visibility operation is executed - the value of this mask does not affect the validity of the barrier. The release operation happens-after the availability operation.

An acquire operation is used to acquire exclusive ownership of a range of a buffer or image subresource range. An acquire operation is defined by executing a buffer memory barrier (for a buffer range) or an image memory barrier (for an image subresource range), on a queue from the destination queue family. The buffer range or image subresource range specified in an acquire operation must match exactly that of a previous release operation. The srcQueueFamilyIndex parameter of the barrier must be set to the source queue family index, and the dstQueueFamilyIndex parameter to the destination queue family index. srcStageMask is ignored for such a barrier, such that no availability operation is executed - the value of this mask does not affect the validity of the barrier. The acquire operation happens-before the visibility operation.

Note

Whilst it is not invalid to provide destination or source access masks for memory barriers used for release or acquire operations, respectively, they have no practical effect. Access after a release operation has undefined results, and so visibility for those accesses has no practical effect. Similarly, write access before an acquire operation will produce undefined results for future access, so availability of those writes has no practical use. In an earlier version of the specification, these were required to match on both sides - but this was subsequently relaxed. These masks should be set to 0.

If the transfer is via an image memory barrier, and an image layout transition is desired, then the values of oldLayout and newLayout in the release memory barrier must be equal to values of oldLayout and newLayout in the acquire memory barrier. Although the image layout transition is submitted twice, it will only be executed once. A layout transition specified in this way happens-after the release operation and happens-before the acquire operation.

If the values of srcQueueFamilyIndex and dstQueueFamilyIndex are equal, no ownership transfer is performed, and the barrier operates as if they were both set to VK_QUEUE_FAMILY_IGNORED.

Queue family ownership transfers may perform read and write accesses on all memory bound to the image subresource or buffer range, so applications must ensure that all memory writes have been made available before a queue family ownership transfer is executed. Available memory is automatically made visible to queue family release and acquire operations, and writes performed by those operations are automatically made available.

Once a queue family has acquired ownership of a buffer range or image subresource range of an VK_SHARING_MODE_EXCLUSIVE resource, its contents are undefined to other queue families unless ownership is transferred. The contents of any portion of another resource which aliases memory that is bound to the transferred buffer or image subresource range are undefined after a release or acquire operation.

6.8. Wait Idle Operations

To wait on the host for the completion of outstanding queue operations for a given queue, call:

VkResult vkQueueWaitIdle(
    VkQueue                                     queue);
  • queue is the queue on which to wait.

vkQueueWaitIdle is equivalent to submitting a fence to a queue and waiting with an infinite timeout for that fence to signal.

Valid Usage (Implicit)
  • queue must be a valid VkQueue handle

Host Synchronization
  • Host access to queue must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

To wait on the host for the completion of outstanding queue operations for all queues on a given logical device, call:

VkResult vkDeviceWaitIdle(
    VkDevice                                    device);
  • device is the logical device to idle.

vkDeviceWaitIdle is equivalent to calling vkQueueWaitIdle for all queues owned by device.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

Host Synchronization
  • Host access to all VkQueue objects created from device must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

6.9. Host Write Ordering Guarantees

When batches of command buffers are submitted to a queue via vkQueueSubmit, it defines a memory dependency with prior host operations, and execution of command buffers submitted to the queue.

The first synchronization scope is defined by the host execution model, but includes execution of vkQueueSubmit on the host and anything that happened-before it.

The second synchronization scope includes all commands submitted in the same queue submission, and all commands that occur later in submission order.

The first access scope includes all host writes to mappable device memory that are available to the host memory domain.

The second access scope includes all memory access performed by the device.

6.10. Synchronization and Multiple Physical Devices

If a logical device includes more than one physical device, then fences, semaphores, and events all still have a single instance of the signaled state.

A fence becomes signaled when all physical devices complete the necessary queue operations.

Semaphore wait and signal operations all include a device index that is the sole physical device that performs the operation. These indices are provided in the VkDeviceGroupSubmitInfo and VkDeviceGroupBindSparseInfo structures. Semaphores are not exclusively owned by any physical device. For example, a semaphore can be signaled by one physical device and then waited on by a different physical device.

An event can only be waited on by the same physical device that signaled it (or the host).

6.11. Calibrated timestamps

In order to be able to correlate the time a particular operation took place at on timelines of different time domains (e.g. a device operation vs a host operation), Vulkan allows querying calibrated timestamps from multiple time domains.

To query calibrated timestamps from a set of time domains, call:

VkResult vkGetCalibratedTimestampsEXT(
    VkDevice                                    device,
    uint32_t                                    timestampCount,
    const VkCalibratedTimestampInfoEXT*         pTimestampInfos,
    uint64_t*                                   pTimestamps,
    uint64_t*                                   pMaxDeviation);
  • device is the logical device used to perform the query.

  • timestampCount is the number of timestamps to query.

  • pTimestampInfos is a pointer to an array of timestampCount number of structures of type VkCalibratedTimestampInfoEXT, describing the time domains the calibrated timestamps should be captured from.

  • pTimestamps is a pointer to an array of timestampCount number of 64-bit unsigned integer values in which the requested calibrated timestamp values are returned.

  • pMaxDeviation is a pointer to a 64-bit unsigned integer value in which the strictly positive maximum deviation, in nanoseconds, of the calibrated timestamp values is returned.

Note

The maximum deviation may vary between calls to vkGetCalibratedTimestampsEXT even for the same set of time domains due to implementation and platform specific reasons. It’s the application’s responsibility to assess whether the returned maximum deviation makes the timestamp values suitable for any particular purpose and can choose to re-issue the timestamp calibration call pursuing a lower devation value.

Calibrated timestamp values can be extrapolated to estimate future coinciding timestamp values, however, depending on the nature of the time domains and other properties of the platform extrapolating values over a sufficiently long period of time may no longer be accurate enough to fit any particular purpose so applications are expected to re-calibrate the timestamps on a regular basis.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pTimestampInfos must be a valid pointer to an array of timestampCount valid VkCalibratedTimestampInfoEXT structures

  • pTimestamps must be a valid pointer to an array of timestampCount uint64_t values

  • pMaxDeviation must be a valid pointer to a uint64_t value

  • timestampCount must be greater than 0

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkCalibratedTimestampInfoEXT structure is defined as:

typedef struct VkCalibratedTimestampInfoEXT {
    VkStructureType    sType;
    const void*        pNext;
    VkTimeDomainEXT    timeDomain;
} VkCalibratedTimestampInfoEXT;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • timeDomain is a VkTimeDomainEXT value specifying the time domain from which the calibrated timestamp value should be returned.

Valid Usage
Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT

  • pNext must be NULL

  • timeDomain must be a valid VkTimeDomainEXT value

The set of supported time domains consists of:

typedef enum VkTimeDomainEXT {
    VK_TIME_DOMAIN_DEVICE_EXT = 0,
    VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT = 1,
    VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT = 2,
    VK_TIME_DOMAIN_QUERY_PERFORMANCE_COUNTER_EXT = 3,
} VkTimeDomainEXT;
  • VK_TIME_DOMAIN_DEVICE_EXT specifies the device time domain. Timestamp values in this time domain are comparable with device timestamp values captured using vkCmdWriteTimestamp and are defined to be incrementing according to the timestampPeriod of the device.

  • VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT specifies the CLOCK_MONOTONIC time domain available on POSIX platforms.

  • VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT specifies the CLOCK_MONOTONIC_RAW time domain available on POSIX platforms.

  • VK_TIME_DOMAIN_QUERY_PERFORMANCE_COUNTER_EXT specifies the performance counter (QPC) time domain available on Windows.