28. Dispatching Commands

Dispatching commands (commands with Dispatch in the name) provoke work in a compute pipeline. Dispatching commands are recorded into a command buffer and when executed by a queue, will produce work which executes according to the bound compute pipeline. A compute pipeline must be bound to a command buffer before any dispatch commands are recorded in that command buffer.

To record a dispatch, call:

void vkCmdDispatch(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    groupCountX,
    uint32_t                                    groupCountY,
    uint32_t                                    groupCountZ);
  • commandBuffer is the command buffer into which the command will be recorded.

  • groupCountX is the number of local workgroups to dispatch in the X dimension.

  • groupCountY is the number of local workgroups to dispatch in the Y dimension.

  • groupCountZ is the number of local workgroups to dispatch in the Z dimension.

When the command is executed, a global workgroup consisting of groupCountX × groupCountY × groupCountZ local workgroups is assembled.

Valid Usage
  • groupCountX must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0]

  • groupCountY must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[1]

  • groupCountZ must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[2]

  • For each set n that is statically used by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE, a descriptor set must have been bound to n at VK_PIPELINE_BIND_POINT_COMPUTE, with a VkPipelineLayout that is compatible for set n, with the VkPipelineLayout used to create the current VkPipeline, as described in Pipeline Layout Compatibility

  • Descriptors in each bound descriptor set, specified via vkCmdBindDescriptorSets, must be valid if they are statically used by the bound VkPipeline object, specified via vkCmdBindPipeline

  • A valid compute pipeline must be bound to the current command buffer with VK_PIPELINE_BIND_POINT_COMPUTE

  • For each push constant that is statically used by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE, a push constant value must have been set for VK_PIPELINE_BIND_POINT_COMPUTE, with a VkPipelineLayout that is compatible for push constants with the one used to create the current VkPipeline, as described in Pipeline Layout Compatibility

  • If any VkSampler object that is accessed from a shader by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE uses unnormalized coordinates, it must not be used to sample from any VkImage with a VkImageView of the type VK_IMAGE_VIEW_TYPE_3D, VK_IMAGE_VIEW_TYPE_CUBE, VK_IMAGE_VIEW_TYPE_1D_ARRAY, VK_IMAGE_VIEW_TYPE_2D_ARRAY or VK_IMAGE_VIEW_TYPE_CUBE_ARRAY, in any shader stage

  • If any VkSampler object that is accessed from a shader by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE uses unnormalized coordinates, it must not be used with any of the SPIR-V OpImageSample* or OpImageSparseSample* instructions with ImplicitLod, Dref or Proj in their name, in any shader stage

  • If any VkSampler object that is accessed from a shader by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE uses unnormalized coordinates, it must not be used with any of the SPIR-V OpImageSample* or OpImageSparseSample* instructions that includes a LOD bias or any offset values, in any shader stage

  • If the robust buffer access feature is not enabled, and any shader stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE accesses a uniform buffer, it must not access values outside of the range of that buffer specified in the bound descriptor set

  • If the robust buffer access feature is not enabled, and any shader stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE accesses a storage buffer, it must not access values outside of the range of that buffer specified in the bound descriptor set

  • If a VkImageView is sampled with VK_FILTER_LINEAR as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT.

  • If a VkImageView is sampled with VK_FILTER_CUBIC_IMG as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG.

  • Any VkImageView being sampled with VK_FILTER_CUBIC_IMG as a result of this command must not have a VkImageViewType of VK_IMAGE_VIEW_TYPE_3D, VK_IMAGE_VIEW_TYPE_CUBE, or VK_IMAGE_VIEW_TYPE_CUBE_ARRAY

  • If commandBuffer is an unprotected command buffer, and any pipeline stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE reads from or writes to any image or buffer, that image or buffer must not be a protected image or protected buffer.

  • If commandBuffer is a protected command buffer, and any pipeline stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE writes to any image or buffer, that image or buffer must not be an unprotected image or unprotected buffer.

  • If commandBuffer is a protected command buffer, and any pipeline stage other than the compute pipeline stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE reads from any image or buffer, the image or buffer must not be a protected image or protected buffer.

  • Any VkImage created with a VkImageCreateInfo::flags containing VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV sampled as a result of this command must only be sampled using a VkSamplerAddressMode of VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support compute operations

  • This command must only be called outside of a render pass instance

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Compute

Compute

To record an indirect command dispatch, call:

void vkCmdDispatchIndirect(
    VkCommandBuffer                             commandBuffer,
    VkBuffer                                    buffer,
    VkDeviceSize                                offset);
  • commandBuffer is the command buffer into which the command will be recorded.

  • buffer is the buffer containing dispatch parameters.

  • offset is the byte offset into buffer where parameters begin.

vkCmdDispatchIndirect behaves similarly to vkCmdDispatch except that the parameters are read by the device from a buffer during execution. The parameters of the dispatch are encoded in a VkDispatchIndirectCommand structure taken from buffer starting at offset.

Valid Usage
  • If buffer is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

  • For each set n that is statically used by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE, a descriptor set must have been bound to n at VK_PIPELINE_BIND_POINT_COMPUTE, with a VkPipelineLayout that is compatible for set n, with the VkPipelineLayout used to create the current VkPipeline, as described in Pipeline Layout Compatibility

  • Descriptors in each bound descriptor set, specified via vkCmdBindDescriptorSets, must be valid if they are statically used by the bound VkPipeline object, specified via vkCmdBindPipeline

  • A valid compute pipeline must be bound to the current command buffer with VK_PIPELINE_BIND_POINT_COMPUTE

  • buffer must have been created with the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set

  • offset must be a multiple of 4

  • The sum of offset and the size of VkDispatchIndirectCommand must be less than or equal to the size of buffer

  • For each push constant that is statically used by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE, a push constant value must have been set for VK_PIPELINE_BIND_POINT_COMPUTE, with a VkPipelineLayout that is compatible for push constants with the one used to create the current VkPipeline, as described in Pipeline Layout Compatibility

  • If any VkSampler object that is accessed from a shader by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE uses unnormalized coordinates, it must not be used to sample from any VkImage with a VkImageView of the type VK_IMAGE_VIEW_TYPE_3D, VK_IMAGE_VIEW_TYPE_CUBE, VK_IMAGE_VIEW_TYPE_1D_ARRAY, VK_IMAGE_VIEW_TYPE_2D_ARRAY or VK_IMAGE_VIEW_TYPE_CUBE_ARRAY, in any shader stage

  • If any VkSampler object that is accessed from a shader by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE uses unnormalized coordinates, it must not be used with any of the SPIR-V OpImageSample* or OpImageSparseSample* instructions with ImplicitLod, Dref or Proj in their name, in any shader stage

  • If any VkSampler object that is accessed from a shader by the VkPipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE uses unnormalized coordinates, it must not be used with any of the SPIR-V OpImageSample* or OpImageSparseSample* instructions that includes a LOD bias or any offset values, in any shader stage

  • If the robust buffer access feature is not enabled, and any shader stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE accesses a uniform buffer, it must not access values outside of the range of that buffer specified in the bound descriptor set

  • If the robust buffer access feature is not enabled, and any shader stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE accesses a storage buffer, it must not access values outside of the range of that buffer specified in the bound descriptor set

  • If a VkImageView is sampled with VK_FILTER_LINEAR as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT.

  • If a VkImageView is sampled with VK_FILTER_CUBIC_IMG as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG.

  • Any VkImageView being sampled with VK_FILTER_CUBIC_IMG as a result of this command must not have a VkImageViewType of VK_IMAGE_VIEW_TYPE_3D, VK_IMAGE_VIEW_TYPE_CUBE, or VK_IMAGE_VIEW_TYPE_CUBE_ARRAY

  • If commandBuffer is an unprotected command buffer, and any pipeline stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE reads from or writes to any image or buffer, that image or buffer must not be a protected image or protected buffer.

  • If commandBuffer is a protected command buffer, and any pipeline stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE writes to any image or buffer, that image or buffer must not be an unprotected image or unprotected buffer.

  • If commandBuffer is a protected command buffer, and any pipeline stage other than the compute pipeline stage in the VkPipeline object bound to VK_PIPELINE_BIND_POINT_COMPUTE reads from any image or buffer, the image or buffer must not be a protected image or protected buffer.

  • Any VkImage created with a VkImageCreateInfo::flags containing VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV sampled as a result of this command must only be sampled using a VkSamplerAddressMode of VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • buffer must be a valid VkBuffer handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support compute operations

  • This command must only be called outside of a render pass instance

  • Both of buffer, and commandBuffer must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Compute

Compute

The VkDispatchIndirectCommand structure is defined as:

typedef struct VkDispatchIndirectCommand {
    uint32_t    x;
    uint32_t    y;
    uint32_t    z;
} VkDispatchIndirectCommand;
  • x is the number of local workgroups to dispatch in the X dimension.

  • y is the number of local workgroups to dispatch in the Y dimension.

  • z is the number of local workgroups to dispatch in the Z dimension.

The members of VkDispatchIndirectCommand have the same meaning as the corresponding parameters of vkCmdDispatch.

Valid Usage
  • x must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0]

  • y must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[1]

  • z must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[2]

To record a dispatch using non-zero base values for the components of WorkgroupId, call:

void vkCmdDispatchBase(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    baseGroupX,
    uint32_t                                    baseGroupY,
    uint32_t                                    baseGroupZ,
    uint32_t                                    groupCountX,
    uint32_t                                    groupCountY,
    uint32_t                                    groupCountZ);

or the equivalent command

void vkCmdDispatchBaseKHR(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    baseGroupX,
    uint32_t                                    baseGroupY,
    uint32_t                                    baseGroupZ,
    uint32_t                                    groupCountX,
    uint32_t                                    groupCountY,
    uint32_t                                    groupCountZ);
  • commandBuffer is the command buffer into which the command will be recorded.

  • baseGroupX is the start value for the X component of WorkgroupId.

  • baseGroupY is the start value for the Y component of WorkgroupId.

  • baseGroupZ is the start value for the Z component of WorkgroupId.

  • groupCountX is the number of local workgroups to dispatch in the X dimension.

  • groupCountY is the number of local workgroups to dispatch in the Y dimension.

  • groupCountZ is the number of local workgroups to dispatch in the Z dimension.

When the command is executed, a global workgroup consisting of groupCountX × groupCountY × groupCountZ local workgroups is assembled, with WorkgroupId values ranging from [baseGroup, baseGroup + groupCount) in each component. vkCmdDispatch is equivalent to vkCmdDispatchBase(0,0,0,groupCountX,groupCountY,groupCountZ).

Valid Usage
  • All valid usage rules from vkCmdDispatch apply

  • baseGroupX must be less than VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0]

  • baseGroupX must be less than VkPhysicalDeviceLimits::maxComputeWorkGroupCount[1]

  • baseGroupZ must be less than VkPhysicalDeviceLimits::maxComputeWorkGroupCount[2]

  • groupCountX must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0] minus baseGroupX

  • groupCountY must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[1] minus baseGroupY

  • groupCountZ must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[2] minus baseGroupZ

  • If any of baseGroupX, baseGroupY, or baseGroupZ are not zero, then the bound compute pipeline must have been created with the VK_PIPELINE_CREATE_DISPATCH_BASE flag.

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support compute operations

  • This command must only be called outside of a render pass instance

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Compute