16. Queries

Queries provide a mechanism to return information about the processing of a sequence of Vulkan commands. Query operations are asynchronous, and as such, their results are not returned immediately. Instead, their results, and their availability status, are stored in a Query Pool. The state of these queries can be read back on the host, or copied to a buffer object on the device.

The supported query types are Occlusion Queries, Pipeline Statistics Queries, and Timestamp Queries.

16.1. Query Pools

Queries are managed using query pool objects. Each query pool is a collection of a specific number of queries of a particular type.

Query pools are represented by VkQueryPool handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkQueryPool)

To create a query pool, call:

VkResult vkCreateQueryPool(
    VkDevice                                    device,
    const VkQueryPoolCreateInfo*                pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkQueryPool*                                pQueryPool);
  • device is the logical device that creates the query pool.

  • pCreateInfo is a pointer to an instance of the VkQueryPoolCreateInfo structure containing the number and type of queries to be managed by the pool.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pQueryPool is a pointer to a VkQueryPool handle in which the resulting query pool object is returned.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a valid pointer to a valid VkQueryPoolCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • pQueryPool must be a valid pointer to a VkQueryPool handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkQueryPoolCreateInfo structure is defined as:

typedef struct VkQueryPoolCreateInfo {
    VkStructureType                  sType;
    const void*                      pNext;
    VkQueryPoolCreateFlags           flags;
    VkQueryType                      queryType;
    uint32_t                         queryCount;
    VkQueryPipelineStatisticFlags    pipelineStatistics;
} VkQueryPoolCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • queryType is a VkQueryType value specifying the type of queries managed by the pool.

  • queryCount is the number of queries managed by the pool.

  • pipelineStatistics is a bitmask of VkQueryPipelineStatisticFlagBits specifying which counters will be returned in queries on the new pool, as described below in Pipeline Statistics Queries.

pipelineStatistics is ignored if queryType is not VK_QUERY_TYPE_PIPELINE_STATISTICS.

Valid Usage
Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • queryType must be a valid VkQueryType value

typedef VkFlags VkQueryPoolCreateFlags;

VkQueryPoolCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.

To destroy a query pool, call:

void vkDestroyQueryPool(
    VkDevice                                    device,
    VkQueryPool                                 queryPool,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the query pool.

  • queryPool is the query pool to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted commands that refer to queryPool must have completed execution

  • If VkAllocationCallbacks were provided when queryPool was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when queryPool was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If queryPool is not VK_NULL_HANDLE, queryPool must be a valid VkQueryPool handle

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • If queryPool is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to queryPool must be externally synchronized

Possible values of VkQueryPoolCreateInfo::queryType, specifying the type of queries managed by the pool, are:

typedef enum VkQueryType {
    VK_QUERY_TYPE_OCCLUSION = 0,
    VK_QUERY_TYPE_PIPELINE_STATISTICS = 1,
    VK_QUERY_TYPE_TIMESTAMP = 2,
    VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT = 1000028004,
    VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_NV = 1000165000,
} VkQueryType;

16.2. Query Operation

The operation of queries is controlled by the commands vkCmdBeginQuery, vkCmdEndQuery, vkCmdResetQueryPool, vkCmdCopyQueryPoolResults, and vkCmdWriteTimestamp.

In order for a VkCommandBuffer to record query management commands, the queue family for which its VkCommandPool was created must support the appropriate type of operations (graphics, compute) suitable for the query type of a given query pool.

Each query in a query pool has a status that is either unavailable or available, and also has state to store the numerical results of a query operation of the type requested when the query pool was created. Resetting a query via vkCmdResetQueryPool sets the status to unavailable and makes the numerical results undefined. Performing a query operation with vkCmdBeginQuery and vkCmdEndQuery changes the status to available when the query finishes, and updates the numerical results. Both the availability status and numerical results are retrieved by calling either vkGetQueryPoolResults or vkCmdCopyQueryPoolResults.

Query commands, for the same query and submitted to the same queue, execute in their entirety in submission order, relative to each other. In effect there is an implicit execution dependency from each such query command to all query command previously submitted to the same queue. There is one significant exception to this; if the flags parameter of vkCmdCopyQueryPoolResults does not include VK_QUERY_RESULT_WAIT_BIT, execution of vkCmdCopyQueryPoolResults may happen-before the results of vkCmdEndQuery are available.

After query pool creation, each query is in an undefined state and must be reset prior to use. Queries must also be reset between uses. Using a query that has not been reset will result in undefined behavior.

If a logical device includes multiple physical devices, then each command that writes a query must execute on a single physical device, and any call to vkCmdBeginQuery must execute the corresponding vkCmdEndQuery command on the same physical device.

To reset a range of queries in a query pool, call:

void vkCmdResetQueryPool(
    VkCommandBuffer                             commandBuffer,
    VkQueryPool                                 queryPool,
    uint32_t                                    firstQuery,
    uint32_t                                    queryCount);
  • commandBuffer is the command buffer into which this command will be recorded.

  • queryPool is the handle of the query pool managing the queries being reset.

  • firstQuery is the initial query index to reset.

  • queryCount is the number of queries to reset.

When executed on a queue, this command sets the status of query indices [firstQuery, firstQuery + queryCount - 1] to unavailable.

Valid Usage
  • firstQuery must be less than the number of queries in queryPool

  • The sum of firstQuery and queryCount must be less than or equal to the number of queries in queryPool

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • queryPool must be a valid VkQueryPool handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

  • Both of commandBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Graphics
Compute

Once queries are reset and ready for use, query commands can be issued to a command buffer. Occlusion queries and pipeline statistics queries count events - drawn samples and pipeline stage invocations, respectively - resulting from commands that are recorded between a vkCmdBeginQuery command and a vkCmdEndQuery command within a specified command buffer, effectively scoping a set of drawing and/or compute commands. Timestamp queries write timestamps to a query pool.

A query must begin and end in the same command buffer, although if it is a primary command buffer, and the inherited queries feature is enabled, it can execute secondary command buffers during the query operation. For a secondary command buffer to be executed while a query is active, it must set the occlusionQueryEnable, queryFlags, and/or pipelineStatistics members of VkCommandBufferInheritanceInfo to conservative values, as described in the Command Buffer Recording section. A query must either begin and end inside the same subpass of a render pass instance, or must both begin and end outside of a render pass instance (i.e. contain entire render pass instances).

If queries are used while executing a render pass instance that has multiview enabled, the query uses N consecutive query indices in the query pool (starting at query) where N is the number of bits set in the view mask in the subpass the query is used in. How the numerical results of the query are distributed among the queries is implementation-dependent. For example, some implementations may write each view’s results to a distinct query, while other implementations may write the total result to the first query and write zero to the other queries. However, the sum of the results in all the queries must accurately reflect the total result of the query summed over all views. Applications can sum the results from all the queries to compute the total result.

Queries used with multiview rendering must not span subpasses, i.e. they must begin and end in the same subpass.

To begin a query, call:

void vkCmdBeginQuery(
    VkCommandBuffer                             commandBuffer,
    VkQueryPool                                 queryPool,
    uint32_t                                    query,
    VkQueryControlFlags                         flags);
  • commandBuffer is the command buffer into which this command will be recorded.

  • queryPool is the query pool that will manage the results of the query.

  • query is the query index within the query pool that will contain the results.

  • flags is a bitmask of VkQueryControlFlagBits specifying constraints on the types of queries that can be performed.

If the queryType of the pool is VK_QUERY_TYPE_OCCLUSION and flags contains VK_QUERY_CONTROL_PRECISE_BIT, an implementation must return a result that matches the actual number of samples passed. This is described in more detail in Occlusion Queries.

After beginning a query, that query is considered active within the command buffer it was called in until that same query is ended. Queries active in a primary command buffer when secondary command buffers are executed are considered active for those secondary command buffers.

Valid Usage
  • queryPool must have been created with a queryType that differs from that of any queries that are active within commandBuffer

  • All queries used by the command must be unavailable

  • If the precise occlusion queries feature is not enabled, or the queryType used to create queryPool was not VK_QUERY_TYPE_OCCLUSION, flags must not contain VK_QUERY_CONTROL_PRECISE_BIT

  • query must be less than the number of queries in queryPool

  • If the queryType used to create queryPool was VK_QUERY_TYPE_OCCLUSION, the VkCommandPool that commandBuffer was allocated from must support graphics operations

  • If the queryType used to create queryPool was VK_QUERY_TYPE_PIPELINE_STATISTICS and any of the pipelineStatistics indicate graphics operations, the VkCommandPool that commandBuffer was allocated from must support graphics operations

  • If the queryType used to create queryPool was VK_QUERY_TYPE_PIPELINE_STATISTICS and any of the pipelineStatistics indicate compute operations, the VkCommandPool that commandBuffer was allocated from must support compute operations

  • commandBuffer must not be a protected command buffer

  • If vkCmdBeginQuery is called within a render pass instance, the sum of query and the number of bits set in the current subpass’s view mask must be less than or equal to the number of queries in queryPool

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the VkCommandPool that commandBuffer was allocated from must support graphics operations

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT then VkPhysicalDeviceTransformFeedbackPropertiesEXT::transformFeedbackQueries must be supported

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • queryPool must be a valid VkQueryPool handle

  • flags must be a valid combination of VkQueryControlFlagBits values

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • Both of commandBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Graphics
Compute

To begin an indexed query, call:

void vkCmdBeginQueryIndexedEXT(
    VkCommandBuffer                             commandBuffer,
    VkQueryPool                                 queryPool,
    uint32_t                                    query,
    VkQueryControlFlags                         flags,
    uint32_t                                    index);
  • commandBuffer is the command buffer into which this command will be recorded.

  • queryPool is the query pool that will manage the results of the query.

  • query is the query index within the query pool that will contain the results.

  • flags is a bitmask of VkQueryControlFlagBits specifying constraints on the types of queries that can be performed.

  • index is the query type specific index. When the query type is VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the index represents the vertex stream.

The vkCmdBeginQueryIndexedEXT command operates the same as the vkCmdBeginQuery command, except that it also accepts a query type specific index parameter.

Valid Usage
  • queryPool must have been created with a queryType that differs from that of any queries that are active within commandBuffer

  • All queries used by the command must be unavailable

  • If the precise occlusion queries feature is not enabled, or the queryType used to create queryPool was not VK_QUERY_TYPE_OCCLUSION, flags must not contain VK_QUERY_CONTROL_PRECISE_BIT

  • query must be less than the number of queries in queryPool

  • If the queryType used to create queryPool was VK_QUERY_TYPE_OCCLUSION, the VkCommandPool that commandBuffer was allocated from must support graphics operations

  • If the queryType used to create queryPool was VK_QUERY_TYPE_PIPELINE_STATISTICS and any of the pipelineStatistics indicate graphics operations, the VkCommandPool that commandBuffer was allocated from must support graphics operations

  • If the queryType used to create queryPool was VK_QUERY_TYPE_PIPELINE_STATISTICS and any of the pipelineStatistics indicate compute operations, the VkCommandPool that commandBuffer was allocated from must support compute operations

  • commandBuffer must not be a protected command buffer

  • If vkCmdBeginQuery is called within a render pass instance, the sum of query and the number of bits set in the current subpass’s view mask must be less than or equal to the number of queries in queryPool

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the VkCommandPool that commandBuffer was allocated from must support graphics operations

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the index parameter must be less than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreams

  • If the queryType used to create queryPool was not VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the index must be zero

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT then VkPhysicalDeviceTransformFeedbackPropertiesEXT::transformFeedbackQueries must be supported

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • queryPool must be a valid VkQueryPool handle

  • flags must be a valid combination of VkQueryControlFlagBits values

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • Both of commandBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Graphics
Compute

Bits which can be set in vkCmdBeginQuery::flags, specifying constraints on the types of queries that can be performed, are:

typedef enum VkQueryControlFlagBits {
    VK_QUERY_CONTROL_PRECISE_BIT = 0x00000001,
} VkQueryControlFlagBits;
typedef VkFlags VkQueryControlFlags;

VkQueryControlFlags is a bitmask type for setting a mask of zero or more VkQueryControlFlagBits.

To end a query after the set of desired draw or dispatch commands is executed, call:

void vkCmdEndQuery(
    VkCommandBuffer                             commandBuffer,
    VkQueryPool                                 queryPool,
    uint32_t                                    query);
  • commandBuffer is the command buffer into which this command will be recorded.

  • queryPool is the query pool that is managing the results of the query.

  • query is the query index within the query pool where the result is stored.

As queries operate asynchronously, ending a query does not immediately set the query’s status to available. A query is considered finished when the final results of the query are ready to be retrieved by vkGetQueryPoolResults and vkCmdCopyQueryPoolResults, and this is when the query’s status is set to available.

Once a query is ended the query must finish in finite time, unless the state of the query is changed using other commands, e.g. by issuing a reset of the query.

Valid Usage
  • All queries used by the command must be active

  • query must be less than the number of queries in queryPool

  • commandBuffer must not be a protected command buffer

  • If vkCmdEndQuery is called within a render pass instance, the sum of query and the number of bits set in the current subpass’s view mask must be less than or equal to the number of queries in queryPool

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • queryPool must be a valid VkQueryPool handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • Both of commandBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Graphics
Compute

To end an indexed query after the set of desired draw or dispatch commands is recorded, call:

void vkCmdEndQueryIndexedEXT(
    VkCommandBuffer                             commandBuffer,
    VkQueryPool                                 queryPool,
    uint32_t                                    query,
    uint32_t                                    index);
  • commandBuffer is the command buffer into which this command will be recorded.

  • queryPool is the query pool that is managing the results of the query.

  • query is the query index within the query pool where the result is stored.

  • index is the query type specific index.

The vkCmdEndQueryIndexedEXT command operates the same as the vkCmdEndQuery command, except that it also accepts a query type specific index parameter.

Valid Usage
  • All queries used by the command must be active

  • query must be less than the number of queries in queryPool

  • commandBuffer must not be a protected command buffer

  • If vkCmdEndQuery is called within a render pass instance, the sum of query and the number of bits set in the current subpass’s view mask must be less than or equal to the number of queries in queryPool

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the index parameter must be less than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreams

  • If the queryType used to create queryPool was not VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT the index must be zero

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • queryPool must be a valid VkQueryPool handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • Both of commandBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Graphics
Compute

An application can retrieve results either by requesting they be written into application-provided memory, or by requesting they be copied into a VkBuffer. In either case, the layout in memory is defined as follows:

  • The first query’s result is written starting at the first byte requested by the command, and each subsequent query’s result begins stride bytes later.

  • Each query’s result is a tightly packed array of unsigned integers, either 32- or 64-bits as requested by the command, storing the numerical results and, if requested, the availability status.

  • If VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is used, the final element of each query’s result is an integer indicating whether the query’s result is available, with any non-zero value indicating that it is available.

  • Occlusion queries write one integer value - the number of samples passed. Pipeline statistics queries write one integer value for each bit that is enabled in the pipelineStatistics when the pool is created, and the statistics values are written in bit order starting from the least significant bit. Timestamps write one integer value. Transform feedback queries write two integers; the first integer is the number of primitives successfully written to the corresponding transform feedback buffer and the second is the number of primitives output to the vertex stream, regardless of whether they were successfully captured or not. In other words, if the transform feedback buffer was sized too small for the number of primitives output by the vertex stream, the first integer represents the number of primitives actually written and the second is the number that would have been written if all the transform feedback buffers associated with that vertex stream were large enough.

  • If more than one query is retrieved and stride is not at least as large as the size of the array of integers corresponding to a single query, the values written to memory are undefined.

To retrieve status and results for a set of queries, call:

VkResult vkGetQueryPoolResults(
    VkDevice                                    device,
    VkQueryPool                                 queryPool,
    uint32_t                                    firstQuery,
    uint32_t                                    queryCount,
    size_t                                      dataSize,
    void*                                       pData,
    VkDeviceSize                                stride,
    VkQueryResultFlags                          flags);
  • device is the logical device that owns the query pool.

  • queryPool is the query pool managing the queries containing the desired results.

  • firstQuery is the initial query index.

  • queryCount is the number of queries. firstQuery and queryCount together define a range of queries. For pipeline statistics queries, each query index in the pool contains one integer value for each bit that is enabled in VkQueryPoolCreateInfo::pipelineStatistics when the pool is created.

  • dataSize is the size in bytes of the buffer pointed to by pData.

  • pData is a pointer to a user-allocated buffer where the results will be written

  • stride is the stride in bytes between results for individual queries within pData.

  • flags is a bitmask of VkQueryResultFlagBits specifying how and when results are returned.

If no bits are set in flags, and all requested queries are in the available state, results are written as an array of 32-bit unsigned integer values. The behavior when not all queries are available, is described below.

If VK_QUERY_RESULT_64_BIT is not set and the result overflows a 32-bit value, the value may either wrap or saturate. Similarly, if VK_QUERY_RESULT_64_BIT is set and the result overflows a 64-bit value, the value may either wrap or saturate.

If VK_QUERY_RESULT_WAIT_BIT is set, Vulkan will wait for each query to be in the available state before retrieving the numerical results for that query. In this case, vkGetQueryPoolResults is guaranteed to succeed and return VK_SUCCESS if the queries become available in a finite time (i.e. if they have been issued and not reset). If queries will never finish (e.g. due to being reset but not issued), then vkGetQueryPoolResults may not return in finite time.

If VK_QUERY_RESULT_WAIT_BIT and VK_QUERY_RESULT_PARTIAL_BIT are both not set then no result values are written to pData for queries that are in the unavailable state at the time of the call, and vkGetQueryPoolResults returns VK_NOT_READY. However, availability state is still written to pData for those queries if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set.

Note

Applications must take care to ensure that use of the VK_QUERY_RESULT_WAIT_BIT bit has the desired effect.

For example, if a query has been used previously and a command buffer records the commands vkCmdResetQueryPool, vkCmdBeginQuery, and vkCmdEndQuery for that query, then the query will remain in the available state until the vkCmdResetQueryPool command executes on a queue. Applications can use fences or events to ensure that a query has already been reset before checking for its results or availability status. Otherwise, a stale value could be returned from a previous use of the query.

The above also applies when VK_QUERY_RESULT_WAIT_BIT is used in combination with VK_QUERY_RESULT_WITH_AVAILABILITY_BIT. In this case, the returned availability status may reflect the result of a previous use of the query unless the vkCmdResetQueryPool command has been executed since the last use of the query.

Note

Applications can double-buffer query pool usage, with a pool per frame, and reset queries at the end of the frame in which they are read.

If VK_QUERY_RESULT_PARTIAL_BIT is set, VK_QUERY_RESULT_WAIT_BIT is not set, and the query’s status is unavailable, an intermediate result value between zero and the final result value is written to pData for that query.

VK_QUERY_RESULT_PARTIAL_BIT must not be used if the pool’s queryType is VK_QUERY_TYPE_TIMESTAMP.

If VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set, the final integer value written for each query is non-zero if the query’s status was available or zero if the status was unavailable. When VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is used, implementations must guarantee that if they return a non-zero availability value then the numerical results must be valid, assuming the results are not reset by a subsequent command.

Note

Satisfying this guarantee may require careful ordering by the application, e.g. to read the availability status before reading the results.

Valid Usage
  • firstQuery must be less than the number of queries in queryPool

  • If VK_QUERY_RESULT_64_BIT is not set in flags then pData and stride must be multiples of 4

  • If VK_QUERY_RESULT_64_BIT is set in flags then pData and stride must be multiples of 8

  • The sum of firstQuery and queryCount must be less than or equal to the number of queries in queryPool

  • dataSize must be large enough to contain the result of each query, as described here

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TIMESTAMP, flags must not contain VK_QUERY_RESULT_PARTIAL_BIT

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • queryPool must be a valid VkQueryPool handle

  • pData must be a valid pointer to an array of dataSize bytes

  • flags must be a valid combination of VkQueryResultFlagBits values

  • dataSize must be greater than 0

  • queryPool must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

  • VK_NOT_READY

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

Bits which can be set in vkGetQueryPoolResults::flags and vkCmdCopyQueryPoolResults::flags, specifying how and when results are returned, are:

typedef enum VkQueryResultFlagBits {
    VK_QUERY_RESULT_64_BIT = 0x00000001,
    VK_QUERY_RESULT_WAIT_BIT = 0x00000002,
    VK_QUERY_RESULT_WITH_AVAILABILITY_BIT = 0x00000004,
    VK_QUERY_RESULT_PARTIAL_BIT = 0x00000008,
} VkQueryResultFlagBits;
  • VK_QUERY_RESULT_64_BIT specifies the results will be written as an array of 64-bit unsigned integer values. If this bit is not set, the results will be written as an array of 32-bit unsigned integer values.

  • VK_QUERY_RESULT_WAIT_BIT specifies that Vulkan will wait for each query’s status to become available before retrieving its results.

  • VK_QUERY_RESULT_WITH_AVAILABILITY_BIT specifies that the availability status accompanies the results.

  • VK_QUERY_RESULT_PARTIAL_BIT specifies that returning partial results is acceptable.

typedef VkFlags VkQueryResultFlags;

VkQueryResultFlags is a bitmask type for setting a mask of zero or more VkQueryResultFlagBits.

To copy query statuses and numerical results directly to buffer memory, call:

void vkCmdCopyQueryPoolResults(
    VkCommandBuffer                             commandBuffer,
    VkQueryPool                                 queryPool,
    uint32_t                                    firstQuery,
    uint32_t                                    queryCount,
    VkBuffer                                    dstBuffer,
    VkDeviceSize                                dstOffset,
    VkDeviceSize                                stride,
    VkQueryResultFlags                          flags);
  • commandBuffer is the command buffer into which this command will be recorded.

  • queryPool is the query pool managing the queries containing the desired results.

  • firstQuery is the initial query index.

  • queryCount is the number of queries. firstQuery and queryCount together define a range of queries.

  • dstBuffer is a VkBuffer object that will receive the results of the copy command.

  • dstOffset is an offset into dstBuffer.

  • stride is the stride in bytes between results for individual queries within dstBuffer. The required size of the backing memory for dstBuffer is determined as described above for vkGetQueryPoolResults.

  • flags is a bitmask of VkQueryResultFlagBits specifying how and when results are returned.

vkCmdCopyQueryPoolResults is guaranteed to see the effect of previous uses of vkCmdResetQueryPool in the same queue, without any additional synchronization. Thus, the results will always reflect the most recent use of the query.

flags has the same possible values described above for the flags parameter of vkGetQueryPoolResults, but the different style of execution causes some subtle behavioral differences. Because vkCmdCopyQueryPoolResults executes in order with respect to other query commands, there is less ambiguity about which use of a query is being requested.

If no bits are set in flags, results for all requested queries in the available state are written as 32-bit unsigned integer values, and nothing is written for queries in the unavailable state.

If VK_QUERY_RESULT_64_BIT is set, the results are written as an array of 64-bit unsigned integer values as described for vkGetQueryPoolResults.

If VK_QUERY_RESULT_WAIT_BIT is set, the implementation will wait for each query’s status to be in the available state before retrieving the numerical results for that query. This is guaranteed to reflect the most recent use of the query on the same queue, assuming that the query is not being simultaneously used by other queues. If the query does not become available in a finite amount of time (e.g. due to not issuing a query since the last reset), a VK_ERROR_DEVICE_LOST error may occur.

Similarly, if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT is set and VK_QUERY_RESULT_WAIT_BIT is not set, the availability is guaranteed to reflect the most recent use of the query on the same queue, assuming that the query is not being simultaneously used by other queues. As with vkGetQueryPoolResults, implementations must guarantee that if they return a non-zero availability value, then the numerical results are valid.

If VK_QUERY_RESULT_PARTIAL_BIT is set, VK_QUERY_RESULT_WAIT_BIT is not set, and the query’s status is unavailable, an intermediate result value between zero and the final result value is written for that query.

VK_QUERY_RESULT_PARTIAL_BIT must not be used if the pool’s queryType is VK_QUERY_TYPE_TIMESTAMP.

vkCmdCopyQueryPoolResults is considered to be a transfer operation, and its writes to buffer memory must be synchronized using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT before using the results.

Valid Usage
  • dstOffset must be less than the size of dstBuffer

  • firstQuery must be less than the number of queries in queryPool

  • The sum of firstQuery and queryCount must be less than or equal to the number of queries in queryPool

  • If VK_QUERY_RESULT_64_BIT is not set in flags then dstOffset and stride must be multiples of 4

  • If VK_QUERY_RESULT_64_BIT is set in flags then dstOffset and stride must be multiples of 8

  • dstBuffer must have enough storage, from dstOffset, to contain the result of each query, as described here

  • dstBuffer must have been created with VK_BUFFER_USAGE_TRANSFER_DST_BIT usage flag

  • If dstBuffer is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

  • If the queryType used to create queryPool was VK_QUERY_TYPE_TIMESTAMP, flags must not contain VK_QUERY_RESULT_PARTIAL_BIT

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • queryPool must be a valid VkQueryPool handle

  • dstBuffer must be a valid VkBuffer handle

  • flags must be a valid combination of VkQueryResultFlagBits values

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

  • Each of commandBuffer, dstBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Graphics
Compute

Transfer

Rendering operations such as clears, MSAA resolves, attachment load/store operations, and blits may count towards the results of queries. This behavior is implementation-dependent and may vary depending on the path used within an implementation. For example, some implementations have several types of clears, some of which may include vertices and some not.

16.3. Occlusion Queries

Occlusion queries track the number of samples that pass the per-fragment tests for a set of drawing commands. As such, occlusion queries are only available on queue families supporting graphics operations. The application can then use these results to inform future rendering decisions. An occlusion query is begun and ended by calling vkCmdBeginQuery and vkCmdEndQuery, respectively. When an occlusion query begins, the count of passing samples always starts at zero. For each drawing command, the count is incremented as described in Sample Counting. If flags does not contain VK_QUERY_CONTROL_PRECISE_BIT an implementation may generate any non-zero result value for the query if the count of passing samples is non-zero.

Note

Not setting VK_QUERY_CONTROL_PRECISE_BIT mode may be more efficient on some implementations, and should be used where it is sufficient to know a boolean result on whether any samples passed the per-fragment tests. In this case, some implementations may only return zero or one, indifferent to the actual number of samples passing the per-fragment tests.

When an occlusion query finishes, the result for that query is marked as available. The application can then either copy the result to a buffer (via vkCmdCopyQueryPoolResults) or request it be put into host memory (via vkGetQueryPoolResults).

Note

If occluding geometry is not drawn first, samples can pass the depth test, but still not be visible in a final image.

16.4. Pipeline Statistics Queries

Pipeline statistics queries allow the application to sample a specified set of VkPipeline counters. These counters are accumulated by Vulkan for a set of either draw or dispatch commands while a pipeline statistics query is active. As such, pipeline statistics queries are available on queue families supporting either graphics or compute operations. Further, the availability of pipeline statistics queries is indicated by the pipelineStatisticsQuery member of the VkPhysicalDeviceFeatures object (see vkGetPhysicalDeviceFeatures and vkCreateDevice for detecting and requesting this query type on a VkDevice).

A pipeline statistics query is begun and ended by calling vkCmdBeginQuery and vkCmdEndQuery, respectively. When a pipeline statistics query begins, all statistics counters are set to zero. While the query is active, the pipeline type determines which set of statistics are available, but these must be configured on the query pool when it is created. If a statistic counter is issued on a command buffer that does not support the corresponding operation, the value of that counter is undefined after the query has finished. At least one statistic counter relevant to the operations supported on the recording command buffer must be enabled.

Bits which can be set to individually enable pipeline statistics counters for query pools with VkQueryPoolCreateInfo::pipelineStatistics, and for secondary command buffers with VkCommandBufferInheritanceInfo::pipelineStatistics, are:

typedef enum VkQueryPipelineStatisticFlagBits {
    VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_VERTICES_BIT = 0x00000001,
    VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_PRIMITIVES_BIT = 0x00000002,
    VK_QUERY_PIPELINE_STATISTIC_VERTEX_SHADER_INVOCATIONS_BIT = 0x00000004,
    VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_INVOCATIONS_BIT = 0x00000008,
    VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT = 0x00000010,
    VK_QUERY_PIPELINE_STATISTIC_CLIPPING_INVOCATIONS_BIT = 0x00000020,
    VK_QUERY_PIPELINE_STATISTIC_CLIPPING_PRIMITIVES_BIT = 0x00000040,
    VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT = 0x00000080,
    VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_CONTROL_SHADER_PATCHES_BIT = 0x00000100,
    VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_EVALUATION_SHADER_INVOCATIONS_BIT = 0x00000200,
    VK_QUERY_PIPELINE_STATISTIC_COMPUTE_SHADER_INVOCATIONS_BIT = 0x00000400,
} VkQueryPipelineStatisticFlagBits;
  • VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_VERTICES_BIT specifies that queries managed by the pool will count the number of vertices processed by the input assembly stage. Vertices corresponding to incomplete primitives may contribute to the count.

  • VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_PRIMITIVES_BIT specifies that queries managed by the pool will count the number of primitives processed by the input assembly stage. If primitive restart is enabled, restarting the primitive topology has no effect on the count. Incomplete primitives may be counted.

  • VK_QUERY_PIPELINE_STATISTIC_VERTEX_SHADER_INVOCATIONS_BIT specifies that queries managed by the pool will count the number of vertex shader invocations. This counter’s value is incremented each time a vertex shader is invoked.

  • VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_INVOCATIONS_BIT specifies that queries managed by the pool will count the number of geometry shader invocations. This counter’s value is incremented each time a geometry shader is invoked. In the case of instanced geometry shaders, the geometry shader invocations count is incremented for each separate instanced invocation.

  • VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT specifies that queries managed by the pool will count the number of primitives generated by geometry shader invocations. The counter’s value is incremented each time the geometry shader emits a primitive. Restarting primitive topology using the SPIR-V instructions OpEndPrimitive or OpEndStreamPrimitive has no effect on the geometry shader output primitives count.

  • VK_QUERY_PIPELINE_STATISTIC_CLIPPING_INVOCATIONS_BIT specifies that queries managed by the pool will count the number of primitives processed by the Primitive Clipping stage of the pipeline. The counter’s value is incremented each time a primitive reaches the primitive clipping stage.

  • VK_QUERY_PIPELINE_STATISTIC_CLIPPING_PRIMITIVES_BIT specifies that queries managed by the pool will count the number of primitives output by the Primitive Clipping stage of the pipeline. The counter’s value is incremented each time a primitive passes the primitive clipping stage. The actual number of primitives output by the primitive clipping stage for a particular input primitive is implementation-dependent but must satisfy the following conditions:

    • If at least one vertex of the input primitive lies inside the clipping volume, the counter is incremented by one or more.

    • Otherwise, the counter is incremented by zero or more.

  • VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT specifies that queries managed by the pool will count the number of fragment shader invocations. The counter’s value is incremented each time the fragment shader is invoked.

  • VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_CONTROL_SHADER_PATCHES_BIT specifies that queries managed by the pool will count the number of patches processed by the tessellation control shader. The counter’s value is incremented once for each patch for which a tessellation control shader is invoked.

  • VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_EVALUATION_SHADER_INVOCATIONS_BIT specifies that queries managed by the pool will count the number of invocations of the tessellation evaluation shader. The counter’s value is incremented each time the tessellation evaluation shader is invoked.

  • VK_QUERY_PIPELINE_STATISTIC_COMPUTE_SHADER_INVOCATIONS_BIT specifies that queries managed by the pool will count the number of compute shader invocations. The counter’s value is incremented every time the compute shader is invoked. Implementations may skip the execution of certain compute shader invocations or execute additional compute shader invocations for implementation-dependent reasons as long as the results of rendering otherwise remain unchanged.

These values are intended to measure relative statistics on one implementation. Various device architectures will count these values differently. Any or all counters may be affected by the issues described in Query Operation.

Note

For example, tile-based rendering devices may need to replay the scene multiple times, affecting some of the counts.

If a pipeline has rasterizerDiscardEnable enabled, implementations may discard primitives after the final vertex processing stage. As a result, if rasterizerDiscardEnable is enabled, the clipping input and output primitives counters may not be incremented.

When a pipeline statistics query finishes, the result for that query is marked as available. The application can copy the result to a buffer (via vkCmdCopyQueryPoolResults), or request it be put into host memory (via vkGetQueryPoolResults).

typedef VkFlags VkQueryPipelineStatisticFlags;

VkQueryPipelineStatisticFlags is a bitmask type for setting a mask of zero or more VkQueryPipelineStatisticFlagBits.

16.5. Timestamp Queries

Timestamps provide applications with a mechanism for timing the execution of commands. A timestamp is an integer value generated by the VkPhysicalDevice. Unlike other queries, timestamps do not operate over a range, and so do not use vkCmdBeginQuery or vkCmdEndQuery. The mechanism is built around a set of commands that allow the application to tell the VkPhysicalDevice to write timestamp values to a query pool and then either read timestamp values on the host (using vkGetQueryPoolResults) or copy timestamp values to a VkBuffer (using vkCmdCopyQueryPoolResults). The application can then compute differences between timestamps to determine execution time.

The number of valid bits in a timestamp value is determined by the VkQueueFamilyProperties::timestampValidBits property of the queue on which the timestamp is written. Timestamps are supported on any queue which reports a non-zero value for timestampValidBits via vkGetPhysicalDeviceQueueFamilyProperties. If the timestampComputeAndGraphics limit is VK_TRUE, timestamps are supported by every queue family that supports either graphics or compute operations (see VkQueueFamilyProperties).

The number of nanoseconds it takes for a timestamp value to be incremented by 1 can be obtained from VkPhysicalDeviceLimits::timestampPeriod after a call to vkGetPhysicalDeviceProperties.

To request a timestamp, call:

void vkCmdWriteTimestamp(
    VkCommandBuffer                             commandBuffer,
    VkPipelineStageFlagBits                     pipelineStage,
    VkQueryPool                                 queryPool,
    uint32_t                                    query);
  • commandBuffer is the command buffer into which the command will be recorded.

  • pipelineStage is one of the VkPipelineStageFlagBits, specifying a stage of the pipeline.

  • queryPool is the query pool that will manage the timestamp.

  • query is the query within the query pool that will contain the timestamp.

vkCmdWriteTimestamp latches the value of the timer when all previous commands have completed executing as far as the specified pipeline stage, and writes the timestamp value to memory. When the timestamp value is written, the availability status of the query is set to available.

Note

If an implementation is unable to detect completion and latch the timer at any specific stage of the pipeline, it may instead do so at any logically later stage.

vkCmdCopyQueryPoolResults can then be called to copy the timestamp value from the query pool into buffer memory, with ordering and synchronization behavior equivalent to how other queries operate. Timestamp values can also be retrieved from the query pool using vkGetQueryPoolResults. As with other queries, the query must be reset using vkCmdResetQueryPool before requesting the timestamp value be written to it.

While vkCmdWriteTimestamp can be called inside or outside of a render pass instance, vkCmdCopyQueryPoolResults must only be called outside of a render pass instance.

Timestamps may only be meaningfully compared if they are written by commands submitted to the same queue.

Note

An example of such a comparison is determining the execution time of a sequence of commands.

If vkCmdWriteTimestamp is called while executing a render pass instance that has multiview enabled, the timestamp uses N consecutive query indices in the query pool (starting at query) where N is the number of bits set in the view mask of the subpass the command is executed in. The resulting query values are determined by an implementation-dependent choice of one of the following behaviors:

  • The first query is a timestamp value and (if more than one bit is set in the view mask) zero is written to the remaining queries. If two timestamps are written in the same subpass, the sum of the execution time of all views between those commands is the difference between the first query written by each command.

  • All N queries are timestamp values. If two timestamps are written in the same subpass, the sum of the execution time of all views between those commands is the sum of the difference between corresponding queries written by each command. The difference between corresponding queries may be the execution time of a single view.

In either case, the application can sum the differences between all N queries to determine the total execution time.

Valid Usage
  • queryPool must have been created with a queryType of VK_QUERY_TYPE_TIMESTAMP

  • The query identified by queryPool and query must be unavailable

  • The command pool’s queue family must support a non-zero timestampValidBits

  • All queries used by the command must be unavailable

  • If vkCmdWriteTimestamp is called within a render pass instance, the sum of query and the number of bits set in the current subpass’s view mask must be less than or equal to the number of queries in queryPool

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pipelineStage must be a valid VkPipelineStageFlagBits value

  • queryPool must be a valid VkQueryPool handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support transfer, graphics, or compute operations

  • Both of commandBuffer, and queryPool must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Both

Transfer
Graphics
Compute

Transfer

16.6. Transform Feedback Queries

Transform feedback queries track the number of primitives attempted to be written and actually written, by the vertex stream being captured, to a transform feedback buffer. This query is updated during draw commands while transform feedback is active. The number of primitives actually written will be less than the number attempted to be written if the bound transform feedback buffer size was too small for the number of primitives actually drawn. Primitives are not written beyond the bound range of the transform feedback buffer. A transform feedback query is begun and ended by calling vkCmdBeginQuery and vkCmdEndQuery, respectively to query for vertex stream zero. vkCmdBeginQueryIndexedEXT and vkCmdEndQueryIndexedEXT can be used to begin and end transform feedback queries for any supported vertex stream. When a transform feedback query begins, the count of primitives written and primitives needed starts from zero. For each drawing command, the count is incremented as vertex attribute outputs are captured to the transform feedback buffers while transform feedback is active.

When a transform feedback query finishes, the result for that query is marked as available. The application can then either copy the result to a buffer (via vkCmdCopyQueryPoolResults) or request it be put into host memory (via vkGetQueryPoolResults).