Appendix A: Vulkan Environment for SPIR-V

Shaders for Vulkan are defined by the Khronos SPIR-V Specification as well as the Khronos SPIR-V Extended Instructions for GLSL Specification. This appendix defines additional SPIR-V requirements applying to Vulkan shaders.

Versions and Formats

A Vulkan 1.1 implementation must support the 1.0, 1.1, 1.2, and 1.3 versions of SPIR-V and the 1.0 version of the SPIR-V Extended Instructions for GLSL.

A SPIR-V module passed into vkCreateShaderModule is interpreted as a series of 32-bit words in host endianness, with literal strings packed as described in section 2.2 of the SPIR-V Specification. The first few words of the SPIR-V module must be a magic number and a SPIR-V version number, as described in section 2.3 of the SPIR-V Specification.
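
As a rough illustration of the layout described above, the following Python sketch reads the first two words of a module in host byte order and decodes the version. The magic number 0x07230203 and the version word layout (major version in bits 16-23, minor version in bits 8-15) are taken from the SPIR-V Specification; the helper name read_spirv_header is hypothetical, and Python is used only for brevity.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number defined by the SPIR-V Specification


def read_spirv_header(code: bytes):
    """Return (major, minor) of a SPIR-V module given as host-endian bytes.

    Hypothetical helper for illustration; raises ValueError on a malformed header.
    """
    # A module is a stream of 32-bit words, so its size is a multiple of 4,
    # and the header (magic, version, generator, bound, schema) is 5 words.
    if len(code) % 4 != 0 or len(code) < 20:
        raise ValueError("module size must be a multiple of 4 and at least 20 bytes")

    # "=" selects native (host) byte order with standard sizes.
    magic, version = struct.unpack_from("=2I", code, 0)
    if magic != SPIRV_MAGIC:
        raise ValueError("bad magic number (wrong endianness or not SPIR-V)")

    # Version word layout: 0 | major | minor | 0 (one byte each, high to low).
    major = (version >> 16) & 0xFF
    minor = (version >> 8) & 0xFF
    return major, minor
```

A module whose words were written in the opposite byte order fails the magic number comparison, which is why the check doubles as an endianness test in this sketch.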

Capabilities

The SPIR-V capabilities listed below must be supported if the corresponding feature or extension is enabled, or if no features or extensions are listed for that capability. Extensions are only listed when there is not also a feature bit associated with that capability.

Table 77. List of SPIR-V Capabilities and enabling features or extensions
SPIR-V OpCapability                           Vulkan feature or extension name
Matrix
Shader
InputAttachment
Sampled1D
Image1D
SampledBuffer
ImageBuffer
ImageQuery
DerivativeControl
Geometry                                      geometryShader
Tessellation                                  tessellationShader
Float64                                       shaderFloat64
Int64                                         shaderInt64
Int16                                         shaderInt16
TessellationPointSize                         shaderTessellationAndGeometryPointSize
GeometryPointSize                             shaderTessellationAndGeometryPointSize
ImageGatherExtended                           shaderImageGatherExtended
StorageImageMultisample                       shaderStorageImageMultisample
UniformBufferArrayDynamicIndexing             shaderUniformBufferArrayDynamicIndexing
SampledImageArrayDynamicIndexing              shaderSampledImageArrayDynamicIndexing
StorageBufferArrayDynamicIndexing             shaderStorageBufferArrayDynamicIndexing
StorageImageArrayDynamicIndexing              shaderStorageImageArrayDynamicIndexing
ClipDistance                                  shaderClipDistance
CullDistance                                  shaderCullDistance
ImageCubeArray                                imageCubeArray
SampleRateShading                             sampleRateShading
SparseResidency                               shaderResourceResidency
MinLod                                        shaderResourceMinLod
SampledCubeArray                              imageCubeArray
ImageMSArray                                  shaderStorageImageMultisample
StorageImageExtendedFormats                   shaderStorageImageExtendedFormats
InterpolationFunction                         sampleRateShading
StorageImageReadWithoutFormat                 shaderStorageImageReadWithoutFormat
StorageImageWriteWithoutFormat                shaderStorageImageWriteWithoutFormat
MultiViewport                                 multiViewport
DrawParameters                                shaderDrawParameters or VK_KHR_shader_draw_parameters
MultiView
DeviceGroup
VariablePointersStorageBuffer                 variablePointersStorageBuffer
VariablePointers                              variablePointers
StencilExportEXT                              VK_EXT_shader_stencil_export
SubgroupBallotKHR                             VK_EXT_shader_subgroup_ballot
SubgroupVoteKHR                               VK_EXT_shader_subgroup_vote
ImageReadWriteLodAMD                          VK_AMD_shader_image_load_store_lod
ImageGatherBiasLodAMD                         VK_AMD_texture_gather_bias_lod
FragmentMaskAMD                               VK_AMD_shader_fragment_mask
SampleMaskOverrideCoverageNV                  VK_NV_sample_mask_override_coverage
GeometryShaderPassthroughNV                   VK_NV_geometry_shader_passthrough
ShaderViewportIndexLayerEXT                   VK_EXT_shader_viewport_index_layer
ShaderViewportIndexLayerNV                    VK_NV_viewport_array2
ShaderViewportMaskNV                          VK_NV_viewport_array2
PerViewAttributesNV                           VK_NVX_multiview_per_view_attributes
StorageBuffer16BitAccess                      storageBuffer16BitAccess
UniformAndStorageBuffer16BitAccess            uniformAndStorageBuffer16BitAccess
StoragePushConstant16                         storagePushConstant16
StorageInputOutput16                          storageInputOutput16
GroupNonUniform                               VK_SUBGROUP_FEATURE_BASIC_BIT
GroupNonUniformVote                           VK_SUBGROUP_FEATURE_VOTE_BIT
GroupNonUniformArithmetic                     VK_SUBGROUP_FEATURE_ARITHMETIC_BIT
GroupNonUniformBallot                         VK_SUBGROUP_FEATURE_BALLOT_BIT
GroupNonUniformShuffle                        VK_SUBGROUP_FEATURE_SHUFFLE_BIT
GroupNonUniformShuffleRelative                VK_SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT
GroupNonUniformClustered                      VK_SUBGROUP_FEATURE_CLUSTERED_BIT
GroupNonUniformQuad                           VK_SUBGROUP_FEATURE_QUAD_BIT
GroupNonUniformPartitionedNV                  VK_SUBGROUP_FEATURE_PARTITIONED_BIT_NV
SampleMaskPostDepthCoverage                   VK_EXT_post_depth_coverage
ShaderNonUniformEXT                           VK_EXT_descriptor_indexing
RuntimeDescriptorArrayEXT                     runtimeDescriptorArray
InputAttachmentArrayDynamicIndexingEXT        shaderInputAttachmentArrayDynamicIndexing
UniformTexelBufferArrayDynamicIndexingEXT     shaderUniformTexelBufferArrayDynamicIndexing
StorageTexelBufferArrayDynamicIndexingEXT     shaderStorageTexelBufferArrayDynamicIndexing
UniformBufferArrayNonUniformIndexingEXT       shaderUniformBufferArrayNonUniformIndexing
SampledImageArrayNonUniformIndexingEXT        shaderSampledImageArrayNonUniformIndexing
StorageBufferArrayNonUniformIndexingEXT       shaderStorageBufferArrayNonUniformIndexing
StorageImageArrayNonUniformIndexingEXT        shaderStorageImageArrayNonUniformIndexing
InputAttachmentArrayNonUniformIndexingEXT     shaderInputAttachmentArrayNonUniformIndexing
UniformTexelBufferArrayNonUniformIndexingEXT  shaderUniformTexelBufferArrayNonUniformIndexing
StorageTexelBufferArrayNonUniformIndexingEXT  shaderStorageTexelBufferArrayNonUniformIndexing
Float16                                       VK_AMD_gpu_shader_half_float
StorageBuffer8BitAccess                       storageBuffer8BitAccess
UniformAndStorageBuffer8BitAccess             uniformAndStorageBuffer8BitAccess
StoragePushConstant8                          storagePushConstant8
VulkanMemoryModelKHR                          vulkanMemoryModel
VulkanMemoryModelDeviceScopeKHR               vulkanMemoryModelDeviceScope
ComputeDerivativeGroupQuadsNV                 computeDerivativeGroupQuads
ComputeDerivativeGroupLinearNV                computeDerivativeGroupLinear
FragmentBarycentricNV                         fragmentShaderBarycentric
ImageFootprintNV                              imageFootprint
ShadingRateImageNV                            shadingRateImage
MeshShadingNV                                 VK_NV_mesh_shader
RaytracingNVX                                 VK_NVX_raytracing

The application can pass a SPIR-V module to vkCreateShaderModule that uses any of the following SPIR-V extensions:

  • SPV_KHR_variable_pointers

  • SPV_AMD_shader_explicit_vertex_parameter

  • SPV_AMD_gcn_shader

  • SPV_AMD_gpu_shader_half_float

  • SPV_AMD_gpu_shader_int16

  • SPV_AMD_shader_ballot

  • SPV_AMD_shader_fragment_mask

  • SPV_AMD_shader_image_load_store_lod

  • SPV_AMD_shader_trinary_minmax

  • SPV_AMD_texture_gather_bias_lod

  • SPV_KHR_shader_draw_parameters

  • SPV_KHR_8bit_storage

  • SPV_KHR_16bit_storage

  • SPV_KHR_storage_buffer_storage_class

  • SPV_KHR_post_depth_coverage

  • SPV_EXT_shader_stencil_export

  • SPV_KHR_shader_ballot

  • SPV_KHR_subgroup_vote

  • SPV_NV_sample_mask_override_coverage

  • SPV_NV_geometry_shader_passthrough

  • SPV_NV_mesh_shader

  • SPV_NV_viewport_array2

  • SPV_EXT_shader_viewport_index_layer

  • SPV_NVX_multiview_per_view_attributes

  • SPV_EXT_descriptor_indexing

  • SPV_KHR_vulkan_memory_model

  • SPV_NV_compute_shader_derivatives

  • SPV_NV_fragment_shader_barycentric

  • SPV_NV_shader_image_footprint

  • SPV_NV_shading_rate

  • SPV_NVX_raytracing

The application must not pass a SPIR-V module containing any of the following to vkCreateShaderModule:

  • any OpCapability not listed above,

  • an unsupported capability, or

  • a capability which corresponds to a Vulkan feature or extension which has not been enabled.
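
The first and third conditions above can be sketched as a lookup against the capability table. This is a minimal illustration rather than a validator: CAPABILITY_REQUIREMENTS copies only a handful of rows from the table above, the function and parameter names are hypothetical, and the second condition (capabilities the implementation does not support) is not modeled.

```python
# A small excerpt of the capability table: capability -> feature or extension
# that must be enabled, or None when the table lists nothing for it.
CAPABILITY_REQUIREMENTS = {
    "Matrix": None,
    "Shader": None,
    "Geometry": "geometryShader",
    "Float64": "shaderFloat64",
    "ClipDistance": "shaderClipDistance",
    "StencilExportEXT": "VK_EXT_shader_stencil_export",
}


def rejected_capabilities(declared, enabled):
    """Return the declared capabilities that the module must not use.

    `declared` is the list of OpCapability names found in the module and
    `enabled` is the set of enabled feature and extension names (both are
    hypothetical inputs for this sketch).
    """
    rejected = []
    for cap in declared:
        if cap not in CAPABILITY_REQUIREMENTS:
            rejected.append(cap)  # not listed in the table
            continue
        needed = CAPABILITY_REQUIREMENTS[cap]
        if needed is not None and needed not in enabled:
            rejected.append(cap)  # feature or extension not enabled
    return rejected
```

For example, a module declaring Shader and Geometry passes when geometryShader is enabled, while a module declaring Kernel is rejected because that capability does not appear in the table.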

Validation Rules within a Module

A SPIR-V module passed to vkCreateShaderModule must conform to the following rules:

  • Every entry point must have no return value and accept no arguments.

  • Recursion: The static function-call graph for an entry point must not contain cycles.

  • The Logical addressing model must be selected.

  • Scope for execution must be limited to:

    • Workgroup

    • Subgroup

  • Scope for memory must be limited to:

  • Scope for Non Uniform Group Operations must be limited to:

    • Subgroup

  • Storage Class must be limited to:

    • UniformConstant

    • Input

    • Uniform

    • Output

    • Workgroup

    • Private

    • Function

    • PushConstant

    • Image

    • StorageBuffer

  • Memory semantics must obey the following rules:

    • Acquire must not be used with OpAtomicStore.

    • Release must not be used with OpAtomicLoad.

    • AcquireRelease must not be used with OpAtomicStore or OpAtomicLoad.

    • Sequentially consistent atomics and barriers are not supported and SequentiallyConsistent is treated as AcquireRelease. SequentiallyConsistent should not be used.

    • OpMemoryBarrier must use one of Acquire, Release, AcquireRelease, or SequentiallyConsistent and must include at least one storage class.

    • If the semantics for OpControlBarrier includes one of Acquire, Release, AcquireRelease, or SequentiallyConsistent, then it must include at least one storage class.

    • SubgroupMemory, CrossWorkgroupMemory, and AtomicCounterMemory are ignored.

  • Any OpVariable with an Initializer operand must have one of the following as its Storage Class operand:

    • Output

    • Private

    • Function

  • The OriginLowerLeft execution mode must not be used; fragment entry points must declare OriginUpperLeft.

  • The PixelCenterInteger execution mode must not be used. Pixels are always centered at half-integer coordinates.

  • Images and Samplers

    • OpTypeImage must declare a scalar 32-bit float or 32-bit integer type for the “Sampled Type”. (RelaxedPrecision can be applied to a sampling instruction and to the variable holding the result of a sampling instruction.)

    • OpTypeImage must have a “Sampled” operand of 1 (sampled image) or 2 (storage image).

    • If shaderStorageImageReadWithoutFormat is not enabled and an OpTypeImage has “Image Format” operand of Unknown, any variables created with the given type must be decorated with NonReadable.

    • If shaderStorageImageWriteWithoutFormat is not enabled and an OpTypeImage has “Image Format” operand of Unknown, any variables created with the given type must be decorated with NonWritable.

    • OpImageQuerySizeLod and OpImageQueryLevels must only consume an “Image” operand whose type has its “Sampled” operand set to 1.

    • The (u,v) coordinates used for a SubpassData must be the <id> of a constant vector (0,0), or if a layer coordinate is used, must be a vector that was formed with constant 0 for the u and v components.

    • The “Depth” operand of OpTypeImage is ignored.

    • Objects of types OpTypeImage, OpTypeSampler, OpTypeSampledImage, and arrays of these types must not be stored to or modified.

  • Decorations

    • The GLSLShared and GLSLPacked decorations must not be used.

    • The Flat, NoPerspective, Sample, and Centroid decorations must not be used on variables with storage class other than Input or on variables used in the interface of non-fragment shader entry points.

    • The Patch decoration must not be used on variables in the interface of a vertex, geometry, or fragment shader stage’s entry point.

    • The ViewportRelativeNV decoration must only be used on a variable decorated with Layer in the vertex, tessellation evaluation, or geometry shader stages.

    • The ViewportRelativeNV decoration must not be used unless a variable decorated with one of ViewportIndex or ViewportMaskNV is also statically used by the same OpEntryPoint.

    • The ViewportMaskNV and ViewportIndex decorations must not both be statically used by one or more OpEntryPoints that form the vertex processing stages of a graphics pipeline.

    • Only the round-to-nearest-even and the round-to-zero rounding modes can be used for the FPRoundingMode decoration.

    • The FPRoundingMode decoration can only be used for the floating-point conversion instructions as described in the SPV_KHR_16bit_storage SPIR-V extension.

    • DescriptorSet and Binding decorations must obey the constraints on storage class, type, and descriptor type described in DescriptorSet and Binding Assignment.

  • OpTypeRuntimeArray must only be used for:

    • the last member of an OpTypeStruct that is in the StorageBuffer storage class decorated as Block, or that is in the Uniform storage class decorated as BufferBlock.

    • If the RuntimeDescriptorArrayEXT capability is supported, an array of variables with storage class Uniform, StorageBuffer, or UniformConstant, or for the outermost dimension of an array of arrays of such variables.

  • Linkage: See Shader Interfaces for additional linking and validation rules.

  • If OpControlBarrier is used in fragment, vertex, tessellation evaluation, or geometry stages, the execution Scope must be Subgroup.

  • Compute Shaders

    • For each compute shader entry point, either a LocalSize execution mode or an object decorated with the WorkgroupSize decoration must be specified.

    • For compute shaders using the DerivativeGroupQuadsNV execution mode, the first two dimensions of the local workgroup size must be a multiple of two.

    • For compute shaders using the DerivativeGroupLinearNV execution mode, the product of the dimensions of the local workgroup size must be a multiple of four.

  • “Result Type” for Non Uniform Group Operations must be limited to 32-bit float, 32-bit integer, boolean, or vectors of these types. If the Float64 capability is enabled, double and vector of double types are also permitted.

  • “Mask” for OpGroupNonUniformShuffleXor must be a specialization constant or a constant, or if the dynamic instance is called within a loop construct it must be one of:

    1. A specialization constant.

    2. A constant.

    3. An arithmetic operation whose operands are 1., 2., or 4.

    4. A phi node whose operands are 1., 2., or 3.

  • If OpGroupNonUniformBallotBitCount is used, the group operation must be one of:

    • Reduce

    • InclusiveScan

    • ExclusiveScan

  • Atomic instructions must declare a scalar 32-bit integer type for the Result Type and the type of the value pointed to by Pointer.

  • If an instruction loads from or stores to a resource (including atomics and image instructions) and the resource descriptor being accessed is not dynamically uniform, then the operand corresponding to that resource (e.g. the pointer or sampled image operand) must be decorated with NonUniformEXT.
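
The memory-semantics rules in the list above lend themselves to a mechanical check. The sketch below is illustrative only: the bit values are the Memory Semantics mask values enumerated in the SPIR-V Specification, while check_memory_semantics and its string opcode argument are hypothetical. It also assumes that the ignored bits (SubgroupMemory, CrossWorkgroupMemory, AtomicCounterMemory) do not count toward the "at least one storage class" requirement.

```python
# Memory Semantics mask bits, as enumerated in the SPIR-V Specification.
ACQUIRE = 0x2
RELEASE = 0x4
ACQUIRE_RELEASE = 0x8
SEQUENTIALLY_CONSISTENT = 0x10
UNIFORM_MEMORY = 0x40
WORKGROUP_MEMORY = 0x100
IMAGE_MEMORY = 0x800

ORDER_MASK = ACQUIRE | RELEASE | ACQUIRE_RELEASE | SEQUENTIALLY_CONSISTENT
# Storage-class bits this sketch counts. Assumption: the bits the rules call
# "ignored" are left out and therefore never satisfy a storage-class rule.
STORAGE_MASK = UNIFORM_MEMORY | WORKGROUP_MEMORY | IMAGE_MEMORY


def check_memory_semantics(opcode, semantics):
    """Return a list of rule violations for one instruction (empty if none)."""
    errors = []
    order = semantics & ORDER_MASK
    has_storage = bool(semantics & STORAGE_MASK)

    if opcode == "OpAtomicStore" and order & (ACQUIRE | ACQUIRE_RELEASE):
        errors.append("Acquire/AcquireRelease used with OpAtomicStore")
    if opcode == "OpAtomicLoad" and order & (RELEASE | ACQUIRE_RELEASE):
        errors.append("Release/AcquireRelease used with OpAtomicLoad")
    if opcode == "OpMemoryBarrier" and not (order and has_storage):
        errors.append("OpMemoryBarrier needs an ordering and a storage class")
    if opcode == "OpControlBarrier" and order and not has_storage:
        errors.append("OpControlBarrier with an ordering needs a storage class")
    return errors
```

An OpControlBarrier with semantics 0 passes, since the storage-class requirement only applies once an ordering bit is present.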

Precision and Operation of SPIR-V Instructions

The following rules apply to both single- and double-precision floating-point instructions:

  • Positive and negative infinities and positive and negative zeros are generated as dictated by IEEE 754, but subject to the precisions allowed in the following table.

  • Dividing a non-zero by a zero results in the appropriately signed IEEE 754 infinity.

  • Any denormalized value input into a shader or potentially generated by any instruction in a shader may be flushed to 0.

  • The rounding mode cannot be set and is undefined.

  • NaNs are not required to be generated. Instructions that operate on a NaN are not required to return a NaN.

  • Support for signaling NaNs is optional and exceptions are never raised.

The precision of double-precision instructions is at least that of single precision.

The precision of operations is defined either in terms of rounding, as an error bound in ULP, or as inherited from a formula as follows.

Correctly Rounded

Operations described as "correctly rounded" will return the infinitely precise result, x, rounded so as to be representable in floating-point. The rounding mode used is not defined but if x is exactly representable then x will be returned. Otherwise, either the floating-point value closest to and no less than x or the value closest to and no greater than x will be returned.
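
To make the definition concrete, the following sketch lists the 32-bit floating-point values that a correctly rounded operation could return for a given x. It is only an approximation of the idea: a Python float (a double) stands in for the infinitely precise result, x is assumed to lie within the finite float32 range, and the helper names are hypothetical.

```python
import struct


def _to_f32(x):
    """Round a Python float to the nearest float32 value (returned as a float)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def _next_up_f32(f):
    """Smallest float32 strictly greater than the finite float32 value f."""
    if f == 0.0:
        return struct.unpack("<f", struct.pack("<I", 1))[0]  # smallest subnormal
    bits = struct.unpack("<I", struct.pack("<f", f))[0]
    bits = bits + 1 if f > 0 else bits - 1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def correctly_rounded_candidates(x):
    """Return the set of float32 values a correctly rounded result may take."""
    nearest = _to_f32(x)
    if nearest == x:
        return {x}  # exactly representable: x itself must be returned
    if nearest < x:
        # nearest lies below x, so the other candidate is the next value up
        return {nearest, _next_up_f32(nearest)}
    # nearest lies above x, so the other candidate is the next value down
    return {-_next_up_f32(-nearest), nearest}
```

For x = 0.1 the set holds the two float32 neighbours on either side of 0.1, and either one is an acceptable result; for x = 0.5 the set holds only 0.5.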

ULP

Where an error bound of n ULP (units in the last place) is given, for an operation with infinitely precise result x the value returned must be in the range [x - n * ulp(x), x + n * ulp(x)]. The function ulp(x) is defined as follows:

If there exist non-equal floating-point numbers a and b such that a ≤ x ≤ b then ulp(x) is the minimum possible distance between such numbers, \(ulp(x) = \mathrm{min}_{a,b} | b - a |\). If such numbers do not exist then ulp(x) is defined to be the difference between the two finite floating-point numbers nearest to x.

Where the range of allowed return values includes any value of magnitude larger than that of the largest representable finite floating-point number, operations may return an infinity of the appropriate sign. If the infinitely precise result of the operation is not mathematically defined then the value returned is undefined.
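
The definition of ulp(x) above can be evaluated directly for 32-bit floats. The sketch below is a hedged illustration: a Python float stands in for the infinitely precise result, and ulp_f32 and within_ulp are hypothetical helper names, not part of any API.

```python
import struct

# Largest finite float32 value, built from its bit pattern.
FLT_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]


def _next_up_f32(f):
    """Smallest float32 strictly greater than the finite float32 value f."""
    if f == 0.0:
        return struct.unpack("<f", struct.pack("<I", 1))[0]  # smallest subnormal
    bits = struct.unpack("<I", struct.pack("<f", f))[0]
    bits = bits + 1 if f > 0 else bits - 1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def ulp_f32(x):
    """ulp(x) for float32, following the definition in the text above."""
    if abs(x) > FLT_MAX:
        # No float32 pair brackets x: use the two finite values nearest to x.
        return FLT_MAX + _next_up_f32(-FLT_MAX)
    nearest = struct.unpack("<f", struct.pack("<f", x))[0]
    below = -_next_up_f32(-nearest)
    above = _next_up_f32(nearest)
    if nearest < x:
        return above - nearest
    if nearest > x:
        return nearest - below
    # x is representable: either neighbour can serve as the other endpoint,
    # and the definition takes the minimum distance.
    return min(nearest - below, above - nearest)


def within_ulp(returned, x, n):
    """True if `returned` lies in [x - n * ulp(x), x + n * ulp(x)]."""
    return abs(returned - x) <= n * ulp_f32(x)
```

Under this reading, ulp_f32(1.0) evaluates to 2^-24, the spacing just below 1.0, because the minimum is taken over all bracketing pairs.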

Inherited From …

Where an operation’s precision is described as being inherited from a formula, the result returned must be at least as accurate as the result of computing an approximation to x using a formula equivalent to the given formula applied to the supplied inputs. Specifically, the formula given may be transformed using the mathematical associativity, commutativity and distributivity of the operators involved to yield an equivalent formula. The SPIR-V precision rules, when applied to each such formula and the given input values, define a range of permitted values. If NaN is one of the permitted values then the operation may return any result; otherwise let the largest permitted value in any of the ranges be \(F_{\mathrm{max}}\) and the smallest be \(F_{\mathrm{min}}\). The operation must return a value in the range [x - E, x + E], where \(E = \mathrm{max} \left( | x - F_{\mathrm{min}} |, | x - F_{\mathrm{max}} | \right)\).
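
The final step of this rule is simple arithmetic, sketched below. The inputs are hypothetical: x is the infinitely precise result, and permitted_ranges holds one (low, high) pair per equivalent formula, as obtained by applying the precision rules to that formula. How those pairs are derived is outside the scope of this sketch.

```python
import math


def inherited_range(x, permitted_ranges):
    """Return the (low, high) range the result must lie in, or None.

    None means that NaN is among the permitted values, in which case the
    operation may return any result.
    """
    bounds = [b for pair in permitted_ranges for b in pair]
    if any(math.isnan(b) for b in bounds):
        return None
    f_min, f_max = min(bounds), max(bounds)
    # E is the larger of the two distances from x to the extreme values.
    e = max(abs(x - f_min), abs(x - f_max))
    return (x - e, x + e)
```

With x = 2.0 and permitted ranges (1.75, 2.125) and (1.875, 2.5), the extremes are 1.75 and 2.5, so E = 0.5 and the result must lie in [1.5, 2.5].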

For single-precision (32-bit) instructions, precisions are required to be at least as follows, unless decorated with RelaxedPrecision:

Table 78. Precision of core SPIR-V Instructions
Instruction                                       Precision
OpFAdd                                            Correctly rounded.
OpFSub                                            Correctly rounded.
OpFMul                                            Correctly rounded.
OpFOrdEqual, OpFUnordEqual                        Correct result.
OpFOrdLessThan, OpFUnordLessThan                  Correct result.
OpFOrdGreaterThan, OpFUnordGreaterThan            Correct result.
OpFOrdLessThanEqual, OpFUnordLessThanEqual        Correct result.
OpFOrdGreaterThanEqual, OpFUnordGreaterThanEqual  Correct result.
OpFDiv                                            2.5 ULP for b in the range \([2^{-126}, 2^{126}]\).
conversions between types                         Correctly rounded.

Table 79. Precision of GLSL.std.450 Instructions
Instruction      Precision
fma()            Inherited from OpFMul followed by OpFAdd.
exp(x), exp2(x)  3 + 2 × |x| ULP.
log(), log2()    3 ULP outside the range [0.5, 2.0]. Absolute error < \(2^{-21}\) inside the range [0.5, 2.0].
pow(x, y)        Inherited from exp2(y × log2(x)).
sqrt()           Inherited from 1.0 / inversesqrt().
inversesqrt()    2 ULP.

GLSL.std.450 extended instructions specifically defined in terms of the above instructions inherit the above errors. GLSL.std.450 extended instructions not listed above and not defined in terms of the above have undefined precision. These include, for example, the trigonometric functions and determinant.

For the OpSRem and OpSMod instructions, if either operand is negative the result is undefined.

Note

While the OpSRem and OpSMod instructions are supported by the Vulkan environment, they require non-negative values and thus do not enable additional functionality beyond what OpUMod provides.
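
The point of the note can be checked with a small model of the three instructions. The sketch below follows the general SPIR-V definitions (OpSRem takes the sign of the first operand, OpSMod the sign of the second) on unbounded Python integers, ignoring 32-bit wraparound; the function names are hypothetical. The negative-operand case is shown only to indicate where the instructions would otherwise differ, since in the Vulkan environment described here that result is undefined.

```python
def op_s_rem(a, b):
    """Remainder whose sign follows the dividend a (general OpSRem semantics)."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def op_s_mod(a, b):
    """Remainder whose sign follows the divisor b (general OpSMod semantics)."""
    return a % b  # Python's % already takes the sign of the divisor


def op_u_mod(a, b):
    """Unsigned modulo; both operands are assumed to be non-negative."""
    return a % b


# For non-negative operands all three agree, as the note states.
agree = all(
    op_s_rem(a, b) == op_s_mod(a, b) == op_u_mod(a, b)
    for a in range(0, 20)
    for b in range(1, 7)
)
```

With a negative dividend the general definitions diverge: op_s_rem(-7, 3) gives -1 while op_s_mod(-7, 3) gives 2.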

Compatibility Between SPIR-V Image Formats And Vulkan Formats

Images which are read from or written to by shaders must have SPIR-V image formats compatible with the Vulkan image formats backing the image under the circumstances described for texture image validation. The compatible formats are:

Table 80. SPIR-V and Vulkan Image Format Compatibility
SPIR-V Image Format  Compatible Vulkan Format
Rgba32f              VK_FORMAT_R32G32B32A32_SFLOAT
Rgba16f              VK_FORMAT_R16G16B16A16_SFLOAT
R32f                 VK_FORMAT_R32_SFLOAT
Rgba8                VK_FORMAT_R8G8B8A8_UNORM
Rgba8Snorm           VK_FORMAT_R8G8B8A8_SNORM
Rg32f                VK_FORMAT_R32G32_SFLOAT
Rg16f                VK_FORMAT_R16G16_SFLOAT
R11fG11fB10f         VK_FORMAT_B10G11R11_UFLOAT_PACK32
R16f                 VK_FORMAT_R16_SFLOAT
Rgba16               VK_FORMAT_R16G16B16A16_UNORM
Rgb10A2              VK_FORMAT_A2B10G10R10_UNORM_PACK32
Rg16                 VK_FORMAT_R16G16_UNORM
Rg8                  VK_FORMAT_R8G8_UNORM
R16                  VK_FORMAT_R16_UNORM
R8                   VK_FORMAT_R8_UNORM
Rgba16Snorm          VK_FORMAT_R16G16B16A16_SNORM
Rg16Snorm            VK_FORMAT_R16G16_SNORM
Rg8Snorm             VK_FORMAT_R8G8_SNORM
R16Snorm             VK_FORMAT_R16_SNORM
R8Snorm              VK_FORMAT_R8_SNORM
Rgba32i              VK_FORMAT_R32G32B32A32_SINT
Rgba16i              VK_FORMAT_R16G16B16A16_SINT
Rgba8i               VK_FORMAT_R8G8B8A8_SINT
R32i                 VK_FORMAT_R32_SINT
Rg32i                VK_FORMAT_R32G32_SINT
Rg16i                VK_FORMAT_R16G16_SINT
Rg8i                 VK_FORMAT_R8G8_SINT
R16i                 VK_FORMAT_R16_SINT
R8i                  VK_FORMAT_R8_SINT
Rgba32ui             VK_FORMAT_R32G32B32A32_UINT
Rgba16ui             VK_FORMAT_R16G16B16A16_UINT
Rgba8ui              VK_FORMAT_R8G8B8A8_UINT
R32ui                VK_FORMAT_R32_UINT
Rgb10a2ui            VK_FORMAT_A2B10G10R10_UINT_PACK32
Rg32ui               VK_FORMAT_R32G32_UINT
Rg16ui               VK_FORMAT_R16G16_UINT
Rg8ui                VK_FORMAT_R8G8_UINT
R16ui                VK_FORMAT_R16_UINT
R8ui                 VK_FORMAT_R8_UINT
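
A tool that cross-checks shader image declarations against image views could encode the table above as a simple mapping. The sketch below copies only a few rows for illustration; the dictionary and function names are hypothetical, and formats are represented as plain strings rather than real enumerant values.

```python
# Excerpt of the compatibility table: SPIR-V image format -> Vulkan format.
COMPATIBLE_VK_FORMAT = {
    "Rgba32f": "VK_FORMAT_R32G32B32A32_SFLOAT",
    "Rgba8": "VK_FORMAT_R8G8B8A8_UNORM",
    "R11fG11fB10f": "VK_FORMAT_B10G11R11_UFLOAT_PACK32",
    "Rgb10A2": "VK_FORMAT_A2B10G10R10_UNORM_PACK32",
    "R32ui": "VK_FORMAT_R32_UINT",
}


def formats_compatible(spirv_format, vk_format):
    """True if the table excerpt pairs spirv_format with vk_format."""
    return COMPATIBLE_VK_FORMAT.get(spirv_format) == vk_format
```

Note that the packed formats are the easiest rows to get wrong by hand, since the Vulkan name lists the components in the opposite order from the SPIR-V name (R11fG11fB10f pairs with B10G11R11).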