Resource Heaps
Available in: iOS_GPUFamily1_v3, iOS_GPUFamily2_v3, iOS_GPUFamily3_v2, tvOS_GPUFamily1_v2
Resource heaps allow multiple Metal resources to be backed by the same memory allocation. These resources are sub-allocated from a memory pool known as a heap, and their GPU work dependencies are captured and managed by fences. Resource heaps help your app reduce the cost of:
Resource creation. Resource creation may involve allocating new memory, mapping it into your process, and zero filling it. This cost is reduced by creating resources from a larger heap or from recycled resource memory backed by a heap.
Fixed memory budget. If some of your resources are not used for some time, the virtual memory system may compress the resource memory to save space, and decompressing that memory for its next use takes extra time. By using a small number of heaps, you can keep your allocations within a given memory budget and ensure those resources are continually in use (which can help provide more consistent performance).
Transient resources. Transient resources are produced and consumed for each frame, but not all of these resources are used together at the same time. To reduce memory consumption, transient resources that are not used together can share the same memory backed by a heap.
Heaps
A MTLHeap object is a Metal resource that represents an abstract memory pool. Resources created from this heap are defined as either aliasable or non-aliasable. Sub-allocated resources are aliased when they share the same portion of heap memory as another aliased resource.
Creating a Heap
A MTLHeap object is created by calling the newHeapWithDescriptor: method of a MTLDevice object. A MTLHeapDescriptor object describes the storage mode, CPU cache mode, and byte size of a heap. All resources sub-allocated from the same heap share its storage mode and CPU cache mode. The byte size of the heap must be large enough to allocate memory for all of its resources.
A heap can be made purgeable after it has been created by calling the setPurgeableState: method. The heap's purgeability state refers to its whole backing memory and affects all resources within the heap. Heaps are purgeable but their resources are not; sub-allocated resources only reflect the heap's purgeability state. Purgeability may be useful for heaps that store only render targets.
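As a sketch of this workflow (the heap variable here, `renderTargetHeap`, is an assumption, not from the original listings), you might let the system discard the heap's backing memory when its contents can be regenerated:

```objective-c
// Assumption: 'renderTargetHeap' is an existing id <MTLHeap> that backs
// only render targets whose contents can be regenerated each frame.

// Allow the system to discard the heap's backing memory under pressure;
// this affects every resource sub-allocated from the heap.
[renderTargetHeap setPurgeableState:MTLPurgeableStateVolatile];

// Before reusing the heap's resources, mark the memory non-volatile again.
// setPurgeableState: returns the previous state; MTLPurgeableStateEmpty
// means the contents were discarded and must be regenerated.
MTLPurgeableState previousState =
    [renderTargetHeap setPurgeableState:MTLPurgeableStateNonVolatile];
if (previousState == MTLPurgeableStateEmpty) {
    /* Re-render the heap's render targets before using them. */
}
```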
Sub-Allocating Heap Resources
Both MTLBuffer and MTLTexture objects can be sub-allocated from a heap. To do so, call the newBufferWithLength:options: or newTextureWithDescriptor: method of a MTLHeap object, respectively.
Each sub-allocated resource is defined as non-aliasable by default, which prevents future sub-allocated resources from using its memory. To make a sub-allocated resource aliasable, call the makeAliasable method; this allows future sub-allocated resources to reuse its memory.
Aliasable sub-allocated resources are not destroyed and can still be used by command encoders. These resources hold a strong reference to their heap which is released only when the resource itself is destroyed, but not when it is made aliasable. Sub-allocated resources can be destroyed only after all command buffers referencing them have completed execution.
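As an illustrative sketch (the heap and texture descriptors here are assumptions), a transient texture can release its memory for reuse once the command buffers that reference it have been committed:

```objective-c
// Assumption: 'heap' is an existing id <MTLHeap>, and '_scratchDescriptor'
// and '_blurDescriptor' are MTLTextureDescriptor objects that fit the heap.

// Sub-allocate a transient texture; it is non-aliasable by default.
id <MTLTexture> scratchTexture = [heap newTextureWithDescriptor:_scratchDescriptor];

/* Encode and commit the work that produces and consumes 'scratchTexture'. */

// Mark the texture aliasable; its memory may now back future sub-allocations.
// The texture object is not destroyed, and it still retains its heap.
[scratchTexture makeAliasable];

// This new texture may alias the memory previously used by 'scratchTexture'.
id <MTLTexture> blurTexture = [heap newTextureWithDescriptor:_blurDescriptor];
```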
Listing 13-1 shows the use of a heap for simple resource sub-allocation.
Listing 13-1 Simple heap creation and resource sub-allocation
// Calculate the size and alignment of each resource
MTLSizeAndAlign albedoSizeAndAlign = [_device heapTextureSizeAndAlignWithTextureDescriptor:_albedoDescriptor];
MTLSizeAndAlign normalSizeAndAlign = [_device heapTextureSizeAndAlignWithTextureDescriptor:_normalDescriptor];
MTLSizeAndAlign glossSizeAndAlign = [_device heapTextureSizeAndAlignWithTextureDescriptor:_glossDescriptor];
// Calculate a heap size that satisfies the size requirements of all three resources
NSUInteger heapSize = albedoSizeAndAlign.size + normalSizeAndAlign.size + glossSizeAndAlign.size;
// Create a heap descriptor
MTLHeapDescriptor* heapDescriptor = [MTLHeapDescriptor new];
heapDescriptor.cpuCacheMode = MTLCPUCacheModeDefaultCache;
heapDescriptor.storageMode = MTLStorageModePrivate;
heapDescriptor.size = heapSize;
// Create a heap
id <MTLHeap> heap = [_device newHeapWithDescriptor:heapDescriptor];
// Create sub-allocated resources from the heap
id <MTLTexture> albedoTexture = [heap newTextureWithDescriptor:_albedoDescriptor];
id <MTLTexture> normalTexture = [heap newTextureWithDescriptor:_normalDescriptor];
id <MTLTexture> glossTexture = [heap newTextureWithDescriptor:_glossDescriptor];
Fences
A MTLFence object is used to track and manage sub-allocated resource dependencies across command encoders. Resource dependencies arise as resources are produced and consumed by different commands, regardless of whether those commands are encoded to the same queue or different queues. A fence captures GPU work up to a specific point in time; when the GPU encounters a fence, it must wait until all the captured work is completed before continuing execution.
Creating a Fence
A MTLFence object is created by calling the newFence method of a MTLDevice object. A fence is mainly used for tracking purposes and only supports tracking within the GPU, not between the CPU and the GPU. The MTLFence protocol does not provide any methods or completion handlers; you can modify only the label property.
Tracking Fences Across Blit and Compute Command Encoders
Both MTLBlitCommandEncoder and MTLComputeCommandEncoder objects can be tracked with a fence. To update a fence, call the updateFence: method of either command encoder. To wait for a fence, call the waitForFence: method of either command encoder.
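A minimal sketch of this pattern (the buffers and pipeline state named here are assumptions): a blit encoder updates a fence after encoding a copy, and a compute encoder waits on that fence before reading the copied data:

```objective-c
// Assumption: '_device', '_commandQueue', 'sourceBuffer', 'destinationBuffer',
// and '_pipelineState' are previously created objects.
id <MTLFence> fence = [_device newFence];
id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];

// Producer: copy data, then update the fence.
id <MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
[blitEncoder copyFromBuffer:sourceBuffer sourceOffset:0
                   toBuffer:destinationBuffer destinationOffset:0
                       size:destinationBuffer.length];
[blitEncoder updateFence:fence];
[blitEncoder endEncoding];

// Consumer: wait for the fence before dispatching work that reads the copy.
id <MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoder];
[computeEncoder waitForFence:fence];
[computeEncoder setComputePipelineState:_pipelineState];
/* Set buffers and dispatch threadgroups that read 'destinationBuffer' */
[computeEncoder endEncoding];
[commandBuffer commit];
```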
The fence is updated or evaluated when the command buffer is actually submitted to the hardware. This maintains global order and prevents deadlock.
Drivers may wait on fences at the beginning of a command encoder, and drivers may delay fence updates until the end of the command encoder. Therefore, you are not allowed to first update and then wait on the same fence in the same command encoder (however, you are allowed to first wait and then update). Producer-consumer relationships must be split across different command encoders.
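As a fragmentary sketch of the allowed ordering (the encoder and fence names are hypothetical), an encoder may wait on one fence and later update another, but never update first and then wait on the same fence:

```objective-c
// Allowed: wait first, then update, within the same command encoder.
[computeEncoder waitForFence:fenceA];  // consume work guarded by fenceA
/* Encode dispatches */
[computeEncoder updateFence:fenceB];   // signal later encoders when done
[computeEncoder endEncoding];

// Not allowed: updating and then waiting on the SAME fence in one encoder.
// Split the producer and the consumer into separate command encoders instead.
```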
Tracking Fences Across Render Command Encoders
A MTLRenderCommandEncoder object can be tracked with a fence at a finer granularity. The MTLRenderStages enum allows you to specify the render stage at which a fence is updated or waited on, allowing vertex and fragment commands to overlap execution. Call the updateFence:afterStages: method to update a fence, and call the waitForFence:beforeStages: method to wait for a fence.
Fence Examples
Listing 13-2 shows the use of a fence for simple tracking.
Listing 13-2 Simple fence tracking
id <MTLFence> fence = [_device newFence];
id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
// Producer
id <MTLRenderCommandEncoder> renderCommandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:_descriptor];
/* Draw using resources associated with 'fence' */
[renderCommandEncoder updateFence:fence afterStages:MTLRenderStageFragment];
[renderCommandEncoder endEncoding];
// Consumer
id <MTLComputeCommandEncoder> computeCommandEncoder = [commandBuffer computeCommandEncoder];
[computeCommandEncoder waitForFence:fence];
/* Dispatch using resources associated with 'fence' */
[computeCommandEncoder endEncoding];
[commandBuffer commit];
You cannot assume that the work of two command encoders has completed when only the latter encoder updates a fence. The consumer command encoder must explicitly wait on all command encoders whose work conflicts on the fence. (The GPU may start executing as many commands as it can, unless it encounters a fence.) Listing 13-3 shows an incorrect use of a fence that introduces a race condition.
Listing 13-3 Incorrect fence tracking
id <MTLFence> fence = [_device newFence];
id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
// Producer 1
id <MTLRenderCommandEncoder> producerCommandEncoder1 = [commandBuffer renderCommandEncoderWithDescriptor:_descriptor];
/* Draw using resources associated with 'fence' */
[producerCommandEncoder1 endEncoding];
// Producer 2
id <MTLComputeCommandEncoder> producerCommandEncoder2 = [commandBuffer computeCommandEncoder];
/* Encode */
[producerCommandEncoder2 updateFence:fence];
[producerCommandEncoder2 endEncoding];
// Race condition at consumption!
// producerCommandEncoder2 updated the fence and will have completed its work
// producerCommandEncoder1 did not update the fence and therefore there is no guarantee that it will have completed its work
// Consumer
id <MTLComputeCommandEncoder> computeCommandEncoder = [commandBuffer computeCommandEncoder];
[computeCommandEncoder waitForFence:fence];
/* Dispatch using resources associated with 'fence' */
[computeCommandEncoder endEncoding];
[commandBuffer commit];
Fences do not control the order in which command buffers from different queues are scheduled; you are still responsible for sequencing command buffer submission across queues yourself, as shown in Listing 13-4.
Listing 13-4 Sequencing fences across command buffer submission queues
id <MTLFence> fence = [_device newFence];
id <MTLCommandBuffer> commandBuffer0 = [_commandQueue0 commandBuffer];
id <MTLCommandBuffer> commandBuffer1 = [_commandQueue1 commandBuffer];
// Producer
id <MTLRenderCommandEncoder> renderCommandEncoder = [commandBuffer0 renderCommandEncoderWithDescriptor:_descriptor];
/* Draw using resources associated with 'fence' */
[renderCommandEncoder updateFence:fence afterStages:MTLRenderStageFragment];
[renderCommandEncoder endEncoding];
// Consumer
id <MTLComputeCommandEncoder> computeCommandEncoder = [commandBuffer1 computeCommandEncoder];
[computeCommandEncoder waitForFence:fence];
/* Dispatch using resources associated with 'fence' */
[computeCommandEncoder endEncoding];
// Ensure 'commandBuffer0' is scheduled before 'commandBuffer1'
[commandBuffer0 addScheduledHandler:^(id <MTLCommandBuffer> buffer) {
    [commandBuffer1 commit];
}];
[commandBuffer0 commit];
Best Practices
Separate Heaps for Render Target Types
Some devices cannot alias sub-allocated resources arbitrarily; for example, compressible depth textures and MSAA textures. You should create a different heap for each type of render target: color, depth, stencil, and MSAA.
Separate Heaps for Aliasable and Non-Aliasable Resources
When making a sub-allocated resource aliasable, you must assume that this resource will alias against all future heap sub-allocations. If you later allocate non-aliasable resources, such as longer-lived textures, then those resources could alias against your temporary resources, and become very difficult to track correctly.
Tracking what aliases, and what does not, can be significantly easier if you keep at least two resource heaps: one for aliasable resources (for example, render targets), and one for non-aliasable resources (for example, asset textures or vertex buffers).
Separate Heaps to Reduce Fragmentation
Creating or deleting many sub-allocated resources of different sizes may fragment memory. Defragmentation requires you to explicitly copy from the fragmented heap to another heap. Alternatively, you can create multiple heaps dedicated to sub-allocated resources of similar size.
Heaps can also be used as a stack, with sub-allocated resources made aliasable in the reverse order of their allocation. When used as a stack, fragmentation cannot occur.
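A hypothetical sketch of the size-class approach (the size thresholds and heap variables here are assumptions, not part of the Metal API):

```objective-c
// Assumption: '_smallHeap', '_mediumHeap', and '_largeHeap' were created
// earlier with capacities tuned to these illustrative thresholds.
- (id <MTLHeap>)heapForAllocationSize:(NSUInteger)size {
    if (size <= 64 * 1024)        { return _smallHeap;  } // up to 64 KB
    else if (size <= 1024 * 1024) { return _mediumHeap; } // up to 1 MB
    else                          { return _largeHeap;  }
}

// Usage: route each sub-allocation to the heap for its size class, so that
// similarly sized allocations recycle each other's memory.
MTLSizeAndAlign sizeAndAlign = [_device heapTextureSizeAndAlignWithTextureDescriptor:descriptor];
id <MTLTexture> texture =
    [[self heapForAllocationSize:sizeAndAlign.size] newTextureWithDescriptor:descriptor];
```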
Minimize Fencing
Fine-grained fences are difficult to manage and they reduce the tracking benefits of heaps. Avoid using a fence per sub-allocated resource; instead, use a single fence to track all sub-allocated resources with identical synchronization requirements.
Consider Tracking Non-Heap Resources
Manual data hazard tracking extends to resources created directly from a MTLDevice object. Specify the MTLResourceHazardTrackingModeUntracked resource option when creating the resource, then track it with fences. Manual tracking can reduce the automatic tracking overhead of many read-only resources.
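As a sketch (the buffer length and the encoder variables are assumptions), an untracked buffer is created from the device and then synchronized manually with a fence:

```objective-c
// Create a buffer whose data hazards Metal does NOT track automatically.
id <MTLBuffer> untrackedBuffer =
    [_device newBufferWithLength:1024
                         options:MTLResourceStorageModePrivate |
                                 MTLResourceHazardTrackingModeUntracked];

// You are now responsible for ordering access with fences:
// update the fence in the encoder that writes the buffer...
[blitEncoder updateFence:fence];
[blitEncoder endEncoding];
// ...and wait on it in the encoder that reads the buffer.
[computeEncoder waitForFence:fence];
```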
Sample Code
For an example of how to use heaps and fences, see the MetalHeapsAndFences sample.
Copyright © 2016 Apple Inc. All Rights Reserved. Updated: 2016-12-12