Discover how you can use RealityKit Trace to improve the performance of your spatial computing apps. Explore performance profiling guidelines for this platform and learn how the RealityKit Trace template can help you optimize rendering for your apps. We'll also provide guidance on profiling various types of content in your app to help pinpoint performance issues.
♪ Mellow instrumental hip-hop ♪ ♪ Sarina Wu: Hello! My name is Sarina, and I'm a Software Engineer on the RealityKit Tools team. Harjas Monga: And I am Harjas, a Profiling Tools Engineer. Sarina: Today, Harjas and I will be introducing the RealityKit Trace template in Instruments. We'll show you how this template can help you optimize the performance of your spatial experiences. Performance is essential to the user experience in spatial computing. To learn how to optimize spatial experiences, we will briefly cover how rendering works on this platform, show you how to profile using the RealityKit Trace template in Instruments, and briefly cover the other great tools available to optimize your content. This platform has unique performance constraints. To understand them, you first need to understand how rendering works. Rendering includes your app process, the render server, and the compositor. How your app interacts with these components will depend on the types of experiences you create. Let's take a look at the types of experiences you can create for your spatial apps and how they are rendered. Apps on the platform can enter either the Shared Space or a Full Space. These have different performance implications to consider based on how they render. When multiple apps run side by side, they are all rendered in the same space, which is one of the reasons we call it the Shared Space. This means that the performance of your app can be affected by the work that the render server is doing to render the other apps. Then the render server works with the compositor to generate the final frames. When your app enters a Full Space, all other visible apps are hidden. This means that the performance of your app is no longer affected by the rendering work for the apps that are now hidden. To learn more about how to enter a Full Space, check out the session "Go beyond the window with SwiftUI." Based on what was just covered, we have two recommended ways to profile your app. 
Whenever you are investigating performance issues or analyzing system power impact, you should profile your app in isolation to understand your app's impact on system performance. When you expect your app to work alongside other apps, you should profile your app with those other apps. This is important to understand how the user would experience your app. Let's profile a spatial app to show you how you can optimize your app's performance in isolation using the RealityKit Trace template. We've been working on Hello World, and we want to make sure that there are no performance issues. Sarina: This is the Start screen of the app, which is a SwiftUI View. This view has an Objects in Orbit button. We can tap on that button to learn more about objects orbiting Earth. The button opens a new view that lists examples of different objects that are orbiting Earth. This view has 3D models of these objects, including a satellite, the Moon, and a telescope. In this view, there is also a View Orbits button. We can explore this by tapping the button, which will open an immersive experience showing Earth and a satellite orbiting around it. We've used detailed assets for these models, and I suspect that they're affecting the performance of this app. In the immersive experience, we can see the path of the satellite animate as it orbits Earth. We can even scale the Earth up to see it in more detail. This interaction is incredibly choppy, so I think there's a performance issue here. Harjas and I profiled that experience using the RealityKit Trace template. Harjas, could you walk us through it? Harjas: Of course, let's walk through all the features available in RealityKit Trace. RealityKit Trace is available as a new template in Instruments 15. It can be used to profile both a real device and a simulator. To get the most accurate and actionable information, you should profile a real device. 
When profiling against the simulator, not all the timing information will be accurate because of the hardware and software differences between your Mac and the device. But you could still use it for quick iteration and improving some of the statistics that are not based on time. The RealityKit Trace template contains several instruments. The first instrument you will want to look at is the RealityKit Frames instrument. This instrument tracks each frame being rendered by the device. You can zoom in on these frames to check how long each frame took to render. With this, you can check how long each stage of the frame took to render. This gives you a high-level idea of what portion of the render pipeline could be causing performance problems. In order to achieve a smooth user experience, your application should be able to achieve 90 frames per second. However, the OS may not always be targeting 90 fps. It will render at the frame rate most appropriate for the content being displayed and the environment the device is in. Because the frame rate can change, every frame has a deadline in which it has to complete rendering so that the device can hit whatever the current target frame rate is. The frames are classified into three groups: frames that are completing well within their deadline, frames that are just barely finishing within their deadline, and frames that run past their deadline and result in frame drops. These classifications are color coded green, orange, and red, respectively. Frames that are running past the deadline will negatively impact the user experience. If you zoom out and check the frames from a high level, the color coding allows you to quickly find the problematic parts of the trace. So, you can narrow down any performance investigations to the areas where there are the most frame drops. In addition to the individual frames, the instrument also visualizes the average amount of time the system spent on CPU or GPU work to render each frame. 
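The deadline-based classification described above can be sketched in a few lines of Swift. This is only an illustration: it assumes, for simplicity, that a frame's deadline is the full frame interval at the current target frame rate, and the `nearMissFraction` cutoff for "just barely finishing" is a hypothetical value, not the one Instruments uses. All type and function names are made up for the example.

```swift
/// Mirrors the green / orange / red color coding of the RealityKit Frames instrument.
enum FrameStatus {
    case onTime    // green: completes well within the deadline
    case nearMiss  // orange: just barely finishes within the deadline
    case dropped   // red: runs past the deadline and causes a frame drop
}

/// Frame deadline in milliseconds, assuming the deadline equals one frame interval.
func frameDeadlineMs(targetFPS: Double) -> Double {
    return 1000.0 / targetFPS
}

/// Classifies one frame. `nearMissFraction` is a hypothetical cutoff chosen for illustration.
func classify(renderTimeMs: Double,
              targetFPS: Double,
              nearMissFraction: Double = 0.9) -> FrameStatus {
    let deadline = frameDeadlineMs(targetFPS: targetFPS)
    if renderTimeMs > deadline {
        return .dropped
    }
    if renderTimeMs > deadline * nearMissFraction {
        return .nearMiss
    }
    return .onTime
}

// At a 90 fps target the deadline is roughly 11.1 ms per frame.
let sampleRenderTimesMs = [6.0, 10.5, 14.2]
for time in sampleRenderTimesMs {
    print(time, classify(renderTimeMs: time, targetFPS: 90))
}
```

Because the target frame rate can change, the same render time can fall into a different group depending on the `targetFPS` passed in, which is why the instrument reports frames against deadlines rather than against a fixed duration.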
The next instrument you will want to check is the RealityKit Metrics instrument. At a top level, the instrument draws all the bottlenecks that it detected. These bottlenecks are generated by looking at comprehensive timing information from the entire render pipeline. Prioritize the bottlenecks that occur during the same time that frames exceed their deadline. In the detail view below, you will find that these RealityKit bottlenecks are summarized by severity and type. You can dig in further to see exactly what kind of bottleneck the instrument found and how much it affected your overall performance. In the extended detail view, the instrument provides recommendations on how to diagnose these bottlenecks further and what steps you can take to mitigate them. By expanding the RealityKit Metrics track, you will be presented with several types of metrics from different components of the render pipeline. These statistics can help you understand the full complexity of the scene your app is presenting. Some of the key metrics will have associated thresholds to help inform you on reasonable expectations for those metrics. Use the metrics to help guide you further in diagnosing a bottleneck or why a frame is not hitting its deadline. RealityKit Metrics will visualize how much time is being spent in each frame to run the application's RealityKit systems. This includes all the built-in systems and all the custom systems your application may implement. This information is best combined with Time Profiler so you can optimize your RealityKit system code. Lastly, review the System Power Impact lane shown in RealityKit Metrics to understand the power envelope your application needs to work within to provide a great and consistent user experience. Now let's take a look at some traces we took while we were stepping through the world experience. The first scene in the app was the Start screen, which is implemented in SwiftUI. 
In the Frames instrument, there are quite a few dropped frames throughout this trace. These dropped frames may not seem significant, but they can really damage the user experience. I can use Option-drag to zoom in on one of the more problematic areas. And by adjusting the time range, I can check what bottlenecks the RealityKit Metrics instrument found during these long-running frames. The instrument found that the largest bottleneck in this time was Core Animation Encoding. So I'm going to check the Core Animation statistics, which can be found by clicking on the disclosure triangle next to the RealityKit Metrics instrument and selecting the track labeled Core Animation. These Core Animation metrics can help inform us on what might have caused these frame drops. When investigating these metrics, you will notice that some of them have context of how severe the metric is. In the timeline, this is reflected in the color coding. This is to guide you on what are reasonable thresholds for these key metrics. Based on the timeline visualization, it is clear that the application is exceeding the recommended threshold for the number of offscreen prepares. The summary at the bottom shows that the average number of offscreen prepares here is 180, which is quite a high average. When considering the Core Animation statistics, there are three types of work you want to keep in mind. Firstly, transparency and blur effects are very expensive operations for the system. Use these effects when they deliver the most impact to the user, otherwise use them sparingly. The number of render passes is determined by how many layers Core Animation has to render individually for the entire image. And finally, there are offscreen passes. So as the name implies, an offscreen pass is a render pass that is rendered offscreen and not to the display. An offscreen pass requires the rendering pass to pause what it's currently doing and do some work that won't be shown to the user. 
However, the output of the offscreen pass is needed to continue the regular rendering pass. Offscreen passes are particularly impactful for spatial apps. Unlike other app platforms, this platform continuously renders spatial apps because every single frame needs to account for environment factors, such as the user's head movements. Therefore, your static UI needs to be efficient enough that it can be rendered at the system's target frame rate. There are four main types of work that can cause an offscreen pass: shadows, masking, rounded rectangles, and visual effects. To learn more about offscreen passes, watch our tech talk on "Demystify and eliminate hitches in the render phase." Since there were a lot of offscreen passes, I am going to check the SwiftUI code for this view to find what could have caused them. In the SwiftUI code, this view is not doing any masking or visual effects. But there are instances of shadows being applied. For example, in the SwiftUI View item, shadows are being applied to several buttons. Shadows are a particularly expensive operation, especially when combined with transparency. While shadows are a useful UI idiom, for spatial apps, you should use them only when they deliver a significant effect to the user. I'm going to disable these shadows and take a look at a new trace. With the shadows disabled, in the RealityKit Frames Instrument, there are few frame issues and RealityKit Metrics reports that offscreen passes have reduced by four times. Now, the next scene that we saw in the World app was the objects in orbit view. I am going to open up a trace from that scene to see if there is anything that can be optimized. In the Frames Instrument, there is a scattering of dropped frames throughout the trace with lots of bottlenecks. The detail view for RealityKit Metrics provides a summary of those bottlenecks.
In the summary, most of these bottlenecks are related to GPU Work Stalls. Because the bottleneck type reported most frequently is GPU stalls, I am again going to expand RealityKit Metrics. But this time, I'll investigate using the 3D Render track.
I'm going to select the area of the trace that has a high number of frame drops. In this time selection, the 3D Render metrics reports that the triangle and vertex counts are far exceeding the recommended thresholds. Next, I am going to highlight the area of the trace where there aren't nearly as many frame drops.
And according to the rendering metrics, the triangle and vertex counts are within the recommended thresholds. This means you should really be evaluating the number and quality of the assets the app is using in the scene. When optimizing asset rendering, first check the triangles, vertices, and draw calls from the 3D Rendering group in RealityKit Metrics. To optimize these metrics, use simple shape meshes when possible. Take advantage of instancing when utilizing assets with the same mesh. Check the complexity of assets using the statistics in Reality Composer Pro, which is a new developer tool that allows you to assemble, edit, and preview 3D content. That content could later be accessed through code directly in your Xcode project. To learn more about this tool and how to create great assets, check out the session "Meet Reality Composer Pro." I went ahead and swapped out the assets I was using with those that used fewer polygons and captured a new trace. In this trace, the Frames Instrument reports that all the frames are hitting their deadlines. And if I check the 3D rendering statistics again, it reports that the triangle and vertex counts are reduced substantially. While these assets did use fewer polygons, there was no loss in quality of the experience. The next trace is for when we were interacting with the Earth model. During this scene, resizing the globe was actually quite jittery. RealityKit Metrics reports that the System Power Impact lane was very high for a substantial amount of time. This is indicating that some part of your application is being very inefficient and the user experience could be impacted. You should target for your application to work well while keeping the device's system power impact in the nominal state for as much time as possible. When profiling to reduce system power impact, always profile with your application in isolation to ensure you get the most actionable information. 
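Returning to the asset guidance above (check triangles, vertices, and draw calls, then simplify the heaviest assets), here is a minimal Swift sketch of tallying per-asset statistics against a budget. Everything in it is an assumption made for illustration: the `AssetStats` and `SceneBudget` types, the budget values, and the asset counts are invented, and real thresholds should come from what RealityKit Metrics and Reality Composer Pro report for your scene.

```swift
/// Per-asset statistics, for example as read from Reality Composer Pro.
/// The numbers used below are invented for illustration.
struct AssetStats {
    let name: String
    let triangles: Int
    let vertices: Int
}

/// Hypothetical scene budget; substitute the thresholds shown in RealityKit Metrics.
struct SceneBudget {
    let maxTriangles: Int
    let maxVertices: Int
}

/// Sums triangle and vertex counts over every asset in the scene.
func totals(for assets: [AssetStats]) -> (triangles: Int, vertices: Int) {
    var triangles = 0
    var vertices = 0
    for asset in assets {
        triangles += asset.triangles
        vertices += asset.vertices
    }
    return (triangles, vertices)
}

/// Returns the assets sorted heaviest-first when the scene exceeds the budget,
/// or an empty list when the scene is already within it.
func assetsToSimplify(in assets: [AssetStats], budget: SceneBudget) -> [AssetStats] {
    let sum = totals(for: assets)
    guard sum.triangles > budget.maxTriangles || sum.vertices > budget.maxVertices else {
        return []
    }
    return assets.sorted { $0.triangles > $1.triangles }
}

let budget = SceneBudget(maxTriangles: 500_000, maxVertices: 400_000)

let detailedAssets = [
    AssetStats(name: "Telescope", triangles: 150_000, vertices: 90_000),
    AssetStats(name: "Satellite", triangles: 420_000, vertices: 260_000),
    AssetStats(name: "Moon", triangles: 310_000, vertices: 180_000),
]

let simplifiedAssets = [
    AssetStats(name: "Telescope", triangles: 30_000, vertices: 20_000),
    AssetStats(name: "Satellite", triangles: 60_000, vertices: 35_000),
    AssetStats(name: "Moon", triangles: 40_000, vertices: 25_000),
]

print(assetsToSimplify(in: detailedAssets, budget: budget).map { $0.name })
print(assetsToSimplify(in: simplifiedAssets, budget: budget).map { $0.name })
```

A tally like this is only a pre-flight check. The trace remains the source of truth for whether the frames actually hit their deadlines once the assets are swapped.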
You can lower the system power impact using several approaches. First, make sure that the statistics from RealityKit Metrics are within expectations. If these are exceeding expectations, the device could be operating at higher power states for long stretches of time to deliver a smooth experience. Next, check what work the CPU and GPU are doing. For the CPU, check if Time Profiler reports high CPU usage during your high-power draw regions. And if it does, optimize your CPU-bound code using Time Profiler. For the GPU, we have performance states. When the GPU is in the maximum state, it draws a considerable amount of power. In that case, we should use the Metal System Trace template in Instruments to see what work is being done on the GPU. That way, we can understand what could be optimized. Going back to the trace, Time Profiler tells us that the CPU usage was averaging 100 percent in this region, and the GPU performance states were minimum during most of this time. Using Time Profiler, I can check what caused the high CPU usage. The heaviest stack trace is in the extended detail view. This is a very useful feature of the Time Profiler, as it allows you to quickly find the most expensive parts of your code in the call tree. Looking at these frames, it appears that Entity.makeModel is using a lot of CPU time. The next frame down is calling Entity.generateCollisionShapes. Therefore, the performance issue appears to be caused by constantly generating models and collision shapes, which is an expensive operation. I'm going to open Xcode to see what I can do about this. This is the Entity.makeModel function call that the call tree showed was taking a lot of CPU time. This is getting invoked within the makeGlobe function. I can Control-click on the makeGlobe function to see who is invoking it. It's getting invoked from the Orbit SwiftUI view body. This is an antipattern that should be avoided because the view body needs to be computed very quickly. 
You should avoid doing model loading or any other expensive operations in the body of your SwiftUI views because any time the state of the view changes, all those expensive operations need to be recomputed. So, what I am going to do is remove this call from the view body. Next, in the ViewModel, I will add a reusable version of the Earth entity. And finally, I am going to use that reusable Earth entity in the Orbit View. Now, when the view body is recomputed, the app is not wasting time reloading the same model. Looking at the trace after our fix, the power impact is brought back down to the nominal state. And Time Profiler reports that the CPU usage has dropped from 100 percent to 10 percent. After all these optimizations, there are few reported bottlenecks, almost every frame is hitting its deadline, and power is within expectations. Now, the World app is a well-optimized app for this platform. Now that we've reduced the number of offscreen passes, replaced the high-polygon assets with reasonable ones, and lowered CPU and power usage, we're going to step through the optimized version of this app. The start screen looks great, and since the shadows weren't adding much to the user experience, this was a good optimization. Next, let's open up the Objects in Orbit. These models look great, even though we are using assets with fewer polygons. So that extra detail was just wasting resources. And finally, we're going to open up the Earth model again and try resizing.
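The view-body fix described above (remove the model creation from the body, keep a reusable Earth entity in the view model, and have the Orbit view use it) might look roughly like the following sketch. It is not the sample app's actual code: the `ViewModel` and `OrbitView` shapes are assumed, and this `makeGlobe` is a hypothetical stand-in that builds a plain sphere with collision shapes in place of the app's real model.

```swift
import SwiftUI
import RealityKit

/// Hypothetical stand-in for the app's `makeGlobe`: building a model and its
/// collision shapes is the expensive work that should not run in a view body.
@MainActor
func makeGlobe() -> Entity {
    let globe = ModelEntity(
        mesh: .generateSphere(radius: 0.25),
        materials: [SimpleMaterial(color: .blue, isMetallic: false)]
    )
    globe.generateCollisionShapes(recursive: true)
    return globe
}

/// Assumed view model shape; the sample app's real type may differ.
@MainActor
final class ViewModel: ObservableObject {
    /// Reusable Earth entity, created once when the view model is created.
    let earthEntity: Entity

    init() {
        earthEntity = makeGlobe()
    }
}

struct OrbitView: View {
    @ObservedObject var model: ViewModel

    var body: some View {
        // The body only references the cached entity, so recomputing it
        // after a state change does not rebuild the model or its collision shapes.
        RealityView { content in
            content.add(model.earthEntity)
        }
    }
}
```

The design choice is simply about where the expensive call lives: the view model owns the entity for its whole lifetime, while the body stays cheap enough to be recomputed whenever view state changes.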
Now this interaction is as smooth as butter. That was a brief overview of how to use RealityKit Trace to optimize your apps for this new platform. Hey, Sarina, what other tools are available for developers? Sarina: There are several tools available to help you optimize your apps for spatial computing. For optimizing SwiftUI content, there are domain-specific instruments in the Instruments app for analyzing SwiftUI, Core Animation, and hangs. You can learn more about the Hangs instrument in the session "Analyze hangs with Instruments." There are also several tools available to optimize your 3D asset-based content. The Time Profiler Instrument can help you find areas where your app is taking the most time, such as when a large amount of time is spent loading assets. The RealityKit Metrics Instrument can help you diagnose when scenes have too many assets or assets that are too complex. Finally, you can also check the complexity of your assets as you're assembling a scene using Reality Composer Pro. To learn more about Reality Composer Pro, watch the session "Meet Reality Composer Pro." If you are using Metal in your app, the most useful tool will be the Metal System Trace template in Instruments. This template has key metrics, such as the GPU timeline, GPU counters, and GPU performance state. To learn more about this template and other tools for profiling Metal content, check out the session "Discover Metal debugging, profiling, and asset creation tools." To recap, performance is essential for this platform. Apps need to be well optimized to deliver the best possible user experience. You can use the RealityKit Trace template to find performance bottlenecks in your app. Profiling proactively with other instruments and checking your content in Reality Composer Pro can also help you find and resolve performance issues. To learn more about how to use the RealityKit Trace template to optimize your apps, please check out the developer documentation. 
And to get a better understanding of performance for this platform, watch the session "Optimize app power and performance for spatial computing." Enjoy optimizing your spatial computing apps, whatever the trace may be. Harjas: Thank you for watching. ♪