Create camera extensions with Core Media IO
Discover how you can use Core Media IO to easily create macOS system extensions for software cameras, hardware cameras, and creative cameras. We'll introduce you to our modern replacement for legacy DAL plug-ins. These extensions are secure, fast, and fully compatible with any app that uses a camera input. We'll take you through the Core Media IO APIs and share how they can support camera manufacturers, video conferencing apps with special effects features, creative app ideas, and more.
- Capture setup
- Core Media I/O
- Creating a camera extension with Core Media I/O
- Overriding the default USB video class extension
- System Extensions and DriverKit
♪ ♪ Hello and welcome. I'm Brad Ford from the Camera Software Engineering team. In this session, I'll be introducing you to camera extensions with CoreMedia IO, which is a modern camera driver architecture for macOS and a replacement for DAL plug-ins.
DAL plug-ins are a technology that allows you to create camera drivers for hardware that plugs into a Mac, or for virtual cameras. They've been around for a very long time, since macOS 10.7.
DAL plug-ins provide the power to extend macOS as a rich media platform, bringing support for great third party camera products to pros and consumers.
It's part of what makes the Mac, the Mac.
But DAL plug-ins have some problems. They load untrusted code directly into an app's process, making it vulnerable to crashes if the plug-in has bugs, or to malware attacks. For this reason, they don't work with Apple apps such as FaceTime, QuickTime Player, and Photo Booth. They also don't work with many third party camera apps, unless those apps intentionally disable library validation, or the user turns off system integrity protection. Neither of these practices is recommended, as they make the system less secure and less stable. They're difficult to develop too. They carry a C API circa 2011 and a thick SDK of C++ helper classes for you to learn. And on top of all that, they're sparsely documented.
It's time for an upgrade. macOS 12.3 introduces a thoroughly modern replacement for DAL plug-ins called Camera Extensions, an architecture that places user security first. Let's learn how it works.
First, I'll provide a technology overview. Next, I'll show you how to build a camera extension from scratch. Next, I'll introduce the main classes and functions of the API. I'll explain how CoreMedia IO Extensions can be used as output devices. And finally, I'll cover our DAL plug-in deprecation plan. Let's get started.
Camera extensions, otherwise known as CoreMedia IO extensions, are a new way to package and deliver camera drivers to Mac applications.
They're secure. Your extension code is cordoned off into its own daemon process that's sandboxed and run as a role user. All the buffers your extension provides are validated before being delivered to an app. They're fast. The framework handles the IPC layers between your extension process and the app, with an emphasis on performance. The framework can also take care of delivering buffers to multiple simultaneous clients. They're modern. Your extension can be written in either Swift or Objective-C.
They're simple. There are just a few classes to learn, a few protocols to implement in order to get up and running. The framework takes care of the boilerplate code.
They're easy to deploy. You can ship them as apps in the App Store.
And camera extensions are 100% backward compatible with existing AVFoundation capture APIs.
Camera extensions show up just like the built-in camera in all camera apps, including Apple apps. Here's an example of how a camera extension might appear in the FaceTime camera picker. What kind of experiences can you build with a camera extension? Let's study three common uses. The simplest use is a software-only camera, such as a camera that displays color bars, a unique test pattern, programmatically generated images at various frame rates or resolutions, or a camera that streams pre-rendered content, such as frames in a movie, to test A/V synchronization.
The second use case is a driver for a camera that you intend to physically plug into a Mac or discover wirelessly. Camera extensions fully support hot plugging and unplugging. To address your hardware, you have a few choices. The preferred method is to use a DriverKit Extension, or DEXT, which runs entirely in user space. If your hardware must be addressed at the kernel level, you can use the legacy IOVideoFamily kext path. Development of new kext code is discouraged as kexts are inherently less secure and can contribute to system instability.
Apple provides a class compliant extension for USB video class, or UVC, cameras. It works great for cameras that conform to the UVC spec.
If, however, you need to support a USB camera that uses a nonstandard protocol or has additional features outside the UVC spec, you can create a camera extension that overrides Apple's UVC extension, allowing you to claim a particular product and vendor ID. If you're interested in learning more about it, please refer to the article at developer.apple.com entitled "Overriding the default USB video class extension." It explains how to create a minimal DEXT bundle and which IOKitPersonalities keys you need to override in your Info.plist.
A third common use is a creative camera, a hybrid between software and hardware.
Your extension accesses a video stream from another physical camera attached to the Mac, applies an effect to those buffers, and sends them along to clients as a new camera stream.
Or a creative camera that accesses video streams from several cameras, composites them, and sends them along to the app.
A creative camera like this might use a configuration app to control the compositing or parameterize filters. The possibilities for creative cameras are really endless.
Now that we've explored the primary use cases, let's look at the anatomy of a CoreMedia IO Extension. First the "CoreMedia IO" part.
CoreMedia IO is a low level framework for publishing or discovering camera drivers. You already know that it contains the legacy DAL API and the new camera extension API that replaces it. But it also contains a powerful set of low level C APIs for app developers to find and inspect cameras on the system.
Now, how about that "Extension" part? CoreMedia IO Extensions are built on top of the SystemExtensions framework which first appeared in macOS Catalina. It obviates the need for a throw-away installer. Instead, you ship your extension inside an app. The extension executable lives within the app bundle. By making calls into the SystemExtensions framework, your app can install, upgrade, or downgrade your extension for all users on the system. And uninstalling is a snap. Delete the app and the SystemExtensions framework uninstalls your camera extension for all users. This delivery mechanism is approved for App Store use, making it easy to deploy your camera extension to a wide audience.
To learn more about the system extensions framework, you can read the documentation at developer.apple.com/documentation/systemextensions.
And be sure to check out the WWDC 2019 video entitled "System Extensions and DriverKit." That's it for our technology overview of camera extensions. Now, let's actually build one. Here's a quick demo of how to get a camera extension up and running in a matter of minutes.
I've created a single window macOS application in Xcode, called ExampleCam. At this point, I've only added a few lines of code.
The App Delegate is unchanged. In the main storyboard, I've added two buttons, one to install and one to uninstall the extension, plus a text field to display status.
In the ViewController class, I've added IBActions to hook up the install and uninstall buttons.
These functions create OSSystemExtensionRequests to either activate or deactivate the extension found within the app's bundle. At the bottom, I've added skeletal implementations of the OSSystemExtensionRequestDelegate functions that log status.
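The install and uninstall actions described here might be sketched as follows. This is a minimal illustration rather than the code shown in the session: the `ExtensionInstaller` class name and the identifier passed to it are assumptions, while the request and delegate calls come from the SystemExtensions framework. It only builds on macOS.

```swift
import Foundation
import SystemExtensions

// Hypothetical helper; the class name and the identifier it is given are
// assumptions for illustration, not code from the session.
final class ExtensionInstaller: NSObject, OSSystemExtensionRequestDelegate {
    private let extensionIdentifier: String

    init(extensionIdentifier: String) {
        self.extensionIdentifier = extensionIdentifier
        super.init()
    }

    // Ask the system to activate the extension embedded in the app bundle.
    func install() {
        let request = OSSystemExtensionRequest.activationRequest(
            forExtensionWithIdentifier: extensionIdentifier, queue: .main)
        request.delegate = self
        OSSystemExtensionManager.shared.submitRequest(request)
    }

    // Ask the system to deactivate the extension.
    func uninstall() {
        let request = OSSystemExtensionRequest.deactivationRequest(
            forExtensionWithIdentifier: extensionIdentifier, queue: .main)
        request.delegate = self
        OSSystemExtensionManager.shared.submitRequest(request)
    }

    // Skeletal delegate implementations that only log status.
    func request(_ request: OSSystemExtensionRequest,
                 actionForReplacingExtension existing: OSSystemExtensionProperties,
                 withExtension ext: OSSystemExtensionProperties)
        -> OSSystemExtensionRequest.ReplacementAction {
        print("Replacing \(existing.bundleVersion) with \(ext.bundleVersion)")
        return .replace
    }

    func requestNeedsUserApproval(_ request: OSSystemExtensionRequest) {
        print("Waiting for user approval")
    }

    func request(_ request: OSSystemExtensionRequest,
                 didFinishWithResult result: OSSystemExtensionRequest.Result) {
        // A raw value of 0 corresponds to .completed.
        print("Finished with result \(result.rawValue)")
    }

    func request(_ request: OSSystemExtensionRequest, didFailWithError error: Error) {
        print("Failed: \(error.localizedDescription)")
    }
}
```

A view controller could hold one instance, created with the bundle identifier of the embedded extension, and call `install()` or `uninstall()` from its button actions. Keeping a strong reference to the installer while a request is in flight is the safer choice.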
The app's entitlements file has the usual App Sandbox=YES and it defines an AppGroup.
I've only added one new key here, the "System Extension" key, which is required if your app installs system extensions. At this point, if I run the app and click on the Install Extension button, I'll just get a fatal error, since the app is looking for an extension in the bundle that doesn't exist yet.
To create and embed a system extension, I go to File, New, Target, and under macOS, I scroll all the way down to the bottom where the System Extensions are located. Then I pick "Camera Extension," hit next, give it a name (I'll choose "Extension"), make sure that "Embedded in Application" is set, and then click finish. Inside the new extension folder, I get four new files. The Info.plist identifies it as a CMIOExtension by defining its MachServiceName.
This is a critical piece of information. CoreMedia IO's registerassistant will not launch your extension unless it's present.
While we're here, let's give it a usage description for the system extension. The entitlements file shows that it's app sandboxed. And I need to ensure here that my extension's app group is prefixed by the MachServiceName in order for it to pass validation.
So I'll copy and paste that over from the app extension to the extension's entitlements file. And that's it.
The main.swift file serves as your extension's entry point and starts the service. And the ExtensionProvider.swift file gives us a fully functional camera. It contains a DeviceSource, a StreamSource, and a ProviderSource, all that you need to create a pure software camera. Not a bad little template.
In this file, I'll search for "SampleCapture" and I'll replace with "ExampleCam," so that my camera's name, model, and manufacturer all have the proper name.
That's it. Let's compile and run it.
When I hit the Install button, uh-oh, it fails. That's because system extensions can only be installed by apps residing in /Applications. Let's move it and try again.
This time, it succeeds. I'm prompted to Allow the blocked extension to install by authenticating in System Settings, where I find Privacy & Security, and click the Allow button.
I authenticate with my password, and then I see that my result has changed to 0 for "no error." If I use the systemextensionsctl list tool, I confirm that I've succeeded, and now I have one extension active on my system. Now I can open any camera app and find and admire my work.
Let's launch FaceTime. ExampleCam shows up in the camera picker. It sort of looks like the old Pong game from the '70s, drawing a horizontal white line that moves up and down the frame at 60 frames per second.
To get rid of the camera, all I have to do is delete the app.
The system prompts me to confirm that I'm also uninstalling the extension by deleting the app.
The ExampleCam demo shows just how easy it is to make a software camera from scratch. Now let's take it up a notch by turning that software camera into a creative camera.
I call this second example CIFilterCam. The CI stands for CoreImage, a framework with all sorts of effects filters that you can apply to stills or video.
To create CIFilterCam, I began with the ExampleCam shell, but decided to make the app a configuration app as well as an installer. I've added a camera picker button, a filter picker button, and an effect bypass button. I've also added a view for live video preview. This is a standard view backed by an AVCaptureVideoPreviewLayer to show you what the Filter Camera is doing. By unchecking the bypass button, I can see various filters applied to the video, from color effects to distortion filters.
I'm kind of partial to the bump distortion.
I can apply these to the built-in FaceTime camera or to any physical camera attached to my Mac.
I've got my iPhone nearby set up as a Continuity Camera.
Let's use that.
The CIFilterCam app is nothing special in and of itself. Just an effects camera app. Where it really gets interesting, though, is when you realize that the app is a front end to a virtual filter camera that all apps can use. I'll launch FaceTime and Photo Booth and make sure both of them are pointed at the CIFilterCam. Now, as I change filters in my configuration app, every app using CIFilterCam changes in tandem.
If I pick a different source camera, every camera app picks up the change.
Every button click in the app translates to a simple property call to the filter cam extension, telling it, "Hey, extension, use this camera," or, "Hey, extension, use this other filter." Or this other filter.
Or this other filter.
Support for running a hardware camera inside your extension requires macOS Ventura. You also need to add the com.apple.security.device.camera key to your extension's entitlements file, indicating that you will be using another camera. And since you'll be using a camera, the user will be prompted to grant permission to your extension, so you must provide an NSCameraUsageDescription in your Info.plist.
That wraps up the basics of building a camera extension. Now let's move on to the APIs.
At the bottom of the stack are daemon processes, one for each first or third party camera extension.
Within a camera app process, there are several layers at play, beginning with the private framework code that talks to your camera extension over IPC. One level up is another private layer that translates CoreMedia IO Extension calls to legacy DAL plug-in calls.
Up again, we find the public CoreMedia IO APIs that publish DAL plug-ins. To the client of this interface, there's no difference between CoreMedia IO Extensions and DAL plug-ins. Everything looks like a DAL plug-in. And finally, at the top is AVFoundation, which is a client of CoreMedia IO. It re-publishes DAL plug-ins as AVCaptureDevices.
Contrast this with the legacy DAL plug-in architecture. DAL plug-ins may or may not include a daemon piece, but all of them run code loaded by the CoreMedia IO framework directly in the app process. This leaves the app vulnerable to malware. Camera extensions remove this attack vector completely. Your extension must be app sandboxed, or it won't be allowed to run.
Apple's registerassistantservice identifies it by its CMIOExtensionMachServiceName and launches it as a role user account called _cmiodalassistants.
Sandboxd applies a custom sandbox profile to your process. It's tailored for camera use cases.
The custom sandbox profile allows you to communicate over the common hardware interfaces you would expect: USB, Bluetooth, Wi-Fi (as a client, but not as a server that opens ports), and even FireWire. It also allows your extension to read and write from its own container and tmp.
The camera extension sandbox profile is more locked down than a regular app. Some examples of things you can't do are forking, exec'ing, or posix spawning a child process, accessing the window server, making a connection to the foreground user account, or registering your own mach services in the global namespace.
If, as you develop your extension, you find the sandbox too restrictive for a legitimate capture case, please provide us feedback through Feedback Assistant and we'll carefully consider loosening restrictions. The earlier architecture diagram showed your camera extension's daemon process passing buffers directly to the app layer. There's actually one more layer of security involved.
Between your daemon and the app is a proxy service called registerassistantservice. It enforces transparency, consent, and control policy. When an app tries to use a camera for the first time, the system asks the user if it's okay. That consent needs to be granted for all cameras, not just the built-in ones. The proxy service handles this consent on your behalf. If the user has denied camera access, the proxy stops buffers from going to that app. It also handles attribution: it lets the system know that a particular camera is in use by a particular app so that power consumed by your daemon can be attributed to the app that's using your camera.
CoreMedia IO Extensions have three main classes: Provider, Device, and Stream.
Providers have devices and devices have streams, and all three of them can have properties.
You create each of these three main classes by providing a source, respectively, a ProviderSource, DeviceSource, and StreamSource.
The ExtensionProvider is your lowest level object. It lets you add and remove devices as needed, such as for hot plug events.
It gets informed of the client processes as they try to connect, which gives you an opportunity to limit your device publishing to certain apps. It also consults your provider source object for property implementations.
Here's what your extension's main entry point might look like. You create your own ExtensionProviderSource, which conforms to the CMIOExtensionProviderSource protocol and creates an ExtensionProvider. To start your service, you call the provider class method startService and pass your provider instance.
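A minimal sketch of such an entry point is shown below, assuming a provider source that publishes only its name and manufacturer and adds no devices yet. The strings "ExampleCam" and "ExampleCam Manufacturer" are placeholders, and a real extension would also add at least one device to the provider. It only builds on macOS inside a camera extension target.

```swift
import Foundation
import CoreMediaIO

// Minimal provider source: it creates the provider and answers requests for
// the name and manufacturer properties.
final class ExtensionProviderSource: NSObject, CMIOExtensionProviderSource {
    private(set) var provider: CMIOExtensionProvider!

    init(clientQueue: DispatchQueue?) {
        super.init()
        provider = CMIOExtensionProvider(source: self, clientQueue: clientQueue)
        // A real extension would create a device here and call provider.addDevice(_:).
    }

    func connect(to client: CMIOExtensionClient) throws {
        // Called as client processes try to connect.
    }

    func disconnect(from client: CMIOExtensionClient) {
        // Called when a client goes away.
    }

    var availableProperties: Set<CMIOExtensionProperty> {
        return [.providerName, .providerManufacturer]
    }

    func providerProperties(forProperties properties: Set<CMIOExtensionProperty>) throws
        -> CMIOExtensionProviderProperties {
        let providerProperties = CMIOExtensionProviderProperties(dictionary: [:])
        if properties.contains(.providerName) {
            providerProperties.name = "ExampleCam"  // placeholder value
        }
        if properties.contains(.providerManufacturer) {
            providerProperties.manufacturer = "ExampleCam Manufacturer"  // placeholder value
        }
        return providerProperties
    }

    func setProviderProperties(_ providerProperties: CMIOExtensionProviderProperties) throws {
        // Both properties are read only, so there is nothing to set.
    }
}

// Entry point: create the provider source, start the service, and keep the
// process alive so the framework can deliver client requests.
let providerSource = ExtensionProviderSource(clientQueue: nil)
CMIOExtensionProvider.startService(provider: providerSource.provider)
CFRunLoopRun()
```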
ExtensionProvider implements two read only properties that do not change for the life of your extension. The manufacturer and the name of your provider. Both of these are strings.
Next up is the CMIOExtensionDevice. It manages streams, adding or removing them as needed. Your device can present multiple streams, but be aware that AVFoundation ignores all but the first input stream.
When you create a device, you provide a device source, as well as a localized name, a deviceID as a UUID, and, optionally, a legacyID string. These properties percolate all the way up to AVFoundation.
Your device's localizedName becomes the AVCaptureDevice's localizedName. Your specified deviceID becomes the AVCaptureDevice's uniqueIdentifier, unless you also provide a legacyDeviceID. You only need to provide this if you're modernizing a DAL plug-in and need to maintain backward compatibility with the uniqueIdentifier you've previously shipped.
If you provide a legacyDeviceID, AVCaptureDevice will use it as the uniqueIdentifier.
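The identifier rule described here can be modeled in a few lines. This is a plain Swift sketch of the described behavior, not CoreMedia IO code: the `DeviceIdentity` type and its property names are hypothetical, and representing the deviceID as its UUID string is an assumption.

```swift
import Foundation

// Hypothetical model of the values an extension supplies when creating a device.
struct DeviceIdentity {
    let localizedName: String
    let deviceID: UUID
    let legacyDeviceID: String?

    // The identifier a capture client would see: the legacy ID when one is
    // provided, otherwise the device ID (assumed here to be its UUID string).
    var captureDeviceUniqueIdentifier: String {
        return legacyDeviceID ?? deviceID.uuidString
    }
}

let id = UUID()
let fresh = DeviceIdentity(localizedName: "ExampleCam", deviceID: id, legacyDeviceID: nil)
let ported = DeviceIdentity(localizedName: "ExampleCam", deviceID: id,
                            legacyDeviceID: "legacy-plug-in-id")
print(fresh.captureDeviceUniqueIdentifier)
print(ported.captureDeviceUniqueIdentifier)
```

A port of an existing DAL plug-in would pass the identifier it previously shipped as the legacy ID, so apps that saved the old identifier keep finding the camera.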
You create your CMIOExtensionDevice with a CMIOExtensionDeviceSource, which may optionally implement other properties, such as deviceModel, which should be the same for all cameras of the same model. isSuspended should be implemented if your device can enter a suspended state, such as if it has a privacy iris. The built-in cameras on Apple laptops enter the suspended state when the clamshell is closed. Your device's transport type reveals how it's connected, such as via USB, Bluetooth, or FireWire.
Lastly, if you have a microphone physically paired with your camera, you can expose it as a linked device. All of these properties are read only. Next up is the all-important CMIOExtensionStream, which does the heavy lifting in the CMIOExtension. It publishes video formats and defines their valid frame rates and configures the active format. It uses a standard clock, such as the host time clock, or provides its own custom clock to drive the timing of each buffer it produces. And most importantly, it sends sample buffers to clients.
Your extension stream source publishes CMIOExtensionStreamFormats. Those become AVCaptureDeviceFormats. Clients can read and write the active format index to change the active format.
Clients can also read and write the frame duration, which is equivalent to max frame rate, and the max frame duration, which is the same as min frame rate.
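Durations and rates are reciprocals, which is why the shortest frame duration corresponds to the highest frame rate. The sketch below illustrates the arithmetic with a hypothetical `FrameDuration` type, used here so the example stays self-contained; the framework itself expresses durations as CMTime values. The 30 to 60 frames per second range is an example, not a value from the session.

```swift
// Hypothetical stand-in for a rational duration: value / timescale seconds.
struct FrameDuration {
    let value: Int
    let timescale: Int

    // Frames per second is the reciprocal of the duration in seconds.
    var framesPerSecond: Double {
        return Double(timescale) / Double(value)
    }
}

// A format that runs anywhere from 30 to 60 frames per second.
let minFrameDuration = FrameDuration(value: 1, timescale: 60)  // shortest frame
let maxFrameDuration = FrameDuration(value: 1, timescale: 30)  // longest frame

let maxFrameRate = minFrameDuration.framesPerSecond
let minFrameRate = maxFrameDuration.framesPerSecond
print("Frame rate range: \(minFrameRate) to \(maxFrameRate) fps")
```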
The DAL plug-in world exposes a fourth interface called DAL controls. Plug-in developers use these to expose features such as auto exposure, brightness, sharpness, pan and zoom, et cetera. While powerful, they've been implemented inconsistently, so it's difficult for app developers to use them. In the CMIOExtension architecture, we don't offer a DAL control replacement. Instead, everything is a property.
You've already learned about many standard properties at the provider, device, and stream level. You can also make your own custom properties and propagate them to the app layer, just as I did in the CIFilterCam demo.
CoreMedia IO's C property interface uses a C struct to identify a property's selector, scope, and element. These are considered its address.
The selector is the name of the property as a four-character code, such as cust for custom. The scope can be global, input, or output, and the element can be any number you want. The main element is always zero. CMIOExtensions let you bridge your properties to the old world by coding property address elements into a custom property name. First, the characters 4cc_, then the selector, scope, and element as four character codes separated by underscores. Using this method, you can communicate any string or data value to the app layer.
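The naming scheme can be sketched as a small helper. The `customPropertyName` function is hypothetical, and writing the global scope as `glob` and the main element as `0000` are assumptions about how those values are spelled as four-character codes.

```swift
// Builds a custom property name of the form 4cc_<selector>_<scope>_<element>.
// Returns nil unless every part is exactly four ASCII characters.
func customPropertyName(selector: String, scope: String, element: String) -> String? {
    let parts = [selector, scope, element]
    let allFourCharacterCodes = parts.allSatisfy { part in
        part.utf8.count == 4 && part.allSatisfy(\.isASCII)
    }
    guard allFourCharacterCodes else { return nil }
    return "4cc_" + parts.joined(separator: "_")
}

// Assumed spellings: "cust" selector, "glob" scope, "0000" main element.
let name = customPropertyName(selector: "cust", scope: "glob", element: "0000")
print(name ?? "invalid four-character code")

// A part that is not four characters long is rejected.
let rejected = customPropertyName(selector: "custom", scope: "glob", element: "0000")
```

The resulting string could then serve as the raw value of a CMIOExtensionProperty in the extension, while the configuration app addresses the same property through the C API using the matching selector, scope, and element.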
AVFoundation doesn't work with custom properties, so you must stick to the CoreMedia IO C API if your configuration app needs to work with custom properties. That's our high-level look at the API. Now let's talk about output devices.
A lesser known feature of DAL plug-ins is their ability to present the opposite of a camera: an output device, which consumes video from an app in real time rather than providing it. This is the "O" part of CoreMedia IO. Input and Output. Output devices are common in the pro video world. Some common uses are print-to-tape, where a video signal is sent to an external recorder, or real-time preview monitoring, such as on a pro deck with SDI inputs.
One important thing to note is that output devices have no AVFoundation API equivalent. To send frames to an output device, you must use the CoreMedia IO C API directly.
CMIOExtension streams are created with a direction of either source or sink. Sink streams consume data from an app. Clients feed your sink stream by inserting sample buffers into a simple queue. That translates to a consumeSampleBuffer call in your extension, and once you've consumed that buffer, you notify them with notifyScheduledOutputChanged.
There are a number of stream properties specific to output devices. They mainly deal with the queue sizing, how many frames to buffer before starting, and signaling when all data has been consumed.
Now on to our fifth and final topic of the day.
Earlier in the presentation, I showed this diagram of the DAL plug-in architecture and I highlighted its many security problems. We've addressed these shortcomings with Camera Extensions and are fully committed to their continued development. They are the path forward. So what does that mean for DAL plug-ins? It means the end is near.
As of macOS 12.3, DAL plug-ins are already deprecated, so you get a compilation warning when building. That's a good start, but it's not enough. As long as legacy DAL plug-ins are allowed to load, camera apps will still be at risk.
To fully address security vulnerabilities and make the system more robust for all users, we plan to disable DAL plug-ins entirely in the next major release after macOS Ventura.
What does this mean for you? Well, we hope the message is clear. If you currently maintain a DAL plug-in, now is the time to begin porting your code to a Camera Extension.
And please, let us know what friction you encounter. We are eager to address these issues and provide a rich feature set. We really look forward to working with you. This concludes today's presentation on camera extensions for macOS. We can't wait to see what fresh and creative camera experiences you'll bring to the Mac. And hope you have fun doing it.