This specification is a draft of a potential new version of this specification and should not be referenced other than as a working draft.
Copyright 2021, The Alliance for Open Media
Licensing information is available at http://aomedia.org/license/
The MATERIALS ARE PROVIDED “AS IS.” The Alliance for Open Media, its members, and its contributors
expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of
merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials.
The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user.
IN NO EVENT WILL THE ALLIANCE FOR OPEN MEDIA, ITS MEMBERS, OR CONTRIBUTORS BE LIABLE TO ANY OTHER PARTY
FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER
FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT,
WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT
THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Abstract
This document specifies how to use HDR10+ metadata, [SMPTE-ST-2094-40], with [AV1] including when supporting [CMAF].
1. Introduction
1.1. Scope
This document specifies how to use HDR10+ metadata, [SMPTE-ST-2094-40], with [AV1] including when supporting [CMAF].
HDR10+ dynamic metadata is used with [BT-2100] Perceptual Quantizer (PQ) System streams, generally known as "HDR10". Various tools, services, and devices support the creation and use of HDR10+ dynamic metadata [SMPTE-ST-2094-40], which can be used directly in AV1 systems. Carriage of HDR10+ in AV1 leverages the existing ITU-T T.35 [ITU-T-T35] support, which is defined in [CTA-861]. HDR10+ data is placed in AV1 metadata OBUs with metadata type equal to METADATA_TYPE_ITUT_T35. This document covers the details of the OBU placement.
1.2. Acronyms
For the purpose of this specification, the following acronyms apply:
In this specification, HDR10+ Metadata is defined as data with the semantics defined in [SMPTE-ST-2094-40], using the syntax defined in [CTA-861], as illustrated in Figure 1.
Figure 1. METADATA_TYPE_ITUT_T35 OBU Structure
Note: AV1 defines the general metadata OBU syntax for HDR10 Static Metadata and ITU-T T.35 Metadata.
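As an informal illustration of the structure described above, the following Python sketch tests whether the payload of a metadata OBU looks like HDR10+ metadata. The numeric values are assumptions taken from outside this text: METADATA_TYPE_ITUT_T35 = 4 comes from [AV1], and the T.35 signature (country code 0xB5, terminal provider code 0x003C, provider oriented code 0x0001, application identifier 4) is what [CTA-861] is understood to assign to [SMPTE-ST-2094-40] data. The function names are hypothetical.

```python
METADATA_TYPE_ITUT_T35 = 4  # assumption: value assigned in the AV1 specification


def read_leb128(buf, pos):
    """Decode an AV1 leb128 value at buf[pos]; return (value, next_position)."""
    value = 0
    for i in range(8):  # AV1 limits leb128 values to 8 bytes
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def is_hdr10plus_payload(payload):
    """payload: bytes of a metadata OBU, starting at the metadata_type field."""
    metadata_type, pos = read_leb128(payload, 0)
    if metadata_type != METADATA_TYPE_ITUT_T35:
        return False
    t35 = payload[pos:pos + 6]
    if len(t35) < 6:
        return False
    country_code = t35[0]
    provider_code = int.from_bytes(t35[1:3], "big")
    provider_oriented_code = int.from_bytes(t35[3:5], "big")
    application_identifier = t35[5]
    # Assumed HDR10+ signature; verify against [CTA-861] before relying on it.
    return (country_code, provider_code, provider_oriented_code,
            application_identifier) == (0xB5, 0x003C, 0x0001, 4)
```

The sketch only inspects the identifying prefix; it does not parse or validate the [SMPTE-ST-2094-40] fields that follow.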
2.2. Placement of HDR10+ OBUs
As defined in [AV1] and shown in Figure 2, an AOM AV1 coded video sequence consists of one or more TUs. A TU contains a series of OBUs: a temporal delimiter, followed by optional sequence headers, optional metadata OBUs, and a sequence of one or more frame headers, each followed by zero or more tile group OBUs, as well as optional padding OBUs.
Consequently, for each frame with show_frame=1 or show_existing_frame=1, there shall be one and only one HDR10+ metadata OBU preceding the frame header for this frame and located after the last OBU of the previous frame (if any) or after the Sequence Header (if any) or after the start of the temporal unit (e.g. after the temporal delimiter, for storage formats where temporal delimiters are preserved).
HDR10+ Dynamic Metadata OBUs are not provided when show_frame=0. For non-layered streams, there is only one HDR10+ Dynamic Metadata OBU per TU. For layered streams, there is only one such OBU per TU per layer. Note that when an AV1 stream is encoded in multiple layers, metadata may apply to a specific layer, in which case the Metadata OBU should carry an OBU extension header whose values match those in the OBU header of the frame with which it is associated.
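The placement rules above can be sketched as a simple check over one temporal unit. This is a minimal illustration under stated assumptions, not a conformance tool: each OBU is modelled as a hypothetical `(kind, shown)` tuple that a real parser would have to produce, `shown` stands for show_frame=1 or show_existing_frame=1, a combined frame OBU is treated as a `"frame_header"`, and the per-layer matching of extension headers is not modelled.

```python
def check_temporal_unit(obus):
    """Return a list of placement errors for one temporal unit.

    obus: list of (kind, shown) tuples in bitstream order, where kind is one of
    "temporal_delimiter", "sequence_header", "hdr10plus", "frame_header",
    "tile_group", "padding" (hypothetical labels used only in this sketch).
    """
    errors = []
    pending = 0  # HDR10+ OBUs seen since the previous frame header
    for index, (kind, shown) in enumerate(obus):
        if kind == "hdr10plus":
            pending += 1
        elif kind == "frame_header":
            if shown and pending != 1:
                errors.append(
                    f"OBU {index}: shown frame preceded by {pending} "
                    "HDR10+ OBUs, expected exactly 1")
            elif not shown and pending != 0:
                errors.append(
                    f"OBU {index}: frame with show_frame=0 preceded by "
                    f"{pending} HDR10+ OBUs, expected none")
            pending = 0
    if pending:
        errors.append("HDR10+ OBU found after the last frame header")
    return errors


tu = [
    ("temporal_delimiter", False),
    ("sequence_header", False),
    ("hdr10plus", False),
    ("frame_header", True),
    ("tile_group", False),
]
print(check_temporal_unit(tu))
```

An HDR10+ OBU that appears after a frame's tile groups is counted toward the next frame header, which matches the "after the last OBU of the previous frame" wording.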
For formats that use the AV1CodecConfigurationRecord (e.g. ISOBMFF and MPEG-2 TS), HDR10+ Metadata OBUs shall not be present in the configOBUs field of the record.
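A packager could enforce this constraint by scanning the configOBUs bytes before writing the record. The sketch below is illustrative only and rests on several assumptions: the OBU header layout and OBU_METADATA = 5 are taken from [AV1], every OBU is assumed to have obu_has_size_field = 1, metadata_type is assumed to be encoded as a single leb128 byte, and the 7-byte HDR10+ prefix is the signature understood to be assigned by [CTA-861]. The function names are hypothetical.

```python
OBU_METADATA = 5  # assumption: obu_type value from the AV1 specification

# Assumed prefix of an HDR10+ metadata OBU payload: metadata_type 4,
# country code 0xB5, provider code 0x003C, oriented code 0x0001, app id 4.
HDR10PLUS_PREFIX = bytes([0x04, 0xB5, 0x00, 0x3C, 0x00, 0x01, 0x04])


def read_leb128(buf, pos):
    """Decode an AV1 leb128 value at buf[pos]; return (value, next_position)."""
    value = 0
    for i in range(8):
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def iter_obus(data):
    """Yield (obu_type, payload) for each OBU in a byte string."""
    pos = 0
    while pos < len(data):
        header = data[pos]
        obu_type = (header >> 3) & 0x0F
        has_extension = bool(header & 0x04)
        has_size_field = bool(header & 0x02)
        pos += 2 if has_extension else 1
        if not has_size_field:
            raise ValueError("this sketch requires obu_has_size_field = 1")
        size, pos = read_leb128(data, pos)
        yield obu_type, data[pos:pos + size]
        pos += size


def config_obus_allowed(config_obus):
    """Return False if configOBUs contains an HDR10+ metadata OBU."""
    for obu_type, payload in iter_obus(config_obus):
        if obu_type == OBU_METADATA and payload.startswith(HDR10PLUS_PREFIX):
            return False
    return True
```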
4. Constraints on Encryption
[AV1-ISOBMFF] indicates that Metadata OBUs may be protected. This specification requires that HDR10 and HDR10+ metadata OBUs be unprotected.
5. ISOBMFF/CMAF
The CMAF AV1 track format addresses structural constraints on ISOBMFF files defined by CMAF.
A CMAF AV1 track that conforms to this specification (i.e. contains HDR10+ metadata OBUs) should use the compatible brand code "cdm4" identified in [CTA-5001] in the `ftyp` box, in addition to the CMAF file brand `av01`.
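As a rough illustration, a reader could check for these brands by parsing the `ftyp` box. The sketch below assumes the standard ISOBMFF layout (32-bit size, box type, major brand, minor version, then a list of four-character compatible brands) and does not handle the 64-bit size form. The function names are hypothetical, and counting the major brand alongside the compatible brands is a design choice of this sketch.

```python
import struct


def parse_ftyp(box):
    """Return (major_brand, compatible_brands) from the bytes of an ftyp box."""
    if len(box) < 16:
        raise ValueError("ftyp box too short")
    size, box_type = struct.unpack(">I4s", box[:8])
    if box_type != b"ftyp":
        raise ValueError("not an ftyp box")
    if size < 16 or size > len(box) or (size - 16) % 4:
        raise ValueError("unsupported or malformed ftyp size")
    major = box[8:12].decode("ascii")
    compatible = [box[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major, compatible


def signals_hdr10plus_av1(box, required=("av01", "cdm4")):
    """True if every required brand appears as major or compatible brand."""
    major, compatible = parse_ftyp(box)
    brands = {major, *compatible}
    return all(brand in brands for brand in required)
```

Because the brand is only recommended ("should"), its absence does not prove that a track lacks HDR10+ metadata OBUs.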
6. HDR10+ Compatible Manifests
The value of the codecs parameter for AV1 streams defined in [AV1-ISOBMFF] shall remain unchanged when HDR10+ is included.
[DASH] Content following [DASH-IOP] should include a Supplemental Descriptor with a @schemeIdUri set to "http://dashif.org/metadata/hdr" and a @value set to "SMPTE2094-40" in Manifest files to help players identify tracks containing HDR10+.
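The descriptor above could be emitted as follows. This is a sketch under assumptions: the descriptor is written as a DASH `SupplementalProperty` element, whose scheme attribute is spelled `schemeIdUri` in the DASH XML schema, and it is attached to an `AdaptationSet`. The `mimeType` and `codecs` values are placeholders for illustration, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

HDR_SCHEME_URI = "http://dashif.org/metadata/hdr"
HDR10PLUS_VALUE = "SMPTE2094-40"


def add_hdr10plus_descriptor(adaptation_set):
    """Append the HDR10+ supplemental descriptor to an AdaptationSet element."""
    return ET.SubElement(
        adaptation_set,
        "SupplementalProperty",
        {"schemeIdUri": HDR_SCHEME_URI, "value": HDR10PLUS_VALUE},
    )


# Placeholder attributes; the codecs string is left as it would be without
# HDR10+, since the codecs parameter does not change when HDR10+ is included.
adaptation_set = ET.Element(
    "AdaptationSet", {"mimeType": "video/mp4", "codecs": "av01.0.08M.10"}
)
add_hdr10plus_descriptor(adaptation_set)
print(ET.tostring(adaptation_set, encoding="unicode"))
```

A supplemental descriptor can be ignored by players that do not recognize the scheme, so the track remains playable as plain HDR10.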
7. Film Grain Processing
It is possible that some AV1 coded bitstreams may contain both HDR10+ metadata and film grain synthesis information. It is recommended that decoders in such scenarios perform the film grain synthesis prior to any HDR10+ processing.
8. Example Streams and Tools
Information on this topic is found in the Wiki for this project.
Conformance
Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology.
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.
All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]
Examples in this specification are introduced with the words “for example”
or are set apart from the normative text with class="example", like this:
This is an example of an informative example.
Informative notes begin with the word “Note”
and are set apart from the normative text with class="note", like this: