HDR10+ AV1 Metadata Handling Specification

Commit Snapshot,

This version:
https://AOMediaCodec.github.io/av1-hdr10plus
Issue Tracking:
GitHub
Editor:
Paul Hearty (Samsung)
Warning

This specification is a draft of a potential new version of this specification and should not be referenced other than as a working draft.

Copyright 2021, The Alliance for Open Media

Licensing information is available at http://aomedia.org/license/

The MATERIALS ARE PROVIDED “AS IS.” The Alliance for Open Media, its members, and its contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials. The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user. IN NO EVENT WILL THE ALLIANCE FOR OPEN MEDIA, ITS MEMBERS, OR CONTRIBUTORS BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Abstract

This document specifies how to use HDR10+ metadata [SMPTE-ST-2094-40] with [AV1], including when supporting [CMAF].

1. Introduction

1.1. Scope

This document specifies how to use HDR10+ metadata [SMPTE-ST-2094-40] with [AV1], including when supporting [CMAF].

HDR10+ dynamic metadata is used with [BT-2100] Perceptual Quantizer (PQ) system streams, generally known as "HDR10". Various tools, services, and devices already create and consume HDR10+ dynamic metadata [SMPTE-ST-2094-40], and AV1 systems can carry it directly. Carriage of HDR10+ in AV1 relies on the existing ITU-T T.35 [ITU-T-T35] syntax defined in [CTA-861]. HDR10+ data is placed in AV1 metadata OBUs with metadata type equal to METADATA_TYPE_ITUT_T35. This document specifies the placement of those OBUs.

1.2. Acronyms

For the purpose of this specification, the following acronyms apply:

2. Use of HDR10+ with AV1 T.35 OBUs

2.1. HDR10+ OBU

In this specification, HDR10+ Metadata is defined as data with the semantics defined in [SMPTE-ST-2094-40], using the syntax defined in [CTA-861], as illustrated in Figure 1.

Figure 1. METADATA_TYPE_ITUT_T35 OBU Structure

Note: AV1 defines the general metadata OBU syntax for HDR10 Static Metadata and ITU-T T.35 Metadata.
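The following Python sketch illustrates how a reader might recognise an HDR10+ payload inside a metadata OBU. It is not part of this specification. METADATA_TYPE_ITUT_T35 equal to 4 comes from [AV1]; the T.35 identifier values (country code, provider codes, application identifier) are assumptions based on common HDR10+ practice and are not stated in the text above, so they should be confirmed against [CTA-861]. The function names are hypothetical.

```python
# Sketch: recognise an HDR10+ payload inside an AV1 metadata OBU payload.

METADATA_TYPE_ITUT_T35 = 4  # metadata_type value defined in [AV1]

# Assumed HDR10+ identifiers (not given in this document; check [CTA-861]).
T35_COUNTRY_CODE = 0xB5
T35_PROVIDER_CODE = 0x003C
T35_PROVIDER_ORIENTED_CODE = 0x0001
HDR10PLUS_APPLICATION_ID = 4


def read_leb128(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode an unsigned leb128 value; return (value, next position)."""
    value = 0
    for i in range(8):
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def is_hdr10plus_metadata(payload: bytes) -> bool:
    """payload is the metadata OBU payload, after the OBU header and size."""
    try:
        metadata_type, pos = read_leb128(payload)
    except (IndexError, ValueError):
        return False
    if metadata_type != METADATA_TYPE_ITUT_T35:
        return False
    body = payload[pos:]
    if len(body) < 6:
        return False
    country = body[0]
    provider = int.from_bytes(body[1:3], "big")
    oriented = int.from_bytes(body[3:5], "big")
    app_id = body[5]
    return (country, provider, oriented, app_id) == (
        T35_COUNTRY_CODE,
        T35_PROVIDER_CODE,
        T35_PROVIDER_ORIENTED_CODE,
        HDR10PLUS_APPLICATION_ID,
    )
```

The check stops at the application identifier; decoding the remaining [SMPTE-ST-2094-40] fields is outside the scope of this sketch.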

2.2. Placement of HDR10+ OBUs

As defined in [AV1] and shown in Figure 2, an AOM AV1 coded video sequence consists of one or more temporal units (TUs). A TU contains a series of OBUs: a temporal delimiter, followed by optional sequence headers, optional metadata OBUs, and a sequence of one or more frame headers, each followed by zero or more tile group OBUs, as well as optional padding OBUs.

Consequently, for each frame with show_frame=1 or show_existing_frame=1, there shall be exactly one HDR10+ metadata OBU preceding the frame header for that frame. This OBU shall be located after the last OBU of the previous frame (if any), after the Sequence Header (if any), or after the start of the temporal unit (e.g. after the temporal delimiter, for storage formats where temporal delimiters are preserved).

HDR10+ Dynamic Metadata OBUs are not provided when show_frame=0. For non-layered streams, there is only one HDR10+ Dynamic Metadata OBU per TU. For layered streams, there is only one such OBU per TU per layer. Note that when an AV1 stream is encoded in multiple layers, metadata may apply to a specific layer. In that case, the OBU header for that metadata OBU should include the extension header, with the same values as the extension header of the frame with which it is associated.
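The placement rule can be checked mechanically. The sketch below is illustrative only and assumes a non-layered stream whose OBUs have already been parsed into a simplified list; the Obu record, its kind labels, and the function name are hypothetical and do not correspond to any real parser API.

```python
from dataclasses import dataclass


@dataclass
class Obu:
    kind: str            # hypothetical labels, e.g. "hdr10plus", "frame_header"
    shown: bool = False  # frame headers only: show_frame=1 or show_existing_frame=1


def check_hdr10plus_placement(tu: list[Obu]) -> list[str]:
    """Return a list of placement problems found in one temporal unit."""
    errors = []
    pending = 0  # HDR10+ OBUs seen since the last OBU of the previous frame
    for index, obu in enumerate(tu):
        if obu.kind == "hdr10plus":
            pending += 1
        elif obu.kind == "tile_group" and pending:
            # An HDR10+ OBU appeared between a frame header and its tile groups.
            errors.append(f"OBU {index}: HDR10+ OBU placed inside a frame")
            pending = 0
        elif obu.kind == "frame_header":
            if obu.shown and pending != 1:
                errors.append(
                    f"OBU {index}: shown frame has {pending} HDR10+ OBUs, expected 1"
                )
            elif not obu.shown and pending:
                errors.append(
                    f"OBU {index}: HDR10+ OBU precedes a frame that is not shown"
                )
            pending = 0
    if pending:
        errors.append("HDR10+ OBU is not followed by a frame header")
    return errors
```

An empty result means the temporal unit satisfies the rule as modelled here. Layered streams would additionally need the count tracked per layer, which this sketch does not attempt.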

Figure 2. Example of OBU_Frame Structure

HDR10 Static Metadata (MDCV, MaxCLL and MaxFALL) may be present.

3. Storage and Transport considerations

For formats that use the AV1CodecConfigurationRecord (e.g. ISOBMFF and MPEG-2 TS), HDR10+ Metadata OBUs shall not be present in the configOBUs field of the record.

4. Constraints on Encryption

[AV1-ISOBMFF] indicates that Metadata OBUs may be protected. This specification requires that HDR10 and HDR10+ metadata OBUs be unprotected.

5. ISOBMFF/CMAF

The CMAF AV1 track format addresses structural constraints on ISOBMFF files defined by CMAF.

A CMAF AV1 track that conforms to this specification (i.e. contains HDR10+ metadata OBUs) should use the compatible brand code "cdm4" identified in [CTA-5001] in the `ftyp` box, in addition to the CMAF file brand `av01`.
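As an illustration, the brand signalling could be verified with a short Python sketch such as the one below. It assumes the standard ISOBMFF FileTypeBox layout (32-bit size, 'ftyp' type, major brand, minor version, then a list of four-character compatible brands) and does not handle the 64-bit size form; the function names are hypothetical.

```python
import struct


def parse_ftyp(box: bytes) -> tuple[str, list[str]]:
    """Return (major_brand, compatible_brands) from a complete ftyp box."""
    if len(box) < 16:
        raise ValueError("ftyp box too short")
    size, box_type = struct.unpack(">I4s", box[:8])
    if box_type != b"ftyp":
        raise ValueError("not an ftyp box")
    if size < 16 or size > len(box) or (size - 16) % 4:
        raise ValueError("unsupported or malformed ftyp size")
    major = box[8:12].decode("ascii")
    # Bytes 12..16 hold minor_version; compatible brands follow.
    compatible = [box[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major, compatible


def signals_hdr10plus_av1(box: bytes) -> bool:
    """True when both the 'av01' and 'cdm4' brands are listed."""
    major, compatible = parse_ftyp(box)
    return {"av01", "cdm4"} <= {major, *compatible}
```

Because the use of "cdm4" is a recommendation rather than a requirement, a False result does not by itself mean the track lacks HDR10+ metadata OBUs.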

6. HDR10+ Compatible Manifests

The value of the codecs parameter for AV1 streams defined in [AV1-ISOBMFF] shall remain unchanged when HDR10+ is included.

[DASH] content following [DASH-IOP] should include a Supplemental Descriptor with @schemeIdUri set to "http://dashif.org/metadata/hdr" and @value set to "SMPTE2094-40" in manifest files, to help players identify tracks containing HDR10+.
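A minimal Python sketch of adding and detecting such a descriptor follows. It assumes the Supplemental Descriptor is expressed as a DASH SupplementalProperty element, whose scheme attribute is named schemeIdUri in DASH descriptors, and that it is attached to an AdaptationSet. For brevity the elements are built without the MPD XML namespace, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

HDR_SCHEME = "http://dashif.org/metadata/hdr"
HDR10PLUS_VALUE = "SMPTE2094-40"


def add_hdr10plus_descriptor(adaptation_set: ET.Element) -> ET.Element:
    """Append the HDR10+ SupplementalProperty to an AdaptationSet element."""
    return ET.SubElement(
        adaptation_set,
        "SupplementalProperty",
        {"schemeIdUri": HDR_SCHEME, "value": HDR10PLUS_VALUE},
    )


def has_hdr10plus_descriptor(element: ET.Element) -> bool:
    """True if element or a descendant carries the HDR10+ descriptor."""
    for node in element.iter():
        # Compare the local name so namespaced manifests also match.
        if node.tag.rsplit("}", 1)[-1] != "SupplementalProperty":
            continue
        if (node.get("schemeIdUri") == HDR_SCHEME
                and node.get("value") == HDR10PLUS_VALUE):
            return True
    return False
```

A player could call has_hdr10plus_descriptor on each AdaptationSet it parses; the codecs string itself is unchanged by HDR10+, so it cannot be used for this purpose.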

7. Film Grain Processing

Some AV1 coded bitstreams may contain both HDR10+ metadata and film grain synthesis information. In such cases, it is recommended that decoders perform film grain synthesis before any HDR10+ processing.

8. Example Streams and Tools

Information on this topic is found in the Wiki for this project.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[AV1]
AV1 Bitstream & Decoding Process Specification. Standard. URL: https://aomediacodec.github.io/av1-spec/av1-spec.pdf
[AV1-ISOBMFF]
AV1 Codec ISO Media File Format Binding. Standard. URL: https://aomediacodec.github.io/av1-isobmff/
[BT-2100]
BT.2100. Standard. URL: https://www.itu.int/rec/R-REC-BT.2100
[CMAF]
Information technology — Multimedia application format (MPEG-A) — Part 19: Common media application format (CMAF) for segmented media. Standard. URL: https://www.iso.org/standard/71975.html
[CTA-5001]
CTA-5001-C. Standard. URL: https://shop.cta.tech/products/web-application-video-ecosystem-content-specification
[CTA-861]
ANSI/CTA-861-H. Standard. URL: https://shop.cta.tech/products/a-dtv-profile-for-uncompressed-high-speed-digital-interfaces-cta-861-h
[DASH]
Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats. Standard. URL: https://www.iso.org/standard/79329.html
[ITU-T-T35]
ITU-T T.35. Standard. URL: https://www.itu.int/rec/T-REC-T.35-200002-I/en
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SMPTE-ST-2086]
SMPTE ST 2086:2018. Standard. URL: https://ieeexplore.ieee.org/document/8353899
[SMPTE-ST-2094-40]
SMPTE ST 2094-40:2020. Standard. URL: https://ieeexplore.ieee.org/document/9095450

Informative References

[DASH-IOP]
Guideline for Implementation: DASH-IF Interoperability Points V4.3: On-Demand and Mixed Services, HDR Dynamic Metadata and other Improvements. Standard. URL: https://dashif.org/guidelines/