Video-Bookmarks
===============

Proposed player-independent standard for attaching textual metadata to specific timecode positions within a video file, as markers, annotations or video bookmarks.

Version 1.

## DESCRIPTION

Some video players and video editing applications implement the notion of a marker at various time offsets of a video file. A marker specifies a single timecode position within a video file. Most often a marker indicates the beginning of a section within a presentation, so most markers are used to attach short textual notes to a video. Within the scope of this document, these annotations are further meant to tell player software to begin playback at that position once a user selects a marker. As a means to jump to a certain position of a presentation, we regard these markers as video bookmarks, although they may be used in contexts other than playback and for different purposes. As the overall idea can also be applied to audio files, we might just as well speak of audio or, more generally, media bookmarks.

This document defines a syntax for describing these video bookmarks and offers recommendations for storing video bookmarks along with and embedded in files.

## SYNTAX & DATA STRUCTURES

We store two values per bookmark:

- The timecode offset in the form HH:MM:SS.mmm, where timecode 00:00 is the absolute beginning of the video file and is not relative to any timecode embedded within the file. (H is hours, M minutes, S seconds, m milliseconds.)
- A free-form descriptive text, which can be used for a description, a track title, notes, annotations, enumerations, etc.

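
As an illustration of the two basic values, here is a minimal Python sketch. The names `Bookmark` and `format_timecode` are hypothetical and not part of the proposal; the sketch only assumes that an offset is kept internally as whole milliseconds and rendered in the full HH:MM:SS.mmm form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    """One bookmark: an offset from the start of the file plus free-form text.

    Hypothetical helper type for illustration, not defined by the proposal.
    """
    offset_ms: int          # milliseconds from the absolute beginning of the file
    description: str = ""   # free-form text, may be empty


def format_timecode(offset_ms: int) -> str:
    """Render a millisecond offset in the full HH:MM:SS.mmm form."""
    if offset_ms < 0:
        raise ValueError("offset must not be negative")
    seconds, millis = divmod(offset_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```

For example, `format_timecode(4965000)` yields `01:22:45.000`. Keeping the offset as an integer avoids rounding issues when bookmarks are later sorted or compared.
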
The format further proposes optional metadata beyond the basics above, although no specific syntax has been settled on yet for these optional values:

- A type indicator, which tells implementing players how playback should be organized: played as "one-shot" (the default when omitted), looped, or something like "seek" to make the player seek to the specified position and then pause. Beyond playback, applications can use the type to attach application-specific modes or arguments to markers.
- A second timecode position, an end position, which can be used to define a time span within the file.

Implementing applications should look into multiple storage layers, gather the bookmarks found and present them to the user for selection, sorted by ascending timecode. The storage layers are the filename (the file's basename), filesystem extended attributes (xattr) and the video container's metadata key/value store (if available). Bookmarks may be stored in all three stores, with deduplication done by implementing applications.

If video bookmarks are saved in a file's name and the host filesystem doesn't allow the colon (":") character in filenames, the colon may be replaced with the dot (".").

We store these values in a metadata attribute with the name (key) _video.bookmarks_ as a JSON-encoded array of arrays, or in _video.bookmarks_ plus the child keys _video.bookmark.1_ ... _video.bookmark.n_ in case the underlying storage mechanism can't hold the entire data structure. Bookmarks can be stored either in the Video-Bookmarks textual syntax proposed here or as an array of arrays, encoded as JSON, with a list item layout resembling the textual syntax.

### Timecode format

Both variants, textual syntax and JSON layout, share a common notation for timecode values. The timecode format follows the common convention, meaning it is evaluated backwards, from the rightmost field to the leftmost.
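
Backwards evaluation can be sketched in Python as follows. This is a minimal sketch under the rules stated in this section: the rightmost field is seconds, a fractional part carries the milliseconds, and a bare number counts as seconds. The function name `parse_timecode` is hypothetical, and the dot-for-colon substitution allowed in filenames is deliberately not handled here.

```python
def parse_timecode(text: str) -> int:
    """Return the offset in milliseconds for a timecode such as '1:22:00' or '0.123'.

    Illustrative sketch only; the name and error handling are assumptions.
    """
    fields = text.strip().split(":")
    if len(fields) > 3:
        raise ValueError(f"too many fields in timecode: {text!r}")

    # The rightmost field is seconds, optionally followed by '.mmm'.
    whole, dot, fraction = fields[-1].partition(".")
    if not whole.isdecimal() or (dot and not fraction.isdecimal()):
        raise ValueError(f"not a timecode: {text!r}")
    # Pad or truncate the fraction to exactly three digits (milliseconds).
    millis = int((fraction + "000")[:3]) if dot else 0
    total_seconds = int(whole)

    # Walk the remaining fields from right to left: minutes first, then hours.
    for multiplier, field in zip((60, 3600), reversed(fields[:-1])):
        if not field.isdecimal():
            raise ValueError(f"not a timecode: {text!r}")
        total_seconds += int(field) * multiplier

    return total_seconds * 1000 + millis
```

Working on strings rather than floats keeps the millisecond part exact, so `parse_timecode("0.123")` gives 123 and `parse_timecode("123")` gives 123000.
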
A timecode of _0:22_ means 22 seconds and zero minutes; a timecode of _1:22:00_ means one hour, 22 minutes and zero seconds, with the leading zero omitted in the hours position. When only a number is given, like _123_, it is interpreted as seconds. This follows a convention established by ffmpeg / avconv and runs contrary to the stated paradigm of evaluating timecode backwards (under which it would be a millisecond value, with all leading hours, minutes and seconds omitted). The briefest form of defining a millisecond offset under this scheme would be _0.123_.

Timecode is not frame-rate dependent, nor does the last string element refer to a specific frame. All numbers refer to absolute runtime at intended playback speed.

### Textual syntax in filenames

The textual syntax for bookmarks in video files is mainly targeted at use in a file's name, its basename. It looks similar to links in the Markdown markup language, but Video-Bookmarks have the URI and text positions exchanged. Here the timecode is in square brackets, with the description in round brackets (parentheses):

    [timecode](description)

for example:

    [01:22:45](here starts some moment in the video)

embedded into a filename:

    /path/to/video/Videofile_123_xyz_[01:22:45](here starts some moment in the video).mp4

An empty description must be defined as an empty string (or as JSON null, see below). Bookmarks set in filenames must consist of two square and two round brackets, even when the description is empty. So _\[00:58:14]()_ is a valid video bookmark / marker. For readability, markers in filenames should be separated from other filename elements by either a space or a dash.

### JSON structure in extended attributes

The proposed standard requires video bookmarks to be stored in defined extended attribute (xattr) keys. The idea behind this requirement and the storage structure is to minimise I/O lookups: a single test for one extended attribute (_video.bookmarks_) reveals whether video bookmarks are present in xattr.

Video bookmarks stored in xattr require the metadata key _video.bookmarks_ to be set. The value of _video.bookmarks_ is either a number n, indicating that there are n more _video.bookmark.n_ key/value attributes set, or a list of arrays encoded as JSON.

The JSON data structure for video bookmarks is an array of arrays: each element in the parent/main array is a reference to an array which represents one bookmark. List positions in each bookmark array have implicit meaning: position one (array position 0) holds the timecode, position two (array position 1) holds the bookmark's description. Subsequent positions may hold values whose meaning is to be defined in the future (compare "optional metadata beyond the basics" above).

Example storage layouts:

    video.bookmarks = [00:00:05](this is a single bookmark in textual syntax)

    video.bookmarks = [["00:00:05", "this is a single bookmark in JSON syntax"]]

    video.bookmarks = 2
    video.bookmark.1 = [["00:00:05", "this is bookmark one, in JSON syntax"],["00:00:12", "this is bookmark two"]]
    video.bookmark.2 = [00:00:05](this is bookmark three, in textual syntax)

As the xattr storage structure allows either the textual syntax or the JSON layout, implementing software must test for both variants.

### Storage in media container metadata-store

If the container format of a media file allows arbitrary metadata to be stored about a file, this store can be used to attach video bookmarks as well. Whether the textual syntax or the JSON layout is used depends on the metadata store's capabilities and size restrictions.

## SEE ALSO / RELATED TECHNOLOGY

### Cue Sheets

Cue sheets, or .cue files, describe how individual tracks are laid out in an otherwise monolithic audio file. Sometimes CD contents, after being extracted from physical media, are stored as one continuous file, with the attached cue sheet containing the offsets of the individual tracks within the file, similar to a playlist. Some audio containers, most notably FLAC, can embed all or part of a cue sheet, and supporting media players can treat such a combination as individual tracks, although the resulting sub-playlists end up being problematic in relation to the predominant "one file = one playlist item" paradigm.

### Chapters in media containers

Most modern media containers offer a facility to store or define information about chapters in a file, similar to DVD or Blu-ray presentations. Although most players support chapters, users very often have no means to add chapters to a file or to edit defined chapters.

### Markers in WAV files

The RIFF file format, the data format of WAV files, allows cue chunks to be added to a file. These chunks can define named cue points for certain positions in the audio file. Cue chunks are widely supported by audio software but are limited to audio, and to WAV files.

### Subtitles formats

Subtitle files define text to be displayed at specified timecodes of a video file.
As such, this technology solves similar problems in terms of attaching metadata to certain time offsets within a video file. But given the established use case and common implementations, using subtitle files to store other metadata such as marks, annotations or bookmarks would lead to confusion.

## AUTHOR

Development of the _Video Bookmarks_ standards proposal and attached software has been funded by Clipland GmbH, [clipland.com](http://www.clipland.com/)

## Copyright and License

This standards proposal is Copyright 2015-2017 Clipland GmbH. All rights reserved.

Clipland GmbH licenses this standard and its documentation to the public under the GNU Free Documentation License (GNU FDL or GFDL) Version 1.3.