

Data model for videos.

The Video class is a SLEAP data structure that stores information about a video and its components.


The Video class is used by SLEAP to represent a video and the data associated with it.

It stores the video's filename, shape, and backend.

To create a Video object, use the from_filename class method, which selects the appropriate backend.


Name Type Description
filename str | list[str]

The filename(s) of the video. Supported extensions: "mp4", "avi", "mov", "mj2", "mkv", "h5", "hdf5", "slp", "png", "jpg", "jpeg", "tif", "tiff", "bmp". If the filename is a list, a list of image filenames is expected. If the filename is a folder, it will be searched for images.

backend Optional[VideoBackend]

An object that implements the basic methods for reading and manipulating frames of a specific video type.

backend_metadata dict[str, any]

A dictionary of metadata specific to the backend. This is useful for storing metadata that requires an open backend (e.g., shape information) without having access to the video file itself.

source_video Optional[Video]

The source video object if this is a proxy video. This is present when the video contains an embedded subset of frames from another video.


Instances of this class are hashed by identity, not by value. This means that two Video instances with the same attributes will NOT be considered equal in a set or dict.
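The identity-based hashing described above can be illustrated without a real video file. The sketch below uses a hypothetical stand-in class, `FakeVideo`, built with the standard library's `dataclasses` and `eq=False`. It is assumed to behave like `Video` in this respect and is not a copy of its definition.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps the default identity-based __eq__ and __hash__
class FakeVideo:
    """Hypothetical stand-in for Video; not part of sleap_io."""

    filename: str


a = FakeVideo("clip.mp4")
b = FakeVideo("clip.mp4")

# Same attributes, but two distinct objects: both survive in a set.
videos = {a, b}
lookup = {a: "first"}

print(len(videos))  # 2
print(a == b)       # False
print(b in lookup)  # False
```

In practice this means the same object should be reused as a dictionary key or set member, since an equal-looking copy will not match.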

See also: VideoBackend

Source code in sleap_io/model/
class Video:
    """`Video` class used by sleap to represent videos and data associated with them.

    This class is used to store information regarding a video and its components.
    It is used to store the video's `filename`, `shape`, and the video's `backend`.

    To create a `Video` object, use the `from_filename` method which will select the
    backend appropriately.

    Attributes:
        filename: The filename(s) of the video. Supported extensions: "mp4", "avi",
            "mov", "mj2", "mkv", "h5", "hdf5", "slp", "png", "jpg", "jpeg", "tif",
            "tiff", "bmp". If the filename is a list, a list of image filenames is
            expected. If filename is a folder, it will be searched for images.
        backend: An object that implements the basic methods for reading and
            manipulating frames of a specific video type.
        backend_metadata: A dictionary of metadata specific to the backend. This is
            useful for storing metadata that requires an open backend (e.g., shape
            information) without having access to the video file itself.
        source_video: The source video object if this is a proxy video. This is present
            when the video contains an embedded subset of frames from another video.

    Notes:
        Instances of this class are hashed by identity, not by value. This means that
        two `Video` instances with the same attributes will NOT be considered equal in a
        set or dict.

    See also: VideoBackend
    """

    filename: str | list[str]
    backend: Optional[VideoBackend] = None
    backend_metadata: dict[str, any] = attrs.field(factory=dict)
    source_video: Optional[Video] = None

    EXTS = MediaVideo.EXTS + HDF5Video.EXTS + ImageVideo.EXTS

    def __attrs_post_init__(self):
        """Post init syntactic sugar."""
        if self.backend is None and self.exists():

    @classmethod
    def from_filename(
        cls,
        filename: str | list[str],
        dataset: Optional[str] = None,
        grayscale: Optional[bool] = None,
        keep_open: bool = True,
        source_video: Optional[Video] = None,
        **kwargs,
    ) -> Video:
        """Create a Video from a filename.

        Args:
            filename: The filename(s) of the video. Supported extensions: "mp4", "avi",
                "mov", "mj2", "mkv", "h5", "hdf5", "slp", "png", "jpg", "jpeg", "tif",
                "tiff", "bmp". If the filename is a list, a list of image filenames is
                expected. If filename is a folder, it will be searched for images.
            dataset: Name of dataset in HDF5 file.
            grayscale: Whether to force grayscale. If None, autodetect on first frame
                load.
            keep_open: Whether to keep the video reader open between calls to read
                frames. If False, will close the reader after each call. If True (the
                default), it will keep the reader open and cache it for subsequent calls
                which may enhance the performance of reading multiple frames.
            source_video: The source video object if this is a proxy video. This is
                present when the video contains an embedded subset of frames from
                another video.

        Returns:
            Video instance with the appropriate backend instantiated.
        """
        return cls(

    @property
    def shape(self) -> Tuple[int, int, int, int] | None:
        """Return the shape of the video as (num_frames, height, width, channels).

        If the video backend is not set or it cannot determine the shape of the video,
        this will return None.
        """
        return self._get_shape()

    def _get_shape(self) -> Tuple[int, int, int, int] | None:
        """Return the shape of the video as (num_frames, height, width, channels).

        This suppresses errors related to querying the backend for the video shape, such
        as when it has not been set or when the video file is not found.
        """
        try:
            return self.backend.shape
        except Exception:
            # Fall back to cached metadata when the backend cannot be queried.
            if "shape" in self.backend_metadata:
                return self.backend_metadata["shape"]
            return None

    @property
    def grayscale(self) -> bool | None:
        """Return whether the video is grayscale.

        If the video backend is not set or it cannot determine whether the video is
        grayscale, this will return None.
        """
        shape = self.shape
        if shape is not None:
            return shape[-1] == 1
        else:
            if "grayscale" in self.backend_metadata:
                return self.backend_metadata["grayscale"]
            return None

    def __len__(self) -> int:
        """Return the length of the video as the number of frames."""
        shape = self.shape
        return 0 if shape is None else shape[0]

    def __repr__(self) -> str:
        """Informal string representation (for print or format)."""
        dataset = (
            f"dataset={self.backend.dataset}, "
            if getattr(self.backend, "dataset", "")
            else ""
        return (
            f'filename="{self.filename}", '
            f"shape={self.shape}, "

    def __str__(self) -> str:
        """Informal string representation (for print or format)."""
        return self.__repr__()

    def __getitem__(self, inds: int | list[int] | slice) -> np.ndarray:
        """Return the frames of the video at the given indices.

        Args:
            inds: Index or list of indices of frames to read.

        Returns:
            Frame or frames as a numpy array of shape `(height, width, channels)` if a
            scalar index is provided, or `(frames, height, width, channels)` if a list
            of indices is provided.

        See also: VideoBackend.get_frame, VideoBackend.get_frames
        """
        if not self.is_open:
        return self.backend[inds]

    def exists(self, check_all: bool = False) -> bool:
        """Check if the video file exists.

        Args:
            check_all: If `True`, check that all filenames in a list exist. If `False`
                (the default), check that the first filename exists.
        """
        if isinstance(self.filename, list):
            if check_all:
                for f in self.filename:
                    if not Path(f).exists():
                        return False
                return True
            else:
                return Path(self.filename[0]).exists()
        return Path(self.filename).exists()

    @property
    def is_open(self) -> bool:
        """Check if the video backend is open."""
        return self.exists() and self.backend is not None

    def open(
        self,
        dataset: Optional[str] = None,
        grayscale: Optional[bool] = None,
        keep_open: bool = True,
    ):
        """Open the video backend for reading.

        Args:
            dataset: Name of dataset in HDF5 file.
            grayscale: Whether to force grayscale. If None, autodetect on first frame
                load.
            keep_open: Whether to keep the video reader open between calls to read
                frames. If False, will close the reader after each call. If True (the
                default), it will keep the reader open and cache it for subsequent calls
                which may enhance the performance of reading multiple frames.

        Notes:
            This is useful for opening the video backend to read frames and then closing
            it after reading all the necessary frames.

            If the backend was already open, it will be closed before opening a new one.
            Values for the HDF5 dataset and grayscale will be remembered if not
            specified.
        """
        if not self.exists():
            raise FileNotFoundError(f"Video file not found: {self.filename}")

        # Try to remember values from previous backend if available and not specified.
        if self.backend is not None:
            if dataset is None:
                dataset = getattr(self.backend, "dataset", None)
            if grayscale is None:
                grayscale = getattr(self.backend, "grayscale", None)

        else:
            if dataset is None and "dataset" in self.backend_metadata:
                dataset = self.backend_metadata["dataset"]
            if grayscale is None and "grayscale" in self.backend_metadata:
                grayscale = self.backend_metadata["grayscale"]

        # Close previous backend if open.

        # Create new backend.
        self.backend = VideoBackend.from_filename(

    def close(self):
        """Close the video backend."""
        if self.backend is not None:
            del self.backend
            self.backend = None

    def replace_filename(
        self, new_filename: str | Path | list[str] | list[Path], open: bool = True
    ):
        """Update the filename of the video, optionally opening the backend.

        Args:
            new_filename: New filename to set for the video.
            open: If `True` (the default), open the backend with the new filename. If
                the new filename does not exist, no error is raised.
        """
        if isinstance(new_filename, Path):
            new_filename = new_filename.as_posix()

        if isinstance(new_filename, list):
            new_filename = [
                p.as_posix() if isinstance(p, Path) else p for p in new_filename
            ]

        self.filename = new_filename

        if open:
            if self.exists():

grayscale: bool | None property

Return whether the video is grayscale.

If the video backend is not set or it cannot determine whether the video is grayscale, this will return None.

is_open: bool property

Check if the video backend is open.

shape: Tuple[int, int, int, int] | None property

Return the shape of the video as (num_frames, height, width, channels).

If the video backend is not set or it cannot determine the shape of the video, this will return None.
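The fallback behaviour of the shape, grayscale and length lookups can be sketched as plain functions. The names `resolve_shape`, `is_grayscale`, `num_frames` and `BrokenBackend` are hypothetical helpers introduced for illustration. They mirror the logic visible in the class source and are not part of the library.

```python
from typing import Optional, Tuple

Shape = Tuple[int, int, int, int]  # (num_frames, height, width, channels)


class BrokenBackend:
    """Hypothetical backend whose shape cannot be read (e.g., missing file)."""

    @property
    def shape(self) -> Shape:
        raise FileNotFoundError("video file not found")


def resolve_shape(backend, backend_metadata: dict) -> Optional[Shape]:
    """Ask the backend first, then fall back to cached metadata, else None."""
    try:
        return backend.shape  # raises AttributeError when backend is None
    except Exception:
        return backend_metadata.get("shape")


def is_grayscale(shape: Optional[Shape], backend_metadata: dict) -> Optional[bool]:
    """A single channel means grayscale; without a shape, use cached metadata."""
    if shape is not None:
        return shape[-1] == 1
    return backend_metadata.get("grayscale")


def num_frames(shape: Optional[Shape]) -> int:
    """Length is the first shape entry, or 0 when the shape is unknown."""
    return 0 if shape is None else shape[0]


cached = {"shape": (100, 384, 384, 1)}
shape = resolve_shape(None, cached)  # no backend, so the cached shape is used
print(shape, is_grayscale(shape, cached), num_frames(shape))
print(resolve_shape(BrokenBackend(), {}))  # None: nothing to fall back to
```

Storing the shape in the metadata lets the length and channel count remain available even when the video file itself cannot be reached.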


__attrs_post_init__()

Post init syntactic sugar.

Source code in sleap_io/model/
def __attrs_post_init__(self):
    """Post init syntactic sugar."""
    if self.backend is None and self.exists():


__getitem__(inds)

Return the frames of the video at the given indices.


Name Type Description Default
inds int | list[int] | slice

Index or list of indices of frames to read.



Type Description

Frame or frames as a numpy array of shape (height, width, channels) if a scalar index is provided, or (frames, height, width, channels) if a list of indices is provided.

See also: VideoBackend.get_frame, VideoBackend.get_frames

Source code in sleap_io/model/
def __getitem__(self, inds: int | list[int] | slice) -> np.ndarray:
    """Return the frames of the video at the given indices.

    Args:
        inds: Index or list of indices of frames to read.

    Returns:
        Frame or frames as a numpy array of shape `(height, width, channels)` if a
        scalar index is provided, or `(frames, height, width, channels)` if a list
        of indices is provided.

    See also: VideoBackend.get_frame, VideoBackend.get_frames
    """
    if not self.is_open:
    return self.backend[inds]


__len__()

Return the length of the video as the number of frames.

Source code in sleap_io/model/
def __len__(self) -> int:
    """Return the length of the video as the number of frames."""
    shape = self.shape
    return 0 if shape is None else shape[0]


__repr__()

Informal string representation (for print or format).

Source code in sleap_io/model/
def __repr__(self) -> str:
    """Informal string representation (for print or format)."""
    dataset = (
        f"dataset={self.backend.dataset}, "
        if getattr(self.backend, "dataset", "")
        else ""
    return (
        f'filename="{self.filename}", '
        f"shape={self.shape}, "


__str__()

Informal string representation (for print or format).

Source code in sleap_io/model/
def __str__(self) -> str:
    """Informal string representation (for print or format)."""
    return self.__repr__()


close()

Close the video backend.

Source code in sleap_io/model/
def close(self):
    """Close the video backend."""
    if self.backend is not None:
        del self.backend
        self.backend = None


exists(check_all=False)

Check if the video file exists.


Name Type Description Default
check_all bool

If True, check that all filenames in a list exist. If False (the default), check that the first filename exists.
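The difference between the two modes shows up when only some files of an image sequence are present. The following sketch reimplements the check as a standalone function, `filenames_exist` (a hypothetical name), and runs it against a temporary folder.

```python
import tempfile
from pathlib import Path


def filenames_exist(filename, check_all: bool = False) -> bool:
    """Mirror of the existence check for a single path or a list of paths."""
    if isinstance(filename, list):
        if check_all:
            return all(Path(f).exists() for f in filename)
        # Default: only the first file is checked, which is cheaper for long lists.
        return Path(filename[0]).exists()
    return Path(filename).exists()


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "frame_0.png"
    present.touch()
    missing = Path(tmp) / "frame_1.png"  # never created
    frames = [str(present), str(missing)]

    first_only = filenames_exist(frames)
    every_file = filenames_exist(frames, check_all=True)

print(first_only, every_file)  # True False
```

The default trades completeness for speed, so `check_all=True` is the safer choice when individual frames may have been moved or deleted.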

Source code in sleap_io/model/
def exists(self, check_all: bool = False) -> bool:
    """Check if the video file exists.

    Args:
        check_all: If `True`, check that all filenames in a list exist. If `False`
            (the default), check that the first filename exists.
    """
    if isinstance(self.filename, list):
        if check_all:
            for f in self.filename:
                if not Path(f).exists():
                    return False
            return True
        else:
            return Path(self.filename[0]).exists()
    return Path(self.filename).exists()

from_filename(filename, dataset=None, grayscale=None, keep_open=True, source_video=None, **kwargs) classmethod

Create a Video from a filename.


Name Type Description Default
filename str | list[str]

The filename(s) of the video. Supported extensions: "mp4", "avi", "mov", "mj2", "mkv", "h5", "hdf5", "slp", "png", "jpg", "jpeg", "tif", "tiff", "bmp". If the filename is a list, a list of image filenames is expected. If the filename is a folder, it will be searched for images.

dataset Optional[str]

Name of dataset in HDF5 file.

grayscale Optional[bool]

Whether to force grayscale. If None, autodetect on first frame load.

keep_open bool

Whether to keep the video reader open between calls to read frames. If False, will close the reader after each call. If True (the default), it will keep the reader open and cache it for subsequent calls which may enhance the performance of reading multiple frames.

source_video Optional[Video]

The source video object if this is a proxy video. This is present when the video contains an embedded subset of frames from another video.



Type Description

Video instance with the appropriate backend instantiated.
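How a backend might be chosen from the filename can be sketched as a lookup on the file extension. The grouping of extensions below is an assumption inferred from the backend class names in `EXTS` (`MediaVideo`, `HDF5Video`, `ImageVideo`). The source only gives the combined list, and `guess_backend_name` is a hypothetical helper, not a library function.

```python
from pathlib import Path

# Assumed grouping; the documentation lists these extensions only as one combined set.
MEDIA_EXTS = ("mp4", "avi", "mov", "mj2", "mkv")
HDF5_EXTS = ("h5", "hdf5", "slp")
IMAGE_EXTS = ("png", "jpg", "jpeg", "tif", "tiff", "bmp")


def guess_backend_name(filename) -> str:
    """Return the name of the backend family that would likely handle `filename`."""
    if isinstance(filename, list):
        return "ImageVideo"  # a list is expected to hold image filenames
    path = Path(filename)
    if path.is_dir():
        return "ImageVideo"  # a folder is searched for images
    ext = path.suffix.lower().lstrip(".")
    if ext in MEDIA_EXTS:
        return "MediaVideo"
    if ext in HDF5_EXTS:
        return "HDF5Video"
    if ext in IMAGE_EXTS:
        return "ImageVideo"
    raise ValueError(f"Unsupported extension: {ext!r}")


print(guess_backend_name("clip.MP4"))            # MediaVideo
print(guess_backend_name("labels.slp"))          # HDF5Video
print(guess_backend_name(["0.png", "1.png"]))    # ImageVideo
```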

Source code in sleap_io/model/
@classmethod
def from_filename(
    cls,
    filename: str | list[str],
    dataset: Optional[str] = None,
    grayscale: Optional[bool] = None,
    keep_open: bool = True,
    source_video: Optional[Video] = None,
    **kwargs,
) -> Video:
    """Create a Video from a filename.

    Args:
        filename: The filename(s) of the video. Supported extensions: "mp4", "avi",
            "mov", "mj2", "mkv", "h5", "hdf5", "slp", "png", "jpg", "jpeg", "tif",
            "tiff", "bmp". If the filename is a list, a list of image filenames is
            expected. If filename is a folder, it will be searched for images.
        dataset: Name of dataset in HDF5 file.
        grayscale: Whether to force grayscale. If None, autodetect on first frame
            load.
        keep_open: Whether to keep the video reader open between calls to read
            frames. If False, will close the reader after each call. If True (the
            default), it will keep the reader open and cache it for subsequent calls
            which may enhance the performance of reading multiple frames.
        source_video: The source video object if this is a proxy video. This is
            present when the video contains an embedded subset of frames from
            another video.

    Returns:
        Video instance with the appropriate backend instantiated.
    """
    return cls(

open(dataset=None, grayscale=None, keep_open=True)

Open the video backend for reading.


Name Type Description Default
dataset Optional[str]

Name of dataset in HDF5 file.

grayscale Optional[bool]

Whether to force grayscale. If None, autodetect on first frame load.

keep_open bool

Whether to keep the video reader open between calls to read frames. If False, will close the reader after each call. If True (the default), it will keep the reader open and cache it for subsequent calls which may enhance the performance of reading multiple frames.


This is useful for opening the video backend to read frames and then closing it after reading all the necessary frames.

If the backend was already open, it will be closed before opening a new one. Values for the HDF5 dataset and grayscale will be remembered if not specified.
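The way unspecified options are remembered can be sketched as a small resolution function. The helper name `remember` is hypothetical, and the sketch assumes that the cached metadata is consulted only when no previous backend is set; a `SimpleNamespace` stands in for a previously opened backend.

```python
from types import SimpleNamespace
from typing import Any, Optional


def remember(name: str, value: Optional[Any], backend, backend_metadata: dict):
    """Resolve an option: explicit value, else previous backend, else metadata."""
    if value is not None:
        return value  # an explicit argument always wins, including False
    if backend is not None:
        return getattr(backend, name, None)
    return backend_metadata.get(name)


previous = SimpleNamespace(dataset="video0/video", grayscale=True)

print(remember("dataset", None, previous, {}))               # video0/video
print(remember("grayscale", False, previous, {}))            # False
print(remember("dataset", None, None, {"dataset": "box"}))   # box
```

Testing against `None` rather than truthiness matters here, because `grayscale=False` is a meaningful explicit choice that must not be overridden by a remembered `True`.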

Source code in sleap_io/model/
def open(
    self,
    dataset: Optional[str] = None,
    grayscale: Optional[bool] = None,
    keep_open: bool = True,
):
    """Open the video backend for reading.

    Args:
        dataset: Name of dataset in HDF5 file.
        grayscale: Whether to force grayscale. If None, autodetect on first frame
            load.
        keep_open: Whether to keep the video reader open between calls to read
            frames. If False, will close the reader after each call. If True (the
            default), it will keep the reader open and cache it for subsequent calls
            which may enhance the performance of reading multiple frames.

    Notes:
        This is useful for opening the video backend to read frames and then closing
        it after reading all the necessary frames.

        If the backend was already open, it will be closed before opening a new one.
        Values for the HDF5 dataset and grayscale will be remembered if not
        specified.
    """
    if not self.exists():
        raise FileNotFoundError(f"Video file not found: {self.filename}")

    # Try to remember values from previous backend if available and not specified.
    if self.backend is not None:
        if dataset is None:
            dataset = getattr(self.backend, "dataset", None)
        if grayscale is None:
            grayscale = getattr(self.backend, "grayscale", None)

    else:
        if dataset is None and "dataset" in self.backend_metadata:
            dataset = self.backend_metadata["dataset"]
        if grayscale is None and "grayscale" in self.backend_metadata:
            grayscale = self.backend_metadata["grayscale"]

    # Close previous backend if open.

    # Create new backend.
    self.backend = VideoBackend.from_filename(

replace_filename(new_filename, open=True)

Update the filename of the video, optionally opening the backend.


Name Type Description Default
new_filename str | Path | list[str] | list[Path]

New filename to set for the video.

open bool

If True (the default), open the backend with the new filename. If the new filename does not exist, no error is raised.
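The filename normalization performed before the new value is stored can be sketched on its own. `normalize_filename` is a hypothetical standalone helper that follows the same steps as the method source: `Path` objects become POSIX-style strings, and plain strings pass through unchanged.

```python
from pathlib import Path


def normalize_filename(new_filename):
    """Convert Path objects (alone or inside a list) to POSIX-style strings."""
    if isinstance(new_filename, Path):
        return new_filename.as_posix()
    if isinstance(new_filename, list):
        return [p.as_posix() if isinstance(p, Path) else p for p in new_filename]
    return new_filename


single = normalize_filename(Path("videos") / "clip.mp4")
mixed = normalize_filename([Path("frames") / "0.png", "frames/1.png"])

print(single)  # videos/clip.mp4
print(mixed)   # ['frames/0.png', 'frames/1.png']
```

Using forward slashes throughout keeps stored filenames consistent regardless of the operating system on which they were set.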

Source code in sleap_io/model/
def replace_filename(
    self, new_filename: str | Path | list[str] | list[Path], open: bool = True
):
    """Update the filename of the video, optionally opening the backend.

    Args:
        new_filename: New filename to set for the video.
        open: If `True` (the default), open the backend with the new filename. If
            the new filename does not exist, no error is raised.
    """
    if isinstance(new_filename, Path):
        new_filename = new_filename.as_posix()

    if isinstance(new_filename, list):
        new_filename = [
            p.as_posix() if isinstance(p, Path) else p for p in new_filename
        ]

    self.filename = new_filename

    if open:
        if self.exists():