- Fixed an issue when reading from a buffer
- Renamed and reorganized a few of the modules
- Parsing methods are now class methods: read, from_srt and from_sbv
- Improved usability by adding shortcuts that avoid instantiating the classes, so we can do:
webvtt.read('captions.vtt') # this will return a WebVTT instance
- Support for saving cue identifiers
The main goal of this release is to refactor the WebVTT parser so that parsing is simpler and new features of the format can be supported.
- Support for cue identifiers
- Support for parsing WebVTT captions with comments
- Support for parsing WebVTT captions with Style blocks
- Support for BOM in caption files
- Added method to write the captions to an opened file
- Convert WebVTT to SRT format
- Ignore empty captions in SRT format
- Refactored WebVTT parser
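
As an illustration of the BOM entry above, here is a minimal sketch of how a leading byte order mark can be dropped when decoding UTF-8 caption data. The helper name `decode_caption_bytes` is hypothetical and this is not necessarily how the parser itself does it; it relies only on Python's standard `utf-8-sig` codec.

```python
import codecs


def decode_caption_bytes(data: bytes) -> str:
    """Decode UTF-8 caption data, dropping a leading BOM if one is present.

    Hypothetical helper for illustration only, not part of the library.
    """
    # "utf-8-sig" skips a leading BOM and otherwise behaves like "utf-8".
    return data.decode("utf-8-sig")


# Caption data as it might be saved by an editor that writes a BOM.
with_bom = codecs.BOM_UTF8 + b"WEBVTT\n\n00:01.000 --> 00:02.000\nHi\n"
text = decode_caption_bytes(with_bom)
print(text.startswith("WEBVTT"))
```

Without this step the first line would start with an invisible `\ufeff` character, so a check for the `WEBVTT` signature at the start of the file could fail.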
The text for the caption is now returned clean (tags removed). The cue text could contain tags like:
* timestamp tags: <00:19.000>
* class tags: <c.classname>text</c>
* and others…
Important: It currently removes any tag present in the cue text. For example <b> would be removed.
Also, a new attribute is available on captions to retrieve the text with its tags left intact: raw_text
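
The tag cleaning described above can be sketched roughly as follows. This is a minimal approximation using a regular expression, not the library's actual implementation; the names `TAG_PATTERN` and `strip_cue_tags` are hypothetical.

```python
import re

# Matches anything that looks like a tag: "<", any characters except ">", then ">".
# Hypothetical pattern for illustration only.
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_cue_tags(raw_text: str) -> str:
    """Remove every <...> tag from the cue text, keeping the text between tags."""
    return TAG_PATTERN.sub("", raw_text)


raw = "<c.loud>Hello</c> <00:19.000>world, <b>again</b>"
print(strip_cue_tags(raw))  # Hello world, again
```

Note that, like the behaviour described in the release note, this sketch removes every tag, including formatting tags such as <b>.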
The goal of this release is to allow the WebVTT parser to read caption files that contain metadata headers spanning more than one line.
- Made hours in WebVTT parser optional as per specs.
- Added support to parse WebVTT files that contain metadata headers.
- Added support for YouTube SBV captions.
- Added easy iteration to WebVTT class.
- New CLI command for segmenting captions for HLS.
- Improved parsers to reuse functionality.
- Added an exception for invalid timestamps in captions.
- Added an exception when saving without a filename.
- Refactor of the main module and parsers.
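
Two of the entries above concern timestamps: hours are optional in WebVTT, and invalid timestamps raise an exception. A minimal sketch of both ideas follows. The exception name `InvalidTimestampError`, the pattern and the function are all hypothetical and are not taken from the library.

```python
import re


class InvalidTimestampError(ValueError):
    """Hypothetical exception name used for this sketch only."""


# The hours group is optional, so both "00:19.000" and "01:00:19.000" match.
TIMESTAMP_PATTERN = re.compile(r"^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$")


def timestamp_to_seconds(value: str) -> float:
    """Convert a WebVTT timestamp to seconds, rejecting malformed input."""
    match = TIMESTAMP_PATTERN.match(value.strip())
    if match is None:
        raise InvalidTimestampError(f"invalid timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    # A missing hours component is treated as zero.
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


print(timestamp_to_seconds("00:19.000"))     # 19.0
print(timestamp_to_seconds("01:00:19.500"))  # 3619.5
```

Raising a dedicated exception, rather than returning a default value, makes a malformed caption file fail loudly at parse time.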
This module is released with the following initial features:
- Read/Edit/Write WebVTT captions.
- Read SRT captions and convert to WebVTT.
- Segment WebVTT files for captioning HLS video.