As video streaming becomes ubiquitous, subtitles need to adapt to web platforms. WebVTT (Web Video Text Tracks) builds on SRT's simplicity while adding features specifically designed for web delivery.
A Brief History#
When HTML5 video emerged, it became clear that subtitles needed to evolve. Browser vendors and streaming platforms required a format that could handle modern web needs: precise styling, multiple languages, and accessibility features. WebVTT was born from these requirements and is now published as a W3C specification for web subtitles.
From SRT to WebVTT#
If you're familiar with SRT files, WebVTT will feel natural. Let's look at the same subtitle in both formats:
SRT:

1
00:00:01,000 --> 00:00:04,000
Hello, world!

WebVTT:

WEBVTT

00:00:01.000 --> 00:00:04.000
Hello, world!
The similarities are clear, but WebVTT introduces some key differences:
Required "WEBVTT" header
Numbers before timestamps are optional (WebVTT treats them as cue identifiers)
Periods instead of commas in timestamps (01:00.000 instead of 01:00,000)
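The differences listed above are small enough that a basic conversion can be sketched in a few lines of Python. This is a minimal sketch, not a full converter: the function name srt_to_webvtt is hypothetical, and it assumes well-formed SRT input with no formatting tags that need translating.

```python
import re

# Matches the SRT timing pattern, e.g. "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_webvtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT (hypothetical helper, illustrative only)."""
    # Normalize line endings and drop a leading byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    # Replace the comma before the milliseconds with a period.
    # Commas in the subtitle text itself do not match the timing pattern.
    text = SRT_TIMING.sub(r"\1.\2 --> \3.\4", text)
    # Add the required header followed by a blank line.
    # Cue numbers are kept, since WebVTT accepts them as identifiers.
    return "WEBVTT\n\n" + text + "\n"


if __name__ == "__main__":
    print(srt_to_webvtt("1\n00:00:01,000 --> 00:00:04,000\nHello, world!\n"))
```

Only the timing lines and the header change; the subtitle text passes through untouched.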
WebVTT also introduces web-specific styling options using limited CSS (Cascading Style Sheets) syntax, along with support for regions and positioning. We'll get into this in the next section.
CSS-Style Formatting#
WebVTT's power comes from its CSS-like styling system. Using STYLE blocks, you can define how different elements appear:
WEBVTT

STYLE
::cue {
  color: white;
  background-color: rgba(0, 0, 0, 0.7);
  font-family: Arial, sans-serif;
}
::cue(b) {
  color: yellow;
  font-weight: bold;
}
::cue(.important) {
  color: red;
  font-weight: bold;
}
::cue(v[voice="narrator"]) {
  color: cyan;
  font-style: italic;
}
Styling Elements#
Different selectors target specific elements:
WEBVTT

STYLE
::cue(b) {
  color: yellow;
}
::cue(i) {
  font-style: italic;
  color: cyan;
}

00:00:01.000 --> 00:00:04.000
This is <b>bold</b> and this is <i>italic</i>
Class-Based Styling#
You can define custom classes for different types of text:
WEBVTT

STYLE
::cue(.important) {
  color: red;
  font-weight: bold;
}
::cue(.whisper) {
  color: gray;
  font-style: italic;
}

00:00:01.000 --> 00:00:05.000
<c.important>Critical announcement!</c>

00:00:06.000 --> 00:00:10.000
<c.whisper>secret message</c>
Voice-Based Styling#
Speakers can have distinct styles:
WEBVTT

STYLE
::cue(v[voice="narrator"]) {
  color: yellow;
  font-family: "Times New Roman", serif;
}
::cue(v[voice="character"]) {
  color: cyan;
  font-family: Arial, sans-serif;
}

00:00:01.000 --> 00:00:04.000
<v narrator>The story begins...</v>

00:00:04.000 --> 00:00:08.000
<v character>Hello, world!</v>
Language-Specific Styling#
Different languages can have distinct appearances:
WEBVTT

STYLE
::cue(:lang(en)) {
  color: white;
  font-family: Arial, sans-serif;
}
::cue(:lang(ja)) {
  color: yellow;
  font-family: "Noto Sans JP", sans-serif;
}

00:00:01.000 --> 00:00:04.000
<lang en>Welcome to the tutorial</lang>

00:00:04.000 --> 00:00:08.000
<lang ja>チュートリアルへようこそ</lang>
Styling Limitations#
While WebVTT's styling system is powerful, it has some important restrictions:
Cannot load external resources
Limited to text-related CSS properties
Styling applies to entire cue boxes
No animation or transition effects
Anatomy of a WebVTT File#
Now that we understand WebVTT's styling capabilities, let's look at how a complete file comes together:
WEBVTT
Kind: captions
Language: en

STYLE
::cue {
  color: white;
  background-color: rgba(0, 0, 0, 0.7);
}

NOTE
This is a comment - it won't be displayed

1
00:00:01.000 --> 00:00:04.000
In today's video, we'll explore
the latest web technologies.

2
00:00:04.500 --> 00:00:08.000 align:end line:90%
Subscribe for more tutorials!

3
00:00:08.100 --> 00:00:12.000
<v Presenter>Thanks for watching!
Each file contains:
The WEBVTT header (required)
Optional metadata (Kind, Language)
STYLE blocks for formatting
Cue blocks with timing and text
Optional positioning attributes
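To make the structure above concrete, here is a rough Python sketch that splits a WebVTT file on blank lines and labels each block. The function name classify_blocks and the label strings are made up for illustration, and the sketch assumes blocks are separated by truly empty lines; a real parser handles many more edge cases.

```python
import re


def classify_blocks(vtt_text: str) -> list[tuple[str, str]]:
    """Split WebVTT text on blank lines and label each block (illustrative only)."""
    text = vtt_text.replace("\r\n", "\n").strip()
    blocks = [b for b in re.split(r"\n{2,}", text) if b.strip()]
    labeled = []
    for index, block in enumerate(blocks):
        first_line = block.split("\n", 1)[0]
        if index == 0 and first_line.startswith("WEBVTT"):
            kind = "header"   # WEBVTT line plus any metadata lines
        elif first_line.startswith("STYLE"):
            kind = "style"    # CSS rules for ::cue
        elif first_line.startswith("NOTE"):
            kind = "note"     # comment, never displayed
        elif "-->" in block:
            kind = "cue"      # optional identifier, timing line, text
        else:
            kind = "unknown"
        labeled.append((kind, block))
    return labeled


if __name__ == "__main__":
    sample = (
        "WEBVTT\nKind: captions\n\n"
        "STYLE\n::cue { color: white; }\n\n"
        "NOTE\nA comment\n\n"
        "1\n00:00:01.000 --> 00:00:04.000\nHello\n"
    )
    for kind, block in classify_blocks(sample):
        print(kind, "|", block.split("\n", 1)[0])
```

The blank lines between blocks are what make this split possible, which is why they matter in a real file.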
Positioning and Layout#
Beyond styling, WebVTT offers precise control over subtitle positioning. Unlike traditional formats, WebVTT uses a web-native positioning system:
00:00:04.000 --> 00:00:08.000 align:end position:90%
Right-aligned subtitle

00:00:08.000 --> 00:00:12.000 line:10%
Subtitle near the top

00:00:12.000 --> 00:00:16.000 size:40%
Narrower subtitle
Common positioning properties:
align: Start, center, or end alignment
line: Vertical position (percentage or line number)
position: Horizontal position (percentage)
size: Width of the text box
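These settings are plain name:value pairs that follow the end timestamp, so they are easy to read programmatically. The Python sketch below shows one way to pull them out of a timing line; parse_cue_settings is a hypothetical helper, and it does no validation of the values.

```python
def parse_cue_settings(timing_line: str) -> dict[str, str]:
    """Return the settings that follow the timestamps on a cue timing line."""
    # Everything after "-->" is the end timestamp, then space-separated settings.
    _start, _arrow, rest = timing_line.partition("-->")
    tokens = rest.split()
    settings = {}
    for token in tokens[1:]:  # tokens[0] is the end timestamp
        name, sep, value = token.partition(":")
        if sep and name and value:
            settings[name] = value
    return settings


if __name__ == "__main__":
    line = "00:00:04.000 --> 00:00:08.000 align:end position:90%"
    print(parse_cue_settings(line))
```

A timing line with no settings simply yields an empty dictionary.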
Voice and Speaker Support#
For content with multiple speakers, WebVTT provides clear identification through voice tags, which can be styled as we saw earlier:
STYLE
::cue(v[voice="host"]) {
  color: yellow;
}
::cue(v[voice="guest"]) {
  color: cyan;
}

00:00:01.000 --> 00:00:04.000
<v host>Welcome to the show!

00:00:04.000 --> 00:00:08.000
<v guest>Thanks for having me.
This feature is particularly valuable for interviews, panel discussions, and educational materials. It also helps with accessibility requirements by making speaker changes clear to screen readers.
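Because voice tags are regular in shape, speaker names can be collected with a short script, for example to check that every name has a matching style rule. The following Python sketch assumes simple, well-formed tags; the names VOICE_TAG and speakers_in are illustrative, not part of any library.

```python
import re

# Matches an opening voice tag such as "<v host>" or "<v.loud host>"
# and captures the speaker name. Closing "</v>" tags are not matched.
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")


def speakers_in(cue_text: str) -> list[str]:
    """List speaker names from voice tags, in order of first appearance."""
    seen = []
    for name in VOICE_TAG.findall(cue_text):
        name = name.strip()
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    print(speakers_in("<v host>Welcome to the show!"))
    print(speakers_in("<v host>Hi</v> <v guest>Hello</v>"))
```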
Working with WebVTT Files#
While WebVTT offers powerful styling and positioning features, keeping subtitles simple often works best. Follow these guidelines for reliable results:
Improving Readability#
The same principles that work for SRT apply to WebVTT:
Two lines maximum per subtitle
Around 40 characters per line
20-25 characters per second
Natural line breaks
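The first three guidelines are simple arithmetic, so they can be checked automatically. Here is a minimal Python sketch using the limits listed above as defaults (2 lines, 40 characters per line, 25 characters per second). The function names are hypothetical, and the character count includes any markup tags, which a stricter check would strip first.

```python
def to_seconds(timestamp: str) -> float:
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def readability_issues(start: str, end: str, text: str,
                       max_lines: int = 2, max_chars: int = 40,
                       max_cps: float = 25.0) -> list[str]:
    """Return a list of guideline violations for one cue (empty if none)."""
    lines = text.split("\n")
    issues = []
    if len(lines) > max_lines:
        issues.append("too many lines")
    if any(len(line) > max_chars for line in lines):
        issues.append("line too long")
    duration = to_seconds(end) - to_seconds(start)
    total_chars = sum(len(line) for line in lines)
    # Characters per second = total characters / time on screen.
    if duration <= 0 or total_chars / duration > max_cps:
        issues.append("reading speed too high")
    return issues


if __name__ == "__main__":
    cue = "In today's video, we'll explore\nthe latest web technologies."
    print(readability_issues("00:00:01.000", "00:00:04.000", cue))
```

For the sample cue, 59 characters over 3 seconds works out to roughly 20 characters per second, which sits inside the suggested range.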
Technical Recommendations#
For robust WebVTT files:
Always use UTF-8 encoding
Test positioning on different screen sizes
Verify speaker labels work in your player
Keep styling consistent throughout
Platform Support#
WebVTT enjoys strong support across modern platforms, but capabilities vary.
Most players reliably support:
Basic subtitle display
Simple positioning
Speaker identification
Standard timing
However, test carefully when using:
Complex positioning
Custom styling
Regions
Advanced features
The reason is WebVTT's web-based design: these features depend on layout and styling engines that have traditionally been implemented only in web browsers, so players outside the browser often support them poorly.
Common Use Cases#
Video streaming platforms have embraced WebVTT for its reliability and web-native features. The format particularly shines in online learning, where clear speaker identification and precise timing help viewers follow along.
Accessibility is another key strength. Screen readers handle WebVTT well, and the format's support for semantic markup helps create more inclusive content. The combination of CSS-like styling and semantic structure makes it possible to create subtitles that are both visually appealing and accessible.
Tools and Validation#
While any text editor can handle WebVTT files, specialized tools make creation and testing easier. Subtitle editors to consider include:
Aegisub: Supports WebVTT export
Subtitle Edit: Strong WebVTT support
Caption Maker: Web-focused editor
Common Mistakes#
Here are some typical WebVTT-specific issues to watch out for:
Incorrect STYLE block placement#
This example demonstrates how a STYLE block may be placed incorrectly. These blocks must always come before any cues (shown text) in the subtitle.
1
00:00:01.000 --> 00:00:04.000
First subtitle

STYLE
::cue {
  color: red;
}
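A mistake like this is easy to catch with a script. The Python sketch below reports whether any STYLE block comes after the first cue; style_after_cue is a hypothetical helper that assumes blocks are separated by empty lines and does not otherwise validate the file.

```python
import re


def style_after_cue(vtt_text: str) -> bool:
    """Return True if any STYLE block appears after the first cue."""
    text = vtt_text.replace("\r\n", "\n").strip()
    seen_cue = False
    for block in re.split(r"\n{2,}", text):
        first_line = block.split("\n", 1)[0]
        if first_line.startswith("STYLE"):
            if seen_cue:
                return True   # misplaced STYLE block
        elif "-->" in block:
            seen_cue = True   # a timing line marks a cue block
    return False


if __name__ == "__main__":
    bad = (
        "WEBVTT\n\n1\n00:00:01.000 --> 00:00:04.000\nFirst subtitle\n\n"
        "STYLE\n::cue {\n  color: red;\n}\n"
    )
    print(style_after_cue(bad))
```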
Invalid CSS syntax#
This example demonstrates a common mistake when writing CSS syntax - a missing semicolon. For more information on the specific syntax of CSS, W3Schools provides many great articles on the topic.
WEBVTT

STYLE
::cue {
  color: red
  font-weight: bold;
}
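This example demonstrates incorrect use of the XML-like tags for class and voice (speaker labeling). Using v.important on its own is wrong here: a v tag needs a speaker name after it, so a class with no speaker belongs on a c tag, as in c.important (v for voice vs c for class).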
WEBVTT

STYLE
::cue(.important) {
  color: red;
}

00:00:01.000 --> 00:00:04.000
<v.important>Wrong syntax</v>

00:00:01.000 --> 00:00:04.000
<v speaker><c.important>Correct syntax</c></v>
Invalid positioning values#
This example demonstrates an invalid positioning value as well as an invalid alignment value.
position is set to 101%, which is invalid because percentages must be between 0 and 100. align is set to middle, when it should be center.
00:00:01.000 --> 00:00:04.000 position:101%
First subtitle

00:00:04.000 --> 00:00:08.000 align:middle
Second subtitle
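Both mistakes can be caught with a simple range and keyword check. This Python sketch takes settings as a dictionary of name and value strings (an assumed input shape) and returns any problems it finds. The helper names are hypothetical; the accepted align values include left and right, which the WebVTT specification allows alongside start, center, and end.

```python
VALID_ALIGN = {"start", "center", "end", "left", "right"}


def is_percentage(value: str) -> bool:
    """True for values like "90%" or "12.5%" that fall between 0 and 100."""
    if not value.endswith("%"):
        return False
    try:
        number = float(value[:-1])
    except ValueError:
        return False
    return 0 <= number <= 100


def setting_errors(settings: dict[str, str]) -> list[str]:
    """Return messages for invalid position or align values."""
    errors = []
    position = settings.get("position")
    # A position may carry an alignment suffix ("50%,line-left");
    # only the percentage part is checked here.
    if position is not None and not is_percentage(position.split(",")[0]):
        errors.append(f"invalid position: {position}")
    align = settings.get("align")
    if align is not None and align not in VALID_ALIGN:
        errors.append(f"invalid align: {align}")
    return errors


if __name__ == "__main__":
    print(setting_errors({"position": "101%"}))
    print(setting_errors({"align": "middle"}))
```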
What's Next?#
Now that you understand WebVTT's capabilities, from its CSS-like styling system to positioning controls, you'll want to explore the tools that can create and edit these files efficiently. In our next article, we'll look at subtitle editors that support modern formats like WebVTT.
Time to put your web subtitles to work!