Closed captions vs subtitles: what's the difference?
Understand the key differences between closed captions and subtitles. Learn when to use each, accessibility requirements, and how they're created.
Closed captions include sound descriptions and are designed for deaf and hard-of-hearing viewers, while subtitles assume the viewer can hear and only transcribe dialogue. Although the two terms are used interchangeably in casual conversation, the formats serve different audiences, follow different standards, and contain different types of information.
Understanding the distinction matters whether you are a content creator, educator, or business trying to make video accessible. Choosing the wrong format can leave viewers without critical context or land you on the wrong side of accessibility regulations. This guide breaks down exactly how closed captions and subtitles differ, when to use each, and how modern AI tools make creating both far easier than it used to be.
What are closed captions?
Closed captions are a text overlay that represents every meaningful sound in a video, not just spoken words. They were originally developed for television in the 1970s to give deaf and hard-of-hearing viewers full access to broadcast content.
A closed caption track typically includes:
- Dialogue with speaker identification (e.g., "NARRATOR:" or "SARAH:")
- Sound effects described in brackets, such as [door slams], [phone ringing], or [footsteps approaching]
- Music descriptions like [upbeat jazz music] or [somber piano melody]
- Tone and manner indicators such as [whispering], [sarcastically], or [shouting]
- Non-speech audio cues that carry meaning, such as [silence], [applause], or [static]
The word "closed" means the captions can be turned on or off by the viewer. This distinguishes them from open captions, which are permanently embedded in the video frame. Most streaming platforms, broadcast television, and video players support closed captions through a CC button or accessibility menu.
Closed captions are synchronized to the audio timeline with precise timestamps. Each caption block appears and disappears at specific moments, ensuring the text matches what is happening on screen. The standard file formats for closed captions include SRT and VTT, as well as broadcast-specific formats like SCC and MCC.
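To make this concrete, here is what a short closed caption track looks like in SRT format. The speaker name and sound cues are illustrative; note how each numbered block pairs a timestamp range with its text:

```
1
00:00:01,000 --> 00:00:03,500
[door slams]

2
00:00:04,000 --> 00:00:07,250
SARAH: Did you hear that?

3
00:00:07,500 --> 00:00:10,000
[suspenseful music building]
```

A subtitle file would use the same structure but typically keep only the dialogue in block 2.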
What are subtitles?
Subtitles are a text representation of spoken dialogue in a video. They are designed for viewers who can hear the audio but need the speech converted to text, most commonly because the video is in a foreign language.
Subtitles typically include:
- Spoken dialogue transcribed verbatim or translated
- On-screen text translation when signs, titles, or written content appear in the video
- Speaker attribution in some cases, though this is less consistent than in captions
What subtitles generally do not include is the non-speech audio information that defines closed captions. A subtitle track will not tell you that a door slammed off-screen, that suspenseful music is building, or that a character is whispering. The assumption is that the viewer can hear those elements.
Subtitles are most commonly associated with foreign-language content. When you watch a film in French with English text on screen, you are reading subtitles. The text has been translated and timed to match the original dialogue, but it does not describe the ambient sounds or musical score.
Subtitles use the same file formats as captions, primarily SRT and VTT, which can sometimes blur the line between the two. The difference is in the content, not the container.
Key differences between closed captions and subtitles
| Feature | Closed captions | Subtitles |
|---|---|---|
| Primary audience | Deaf and hard-of-hearing viewers | Hearing viewers watching foreign-language content |
| Dialogue | Yes | Yes |
| Sound effects | Yes, described in brackets | No |
| Music descriptions | Yes | No |
| Speaker identification | Yes, typically labeled | Sometimes |
| Language | Usually same language as audio | Often a different language (translation) |
| Toggleable | Yes, viewer can turn on/off | Yes, when delivered as a separate track |
| Legally required | Often yes (ADA, FCC, EU) | Generally no |
The core distinction comes down to completeness. Closed captions aim to represent the entire audio track in text form. Subtitles aim to make spoken dialogue readable in another language or in text form for convenience.
In practice, the terminology varies by region. In the United States, "closed captions" and "subtitles" are distinct concepts as described above. In the United Kingdom and much of Europe, the term "subtitles" covers both use cases, and what Americans call closed captions are often referred to as "subtitles for the deaf and hard of hearing" (SDH). If you are distributing content internationally, it helps to be explicit about what your text track contains rather than relying on terminology alone.
Open captions vs closed captions
Beyond the distinction between captions and subtitles, there is an important difference between open and closed captions.
Closed captions are delivered as a separate data track alongside the video. The viewer can toggle them on or off, resize them, and sometimes change their appearance. Streaming services, DVDs, and broadcast television all use closed captions.
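On the web, for example, closed captions travel as separate text tracks attached to an HTML5 video. The file names below are placeholders:

```html
<video controls src="lecture.mp4">
  <!-- kind="captions" signals that this track includes
       non-speech audio information, not just dialogue -->
  <track kind="captions" src="lecture-en.vtt" srclang="en"
         label="English (CC)" default>
  <!-- a translated subtitle track can sit alongside it -->
  <track kind="subtitles" src="lecture-fr.vtt" srclang="fr"
         label="Français">
</video>
```

Browsers accept only WebVTT files in `<track>` elements, which is one reason VTT became the standard for web video.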
Open captions (also called burned-in captions or hardcoded captions) are permanently rendered into the video image itself. They cannot be turned off, resized, or repositioned. Once a video is exported with open captions, every viewer sees them.
When open captions make sense
- Social media videos that autoplay without sound, where viewers may not know how to enable captions on the platform
- Short-form content on platforms like Instagram Stories or TikTok where caption toggles are not always accessible
- Presentations and kiosk displays where viewers cannot interact with playback controls
- Situations where you need a guarantee that every viewer sees the text
When closed captions are better
- Long-form content where viewer preference matters
- Accessibility compliance since regulations typically require the viewer to control caption display
- Multi-language distribution where you need to offer caption tracks in several languages
- Platform requirements since YouTube, Vimeo, and most streaming services expect closed caption files
Most professional workflows produce closed caption files (SRT or VTT) because they offer maximum flexibility. You can always burn them in later if needed, but you cannot extract open captions from a video once they are rendered.
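As a sketch of what "burning in later" looks like in practice, a tool like FFmpeg can render an SRT file into the picture. File names here are placeholders:

```shell
# Burn subtitles.srt permanently into the video frame (open captions)
ffmpeg -i input.mp4 -vf "subtitles=subtitles.srt" -c:a copy output.mp4

# By contrast, attaching the same file as a toggleable (closed) track in an MP4
ffmpeg -i input.mp4 -i subtitles.srt -c copy -c:s mov_text output-cc.mp4
```

The first command re-encodes the video with the text rendered in; the second simply muxes the caption data alongside the untouched streams, which is why the closed version is both faster to produce and reversible.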
Legal requirements for captions
Accessibility regulations in multiple jurisdictions require closed captions on certain types of video content. Here is a brief overview of the major ones.
United States
- Americans with Disabilities Act (ADA): Courts have interpreted the ADA to require captions on video content from businesses that qualify as places of public accommodation. This increasingly includes websites and online video.
- Section 508: Federal agencies must make electronic content accessible, including video with captions.
- FCC regulations: Television broadcasters and online distributors of previously televised content must provide closed captions.
- CVAA (21st Century Communications and Video Accessibility Act): Extends captioning requirements to internet-delivered video that was originally shown on TV.
European Union
- European Accessibility Act (EAA): Takes effect in 2025, requiring digital services, including video platforms, to meet accessibility standards. Captioning is a key component.
- EN 301 549: The European standard for ICT accessibility, which references WCAG and includes requirements for captions and audio descriptions.
Web standards
- WCAG 2.1 Level AA: The Web Content Accessibility Guidelines require captions for all prerecorded audio content in synchronized media (Success Criterion 1.2.2) and for live audio content (Success Criterion 1.2.4 at Level AA).
Failing to provide captions does not just limit your audience. It exposes organizations to legal risk, particularly in the United States where ADA-related lawsuits involving digital accessibility have increased significantly in recent years.
When to use closed captions vs subtitles
Choosing between captions and subtitles depends on your audience and distribution context.
Use closed captions when:
- Your audience includes deaf or hard-of-hearing viewers
- You are publishing on a platform that supports toggleable text tracks
- Accessibility compliance is required or expected
- Your content has meaningful non-speech audio (sound effects, music, ambient sounds)
- The captions are in the same language as the audio
Use subtitles when:
- You are translating content for foreign-language audiences
- The audio is clearly audible and the viewer only needs dialogue text
- You are distributing to international markets and need multi-language text tracks
Use both when:
- You want maximum reach, offering same-language captions for accessibility and translated subtitles for international viewers
- Your platform supports multiple text tracks (YouTube, Vimeo, most streaming services)
In many cases, the practical answer is to start with a full caption file in the original language and then create translated subtitle tracks from it. This gives you both accessibility coverage and international reach.
How to create captions and subtitles with AI
Producing captions and subtitles used to mean hours of manual transcription work or expensive professional services. AI transcription tools have changed that equation considerably. A recording that would take a human transcriptionist four to six hours to caption can now be processed in minutes.
Here is a typical workflow for creating captions or subtitles with AI:
Step 1: Transcribe the audio
Upload your audio or video file to an AI transcription tool like Vocova. The tool uses automatic speech recognition to convert speech to text with timestamps and, if supported, speaker labels. Accuracy depends on audio quality, so starting with a clean recording helps. If your audio has background noise, there are techniques to improve the results.
Step 2: Review and edit
AI transcription is not perfect. The industry measures accuracy using word error rate (WER), and even the best models produce some errors, especially with proper nouns, technical terms, or accented speech. Review the transcript and correct any mistakes.
Step 3: Add non-speech elements (for captions)
If you are creating closed captions rather than subtitles, you need to add sound effect descriptions, music cues, and speaker labels that the AI may not have captured. Some tools provide speaker diarization to help with identification, but sound effect descriptions typically require manual annotation.
Step 4: Export in the right format
Export your finished transcript as an SRT or VTT file. These are the two most widely supported caption and subtitle formats across video platforms. Most AI subtitle generators can export in both formats. Vocova supports exporting to SRT, VTT, and several other formats including PDF, DOCX, and CSV.
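To make the export step concrete, here is a minimal Python sketch that turns a list of timestamped transcript segments into SRT text. The segment structure is an assumption for illustration; real transcription tools each use their own output schema:

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments: list[dict]) -> str:
    """Render [{'start': float, 'end': float, 'text': str}, ...] as SRT."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{format_timestamp(seg['start'])} --> "
            f"{format_timestamp(seg['end'])}\n{seg['text']}\n"
        )
    return "\n".join(blocks)

print(segments_to_srt([
    {"start": 1.0, "end": 3.5, "text": "[door slams]"},
    {"start": 4.0, "end": 7.25, "text": "SARAH: Did you hear that?"},
]))
```

VTT output differs only slightly: add a `WEBVTT` header line and use a dot instead of a comma in the timestamps.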
Step 5: Translate for subtitles
If you need subtitles in additional languages, use the translation feature to generate translated versions of your transcript. Vocova supports translation into 145+ languages, which makes creating multi-language subtitle tracks straightforward. Review translated subtitles for accuracy, particularly for idiomatic expressions and cultural context.
Step 6: Upload to your platform
Add your SRT or VTT files to your video platform. YouTube, Vimeo, and most hosting services allow you to upload multiple caption and subtitle tracks, letting viewers choose their preferred language and format.
Frequently asked questions
Are closed captions the same as subtitles?
No. Closed captions include descriptions of non-speech audio such as sound effects, music, and speaker identification. Subtitles only contain dialogue text and are primarily used for language translation. The terminology overlaps in some regions, but the content differs.
Do I need closed captions or subtitles for YouTube?
YouTube supports both. If you want to reach the widest audience, upload same-language captions for accessibility and translated subtitles for international viewers. YouTube also auto-generates captions, but their accuracy varies and they do not include non-speech audio descriptions.
What file format should I use for captions?
SRT and VTT are the most widely supported formats. SRT works on nearly every video platform and editor. VTT offers additional styling options and is the standard for HTML5 web video. For a detailed comparison, see our guide on SRT vs VTT formats.
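The practical differences are small but easy to trip over. An SRT cue written as `00:00:04,000 --> 00:00:07,250` becomes the following in VTT, which adds a required `WEBVTT` header, uses a dot rather than a comma before the milliseconds, and makes cue numbers optional:

```
WEBVTT

00:00:04.000 --> 00:00:07.250
SARAH: Did you hear that?
```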
Are captions legally required?
In many contexts, yes. The ADA, Section 508, WCAG 2.1, and the European Accessibility Act all include captioning requirements for certain types of content and organizations. Even where not legally mandated, captions improve accessibility, engagement, and SEO.
Can AI generate closed captions automatically?
AI can generate accurate transcriptions with timestamps and speaker labels, which forms the foundation of a closed caption file. However, non-speech audio descriptions such as [music playing] or [door slams] typically need to be added manually, since most ASR models focus on speech recognition rather than general audio event detection.
What is the difference between SDH and closed captions?
SDH stands for "subtitles for the deaf and hard of hearing." It combines elements of both captions and subtitles: it includes non-speech audio descriptions like closed captions, but it is formatted and delivered as a subtitle track. SDH is common on DVDs, Blu-rays, and streaming platforms, and it is often the standard in regions where "subtitles" is the default terminology for all text tracks.