Welcome to CCExtractor's home

A free, GPL licensed closed caption tool



Current version: 0.54, from April 16th, 2009. Download source code (for Linux and Mac)   Download Windows installer


Update: The Windows binaries are now separate from the main distribution file. If you are a Windows user, please check the CCExtractor for Windows page.

Update: There's a major code reorganization in progress. The EIA-708 (DTV) decoder is finally being implemented, proper DVR-MS support has been added, and other long-overdue changes have finally been addressed. It is possible that we broke something, and because we only have a few samples for certain sources (Dish, Replay, MythTV), we won't know until we get reports (see contact email below). We are not dropping support for any format. If something does not work, let us know.

What are closed captions?
Closed captions are TV subtitles that you can turn on and off. The term closed captions refers to TV subtitles in NTSC countries, such as the US or Canada. In Europe, Teletext (usually page 888) is used for subtitling.

How are closed captions different to DVD subtitles?
While both provide the user with the same thing (i.e. a transcript of the audio synchronized to it), they are very different from a technical point of view. These are the two differences I find most important:

- DVD subtitles are image based (they could be anything, not just text) while Closed Captions are text based (they can only be text).
- DVD subtitles are decoded by the DVD player, while Closed Captions are decoded by the TV. This is an important difference because if you are in Europe and buy a DVD from the US that has closed captions but no DVD subtitles you will not be able to use the captions as the TVs in Europe don't have a Closed Caption decoder.

Why do they release DVDs with Captions instead of proper DVD subtitles? It seems that DVD subtitles are better.
I'd say there are two reasons:

- To discourage European people from importing DVDs.
- For TV shows, they already have the closed captions (because they cannot broadcast without them due to legal requirements) but the DVD subtitles need to be made.

Watching a Closed Captioned DVD in Europe
As previously said, European TVs don't come with a built-in decoder. However, there are a few options:

- Watch the DVDs on a computer using a software player that supports Closed Captions (many do).
- Buy an external decoder and put it between the DVD player and the TV. Yes, they do exist. However, be aware that your DVD player must help a bit by sending out the Closed Caption data in the video signal, so not all DVD players are suitable. Anyway, here's my setup, which works fine:
    - An Oppo DVD-908h DVD player.
    - A Hitachi Movie-Text external decoder. I bought it around 10 years ago and I can't find any place to get one online. However, some alternatives exist, such as the Telemole (I haven't tested it myself though) or the Video Reader VR-20.
Note: As far as I know closed captions don't work over HDMI. You will need to use an analog connection between the DVD and the closed caption decoder.

What are closed captions for?
- Captioning is essential for people with hearing disabilities, which is the main reason captions are a legal requirement for most primetime TV shows.
- Captions are also incredibly useful for people learning a foreign language (usually English, Spanish or French).
- Because captions are plain text, they can be used to store accurate transcripts of newscasts.

Recording closed captions
Basically, your recording equipment (PC TV card, DVD recorder, whatever) will either be able to record captions or not. With some cards (at least some Hauppauge models) you need to specifically turn on caption recording. Otherwise, captions are usually stored inside the video file (or DVD) without you having to do anything special.
Reencoding or editing the recording is a different issue: captions are usually lost because video editing tools don't support them, so you need to save them before editing the recorded file. This is where CCExtractor comes in.


What's CCExtractor?
A tool that analyzes video files and produces independent subtitle files from the closed captions data. CCExtractor is portable, small, and very fast. It works in Linux, Windows, and OSX.

How easy is it to use CCExtractor?
Very. Just tell it what file to process and it does everything for you.

CCExtractor integration with other tools
It is possible to integrate CCExtractor into a larger process. A couple of tools already call CCExtractor as part of their video process - this way they get subtitle support for free.
Starting in 0.52, CCExtractor is very front-end friendly. Front-ends can easily get real-time status information. The GUI source code is provided and can be used for reference.
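
As a rough illustration of the integration described above, a wrapper can simply launch CCExtractor as a child process. The sketch below is in Python and assumes the ccextractor executable is on the PATH. The function names build_command and run_ccextractor are made up for this example, and only the plain "ccextractor inputfile" form shown on this page is used. The options for real-time status reporting are not covered here; check the built-in help screen for them.

```python
import subprocess
from typing import List


def build_command(video_path: str, executable: str = "ccextractor") -> List[str]:
    """Return the argument list for a basic run: just the input file."""
    # Only the bare "ccextractor <input>" form is taken from this page.
    # Any further options should be looked up in the built-in help screen.
    return [executable, video_path]


def run_ccextractor(video_path: str,
                    executable: str = "ccextractor") -> subprocess.CompletedProcess:
    """Run the tool and capture whatever it prints (hypothetical wrapper)."""
    return subprocess.run(
        build_command(video_path, executable),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the captured output as text
        check=False,          # let the caller decide how to treat the exit code
    )
```

Calling run_ccextractor("the.sopranos.ts") returns an object whose stdout, stderr and returncode a front-end could inspect. How the exit code should be interpreted is not documented on this page, so treat that as something to verify.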

Any tool, commercial or not, is specifically allowed to use CCExtractor for any use its authors see fit. So if your favourite video tool still lacks captioning support, feel free to send its authors here.

What's the point of generating separate files for subtitles, if they are already in the source file?
There are several reasons to have subtitles separated from the video file, including:
- Closed captions never survive MPEG processing. If you take an MPEG file and encode it to any other format (such as DivX), the resulting file will not have closed captions. This means that if you want to keep the subtitles, you need to keep the original file. This is hardly practical if you are archiving HDTV shows, for example.
- Subtitle files are small - so small (around 250 KB for a movie) that you can quickly download them, or email them, etc, in case you have a recording without subtitles.
- Subtitle files are indexable: You can have a database with all your subtitles if you want (there are many available), so you can search the dialogue.
- Subtitle files are a de-facto standard: Almost every player can use them. In fact, many set-top players accept subtitle files in .srt format - so you can have subtitles in your DivX movies and not just in your original DVDs.
- Closed captions are stored in many different formats by capture cards. Upgrading to a new card, if it comes with a new player, may mean that you can't use your previously recorded closed captions, even if the audio/video are fine.
- Closed captions require a closed caption decoder. All US TVs have one (it's a legal requirement), but no European TV does, since there are no closed captions in Europe (teletext is used instead). Basically this means that if you buy a DVD in the US which has closed captions but no DVD subtitles, you are out of luck. This is a problem with many (most) old TV show DVDs, which only come with closed captions. DVD producers don't bother doing DVD subs, since it's another way to segment the market, same as with DVD regions.
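
To make the "indexable" point above concrete, here is a minimal sketch of searching the dialogue in SubRip text. It assumes the usual .srt layout (a counter line, a timing line containing "-->", then the caption text, with cues separated by blank lines). The function names parse_cues and search are invented for this example and are not part of CCExtractor.

```python
import re
from typing import List, Tuple


def parse_cues(srt_text: str) -> List[Tuple[str, str]]:
    """Split SubRip text into (timing line, caption text) pairs."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # A cue is: counter line, timing line, then one or more text lines.
        if len(lines) >= 3 and "-->" in lines[1]:
            cues.append((lines[1], " ".join(lines[2:])))
    return cues


def search(srt_text: str, phrase: str) -> List[Tuple[str, str]]:
    """Return the cues whose text contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [cue for cue in parse_cues(srt_text) if needle in cue[1].lower()]
```

A real index would store the cues in a database instead of scanning the text each time, but the parsing step would look much the same.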

How do I use subtitles once they are in a separate file?
CCExtractor generates files in the two most common formats: .srt (SubRip) and .smi (which is a Microsoft standard). Most players support at least .srt natively. You just need to give the .srt file the same base name as the video file you want to play it with, for example sample.avi and sample.srt.
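
The naming rule can be written down in a couple of lines. This is only a sketch in Python; matching_subtitle_path is a hypothetical helper name, and the rule it encodes (same base name, different extension) is the convention described above, not something every player is guaranteed to follow.

```python
from pathlib import Path


def matching_subtitle_path(video_path: str, extension: str = ".srt") -> Path:
    """Return the subtitle file name a player would typically look for."""
    # Only the last extension is swapped, so "the.sopranos.ts" keeps its dots.
    return Path(video_path).with_suffix(extension)
```

For sample.avi this yields sample.srt, and passing ".smi" as the extension gives the equivalent name for the other format.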

What kind of files can I extract closed captions from?
CCExtractor currently handles:

- DVDs.
- Most HDTV captures (where you save the Transport Stream).
- Captures where captions are recorded in bttv format. The number of cards that use this format is huge. My test samples came from a Hauppauge PVR-250. You can check the complete list here.
- DVR-MS (Microsoft digital video recording).
- TiVo files.
- ReplayTV files.
- Dish Network files.

Usually, if you record a TV show with your capture card and CCExtractor produces the expected result, it will work for all your recordings. If it doesn't, which means that your card uses a format CCExtractor can't handle, please contact me and we'll try to make it work.

Can I edit the subtitles?
.srt files are just text files, with time information (when subtitles are supposed to be shown and for how long) and some basic formatting (use italics, bold, etc). So you can edit them with any text editor. If you need to do serious editing (such as adjusting timing), you can use subtitle editing tools - there are many available. A good source for your video needs is doom9.org.
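
Because the timing is plain text, simple adjustments are easy to script. The sketch below shifts every timestamp in SubRip text by a fixed number of milliseconds. It assumes the usual HH:MM:SS,mmm --> HH:MM:SS,mmm timing line, and the names shift_timestamp and shift_srt are invented for this example. A dedicated subtitle editor remains the better choice for anything more involved.

```python
import re

# SubRip timestamps look like 00:01:02,500 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms: int) -> str:
    """Return one matched timestamp moved by offset_ms milliseconds."""
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset_ms
    total = max(total, 0)  # clamp so a negative shift never goes below zero
    hours, rest = divmod(total, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(text: str, offset_ms: int) -> str:
    """Shift the timestamps on timing lines (those containing '-->')."""
    shifted = []
    for line in text.splitlines():
        if "-->" in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted)
```

Only lines containing "-->" are touched, so caption text that happens to contain something that looks like a timestamp is left alone.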

Can CCExtractor generate other subtitles formats?
At this time, CCExtractor can generate .srt, .smi, raw and bin files.

What's a raw file?
A raw file is a file that contains an exact dump of the closed captions bytes, without any processing. This lets you use any tool of your choice to process the data. For example, McPoodle's excellent tools can generate subtitles files in several formats, adjust timing, etc.

What's a bin file? How is it different from a raw file?
A bin file contains a dump of the closed captions bytes (same as a raw file) but it also includes timing information. This is a format that we made up for CCExtractor, i.e. it's not any kind of industry standard. However, it's the most useful (to us) for debugging purposes, so if you need to send us a sample please use this format.
Also, a bin file can hold several CC streams (several languages, even from both analog and digital). A raw file cannot.

How long does it take to process an MPEG file?
Obviously, it depends on the computer and the length of the file. On my computer it takes around 90 seconds for a 45-minute show in HDTV, with CPU usage around 3% (I/O operations are what's holding it back).
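
To put that figure in perspective, the arithmetic can be spelled out as a speed-up over real time. This is just a worked calculation on the numbers quoted above; realtime_factor is a made-up helper name and your own results will differ with disk speed and file size.

```python
def realtime_factor(show_minutes: float, processing_seconds: float) -> float:
    """How many times faster than playback speed the processing runs."""
    return (show_minutes * 60) / processing_seconds


# 45 minutes of video is 2700 seconds; processed in about 90 seconds.
print(realtime_factor(45, 90))  # 30.0
```

In other words, the run quoted above is roughly 30 times faster than watching the show.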

What platforms does CCExtractor work on?
CCExtractor is developed and tested in Windows and Linux. It is also known to compile and run fine in OSX (a build script is included in the source .zip).

Where can I download it?
CCExtractor is hosted on SourceForge. This is the download page and this is the project summary page.

How can I contact the author?
Send me an email to .

How do I use this tool (parameters, etc)?
Run it without parameters and you will get a help screen. Basically, you just give it the input file name, like this:

ccextractor the.sopranos.ts

As for the lack of documentation: There is no lack of documentation! It's just included in the program itself. Just run it without parameters and you will get complete details.

How can I contribute to this project?
There are several ways:
- If you are a developer, since the source code is available, you can fix things or add features yourself and submit a patch.
- If you are a user and find a bug, or have good suggestions, let me know.
- If you are doing your own recordings and have any particular one that CCExtractor can't process correctly, I'd definitely like to take a look at it and try to fix it.
- If you really hate that there is not a lot of documentation, you can write it yourself. I'll answer any question you might have.
- Finally, you can give CCExtractor a good rating at freshmeat.

Does CCExtractor use code from other projects?
Yes. Lots of code came originally from McPoodle's tools (even though it was ported from Perl to C). I've also taken code from MythTV (which in turn took some from other places).
A good thing about Open Source is that you don't need to reinvent the wheel unless you want to (or unless you think you can come up with a 'rounder' wheel).
