Current Version: 0.65

March 14th, 2013

Command Line Usage

CCExtractor's main program is console based. There's a GUI for Windows, as well

as provisions so other programs can easily interface with CCExtractor, but the

heavy lifting is done by a command line program (that can be called by scripts so

integration with larger processes is straightforward).
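
As a rough illustration of calling CCExtractor from a script, the Python sketch below builds an argument list using only the -o and -out=format options described in the help screen. The helper names and the assumption that the executable is reachable as "ccextractor" on the PATH are illustrative, not part of CCExtractor itself.

```python
import subprocess


def build_ccextractor_command(input_file, output_file=None, out_format=None,
                              executable="ccextractor"):
    """Build the argument list for one CCExtractor run.

    'executable' is an assumption: adjust it to wherever the binary lives.
    """
    command = [executable, input_file]
    if out_format is not None:
        # -out=format selects the output format (srt, sami, bin, raw, ...).
        command.append(f"-out={out_format}")
    if output_file is not None:
        # -o overrides the default output file name.
        command.extend(["-o", output_file])
    return command


def run_ccextractor(command):
    """Run the command and return the completed process (stdout/stderr captured)."""
    return subprocess.run(command, capture_output=True, text=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with file names that contain spaces.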

Running CCExtractor without any parameters will display a help screen with all the

options. As of version 0.60 the help screen is as follows:

CCExtractor 0.60, Carlos Fernandez Sanz, Volker Quetschke.

--------------------------------------------------------------------------

Originally based on McPoodle's tools. Check his page for lots of information

on closed captions technical details.

(http://www.geocities.com/mcpoodle43/SCC_TOOLS/DOCS/SCC_TOOLS.HTML)

This tool home page:

http://ccextractor.sourceforge.net

  Extracts closed captions from MPEG files.

    (DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo and Dish Network

      are known to work).

  Syntax:

  ccextractor inputfile1 [-o outputfilename]

               [-o1 outputfilename1] [-o2 outputfilename2]

File name related options:

            inputfile: file(s) to process

    -o outputfilename: Use -o parameters to define output filename if you don't

                       like the default ones (same as infile plus _1 or _2 when

                       needed and .raw or .srt extension).

                           -o or -o1 -> Name of the first (maybe only) output

                                        file.

                           -o2       -> Name of the second output file, when

                                        it applies.

         -cf filename: Write 'clean' data to a file. Clean means the ES

                       without TS or PES headers.

              -stdout: Write output to stdout (console) instead of file. If

                       stdout is used, then -o, -o1 and -o2 can't be used. Also

                       -stdout will redirect all messages to stderr (error).

You can pass as many input files as you need. They will be processed in order.

If a file name is suffixed by +, ccextractor will try to follow a numerical

sequence. For example, DVD001.VOB+ means DVD001.VOB, DVD002.VOB and so on

until there are no more files.

Output will be one single file (either raw or srt). Use this if you made your

recording in several cuts (to skip commercials for example) but you want one

subtitle file with contiguous timing.
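
The "+" suffix described above can be pictured with the following Python sketch. It is not CCExtractor's actual code: it assumes the sequence is formed by incrementing the last run of digits in the file name (before the extension) and that the digit width is preserved. The function names are hypothetical.

```python
import os
import re


def next_in_sequence(filename):
    """Return the next file name in the sequence, or None if there are no digits."""
    directory, base = os.path.split(filename)
    stem, extension = os.path.splitext(base)
    matches = list(re.finditer(r"\d+", stem))
    if not matches:
        return None
    last = matches[-1]
    digits = last.group()
    # Keep the original zero padding: 001 -> 002, 009 -> 010.
    incremented = str(int(digits) + 1).zfill(len(digits))
    new_stem = stem[:last.start()] + incremented + stem[last.end():]
    return os.path.join(directory, new_stem + extension)


def expand_sequence(argument, exists=os.path.exists):
    """Expand 'DVD001.VOB+' into the list of consecutive files that exist."""
    if not argument.endswith("+"):
        return [argument]
    current = argument[:-1]
    files = []
    while current is not None and exists(current):
        files.append(current)
        current = next_in_sequence(current)
    return files
```

The 'exists' parameter only exists so the sketch can be tried without real files, for example by passing the membership test of a set of names.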

Options that affect what will be processed:

          -1, -2, -12: Output Field 1 data, Field 2 data, or both

                       (DEFAULT is -1)

                 -cc2: When in srt/sami mode, process captions in channel 2

                       instead of channel 1.

In general, if you want English subtitles you don't need to use these options

as they are broadcast in field 1, channel 1. If you want the second language

(usually Spanish) you may need to try -2, or -cc2, or both.

Input formats:

       With the exception of McPoodle's raw format, which is just the closed

       caption data with no other info, CCExtractor can usually detect the

       input format correctly. To force a specific format:

                  -in=format

       where format is one of these:

                       ts   -> For Transport Streams.

                       ps   -> For Program Streams.

                       es   -> For Elementary Streams.

                       asf  -> ASF container (such as DVR-MS).

                       bin  -> CCExtractor's own binary format.

                       raw  -> For McPoodle's raw files.

       -ts, -ps, -es and -asf (or --dvr-ms) can be used as shortcuts.

Output formats:

                 -out=format

       where format is one of these:

                       srt    -> SubRip (default, so not actually needed).

                       sami   -> MS Synchronized Accessible Media Interchange.

                       bin    -> CC data in CCExtractor's own binary format.

                       raw    -> CC data in McPoodle's Broadcast format.

                       dvdraw -> CC data in McPoodle's DVD format.

                       txt    -> Transcript (no time codes, no roll-up

                                 captions, just the plain transcription).

                       ttxt   -> Timed Transcript (transcription with time info)

                       null   -> Don't produce any file output

Options that affect how input files will be processed:

        -gt --goptime: Use GOP for timing instead of PTS. This only applies

                       to Program or Transport Streams with MPEG2 data and

                       overrides the default PTS timing.

                       GOP timing is always used for Elementary Streams.

     -fp --fixpadding: Fix padding - some cards (or providers, or whatever)

                       seem to send 0000 as CC padding instead of 8080. If you

                       get bad timing, this might solve it.

               -90090: Use 90090 (instead of 90000) as MPEG clock frequency.

                       (reported to be needed at least by Panasonic DMR-ES15

                       DVD Recorder)

    -ve --videoedited: By default, ccextractor will process input files in

                       sequence as if they were all one large file (i.e.

                       split by a generic, non video-aware tool). If you

                       are processing video that was split with an editing

                       tool, use -ve so ccextractor doesn't try to rebuild

                       the original timing.

   -s --stream [secs]: Consider the file as a continuous stream that is

                       growing as ccextractor processes it, so don't try

                       to figure out its size and don't terminate processing

                       when reaching the current end (i.e. wait for more

                       data to arrive). If the optional parameter secs is

                       present, it means the number of seconds without any

                       new data after which ccextractor should exit. Use

                       this parameter if you want to process a live stream

                       but not kill ccextractor externally.

                       Note: If -s is used then only one input file is

                       allowed.

  -poc  --usepicorder: Use the pic_order_cnt_lsb in AVC/H.264 data streams

                       to order the CC information.  The default way is to

                       use the PTS information.  Use this switch only when

                       needed.

                -myth: Force MythTV code branch.

              -nomyth: Disable MythTV code branch.

                       The MythTV branch is needed for analog captures where

                       the closed caption data is stored in the VBI, such as

                       those with bttv cards (Hauppauge 250 for example). This is

                       detected automatically so you don't need to worry about

                       this unless autodetection doesn't work for you.

       -wtvconvertfix: This switch works around a bug in Windows 7's built in

                       software to convert *.wtv to *.dvr-ms. For analog NTSC

                       recordings the CC information is marked as digital

                       captions. Use this switch only when needed.

 -pn --program-number: In TS mode, specifically select a program to process.

                       Not needed if the TS only has one. If this parameter

                       is not specified and CCExtractor detects more than one

                       program in the input, it will list the programs found

                       and terminate without doing anything.

    -haup --hauppauge: If the video was recorded using a Hauppauge card, it might

                       need special processing. This parameter will force

                       the special treatment.

Options that affect what kind of output will be produced:

             -unicode: Encode subtitles in Unicode instead of Latin-1

                -utf8: Encode subtitles in UTF-8 instead of Latin-1

  -nofc --nofontcolor: For .srt/.sami, don't add font color tags.

-nots --notypesetting: For .srt/.sami, don't add typesetting tags.

                -trim: Trim lines.

   -dc --defaultcolor: Select a different default color (instead of

                       white). This causes all output in .srt/.smi

                       files to have a font tag, which makes the files

                       larger. Add the color you want in RGB, such as

                       -dc #FF0000 for red.

    -sc --sentencecap: Sentence capitalization. Use if you hate

                       ALL CAPS in subtitles.

  --capfile -caf file: Add the contents of 'file' to the list of words

                       that must be capitalized. For example, if file

                       is a plain text file that contains

                       Tony

                       Alan

                       Whenever those words are found they will be written

                       exactly as they appear in the file.

                       Use one line per word. Lines starting with # are

                       considered comments and discarded.
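
The capitalization list described above could be handled roughly as in this Python sketch. It is an illustration only: matching words case-insensitively and skipping blank lines are assumptions, and the function names are hypothetical.

```python
import re


def load_cap_words(lines):
    """Map each lowercased word to the spelling given in the file."""
    words = {}
    for line in lines:
        word = line.strip()
        # Lines starting with # are comments; blank lines are skipped (assumption).
        if not word or word.startswith("#"):
            continue
        words[word.lower()] = word
    return words


def apply_cap_words(text, words):
    """Rewrite every listed word exactly as it appears in the list."""
    def fix(match):
        token = match.group()
        return words.get(token.lower(), token)

    return re.sub(r"[A-Za-z']+", fix, text)
```

With a list containing Tony and Alan, a line such as "tony met alan today" would come out as "Tony met Alan today".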

Options that affect how ccextractor reads and writes (buffering):

    -bi --bufferinput: Forces input buffering.

 -nobi -nobufferinput: Disables input buffering.

Note: -bo is only used when writing raw files, not .srt or .sami

Options that affect the built-in closed caption decoder:

                 -dru: Direct Roll-Up. When in roll-up mode, write character by

                       character instead of line by line. Note that this

                       produces (much) larger files.

     -noru --norollup: If you hate the repeated lines caused by the roll-up

                       emulation, you can have ccextractor write only one

                       line at a time, getting rid of these repeated lines.

Options that affect timing:

            -delay ms: For srt/sami, add this number of milliseconds to

                       all times. For example, -delay 400 makes subtitles

                       appear 400ms late. You can also use negative numbers

                       to make subs appear early.

Notes on times: -startat and -endat times are used first, then -delay.

So if you use -srt -startat 3:00 -endat 5:00 -delay 120000, ccextractor will

generate a .srt file, with only data from 3:00 to 5:00 in the input file(s)

and then add that (huge) delay, which would make the final file start at

5:00 and end at 7:00.
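
The order of operations in the note above (select by -startat and -endat first, then apply -delay) can be sketched in Python as follows. The caption representation (start, end, text in milliseconds) and the handling of captions exactly on a boundary are assumptions made for the example.

```python
def select_and_delay(captions, start_ms, end_ms, delay_ms):
    """Keep captions starting within [start_ms, end_ms], then shift them by delay_ms.

    captions: list of (start_ms, end_ms, text) tuples.
    """
    result = []
    for cap_start, cap_end, text in captions:
        # Step 1: the -startat / -endat selection uses the original times.
        if cap_start < start_ms or cap_start > end_ms:
            continue
        # Step 2: -delay is applied afterwards to whatever was kept.
        result.append((cap_start + delay_ms, cap_end + delay_ms, text))
    return result
```

With start_ms=180000 (3:00), end_ms=300000 (5:00) and delay_ms=120000, a caption originally at 3:00 ends up at 5:00, matching the example in the note.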

Options that affect what segment of the input file(s) to process:

        -startat time: Only write caption information that starts after the

                       given time.

                       Time can be seconds, MM:SS or HH:MM:SS.

                       For example, -startat 3:00 means 'start writing from

                       minute 3'.

          -endat time: Stop processing after the given time (same format as

                       -startat).

                       The -startat and -endat options are honored in all

                       output formats.  In all formats with timing information

                       the times are unchanged.

-scr --screenfuls num: Write 'num' screenfuls and terminate processing.
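
The three time formats accepted by -startat and -endat (seconds, MM:SS or HH:MM:SS) can be converted to seconds as in this Python sketch. It is an illustration of the formats, not CCExtractor's own parser, and rejecting anything else with an error is an assumption.

```python
def parse_time(value):
    """Convert 'S', 'MM:SS' or 'HH:MM:SS' to a number of seconds."""
    parts = value.split(":")
    if not 1 <= len(parts) <= 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"unsupported time format: {value!r}")
    seconds = 0
    for part in parts:
        # Each additional field to the left is worth 60 times more.
        seconds = seconds * 60 + int(part)
    return seconds
```

For instance, parse_time("3:00") gives 180, which corresponds to the "minute 3" example for -startat.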

Adding start and end credits:

  CCExtractor can _try_ to add a custom message (for credits for example) at

  the start and end of the file, looking for a window where there are no

  captions. If there is no such window, then no text will be added.

  The start window must be between the times given and must have enough time

  to display the message for at least the specified time.

        --startcreditstext txt: Write this text as start credits. If there are

                                several lines, separate them with the

                                characters \n, for example Line1\nLine 2.

  --startcreditsnotbefore time: Don't display the start credits before this

                                time (S, or MM:SS). Default: 0

   --startcreditsnotafter time: Don't display the start credits after this

                                time (S, or MM:SS). Default: 5:00

 --startcreditsforatleast time: Start credits need to be displayed for at least

                                this time (S, or MM:SS). Default: 2

  --startcreditsforatmost time: Start credits should be displayed for at most

                                this time (S, or MM:SS). Default: 5

          --endcreditstext txt: Write this text as end credits. If there are

                                several lines, separate them with the

                                characters \n, for example Line1\nLine 2.

   --endcreditsforatleast time: End credits need to be displayed for at least

                                this time (S, or MM:SS). Default: 2

    --endcreditsforatmost time: End credits should be displayed for at most

                                this time (S, or MM:SS). Default: 5
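
The search for a start credits window can be pictured with the Python sketch below. It is only one possible reading of the options: it assumes the whole display must fit between the "not before" and "not after" times, that the first suitable gap is taken, and that captions are given as (start, end) pairs in seconds. The function name is hypothetical; the default values are the ones listed above.

```python
def find_start_credits_window(captions, not_before=0, not_after=300,
                              at_least=2, at_most=5):
    """Return (start, end) of the first caption-free gap that fits, or None."""
    free_from = not_before
    for cap_start, cap_end in sorted(captions):
        # The gap runs from free_from to the next caption (or the time limit).
        gap_end = min(cap_start, not_after)
        if gap_end - free_from >= at_least:
            return (free_from, min(free_from + at_most, gap_end))
        free_from = max(free_from, cap_end)
        if free_from >= not_after:
            return None
    # No more captions: check the remaining time up to the limit.
    if not_after - free_from >= at_least:
        return (free_from, min(free_from + at_most, not_after))
    return None
```

When the function returns None, that corresponds to the case where no text is added because there is no suitable window.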

Options that affect debug data:

               -debug: Show lots of debugging output.

                 -608: Print debug traces from the EIA-608 decoder.

                       If you need to submit a bug report, please send

                       the output from this option.

                 -708: Print debug information from the (currently

                       in development and useless) EIA-708 (DTV) decoder.

              -goppts: Enable lots of time stamp output.

            -xdsdebug: Enable XDS debug data (lots of it).

               -vides: Print debug info about the analysed elementary

                       video stream.

               -cbraw: Print debug trace with the raw 608/708 data with

                       time stamps.

              -nosync: Disable the syncing code.  Only useful for debugging

                       purposes.

             -fullbin: Disable the removal of trailing padding blocks

                       when exporting to bin format.  Only useful for

                       debugging purposes.

          -parsedebug: Print debug info about the parsed container

                       file. (Only for TS/ASF files at the moment.)

Communication with other programs and console output:

   --gui_mode_reports: Report progress and interesting events to stderr

                       in an easy-to-parse format. This is intended to be

                       used by other programs. See docs directory for

                       details.

    --no_progress_bar: Suppress the output of the progress bar

               -quiet: Don't write any message.

Error: (This help screen was shown because there were no input files)