usage: segmentation.py [options]

options:
  -h, --help            show this help message and exit
  -n, --numbers
  -b, --handbreaks
  -m, --markuppseudo
  -iBREAKSFILE, --insertbreaks=BREAKSFILE
  -s, --smoothdata
  -d, --detectbreaks
  -fFILENAME, --file=FILENAME
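The layout of the help text looks like output from Python's optparse module. As a rough sketch only, a parser producing an equivalent option list could be declared as below. The flag spellings come from the usage text; the dest names, the defaults, and the build_parser helper are assumptions and may differ from what segmentation.py actually does.

```python
from optparse import OptionParser


def build_parser():
    """Return a parser accepting the options listed in the usage text.

    The dest names are guesses; BREAKSFILE and FILENAME in the help output
    merely suggest that the real dests were 'breaksfile' and 'filename'.
    """
    parser = OptionParser(usage="%prog [options]", prog="segmentation.py")
    # Boolean switches selecting what to print.
    parser.add_option("-n", "--numbers", action="store_true", dest="numbers", default=False)
    parser.add_option("-b", "--handbreaks", action="store_true", dest="handbreaks", default=False)
    parser.add_option("-m", "--markuppseudo", action="store_true", dest="markuppseudo", default=False)
    parser.add_option("-s", "--smoothdata", action="store_true", dest="smoothdata", default=False)
    parser.add_option("-d", "--detectbreaks", action="store_true", dest="detectbreaks", default=False)
    # Options that take a file name as their value.
    parser.add_option("-i", "--insertbreaks", dest="breaksfile")
    parser.add_option("-f", "--file", dest="filename")
    return parser


if __name__ == "__main__":
    options, args = build_parser().parse_args()
    print(options)
```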
The following are common ways to call the system:
segmentation.py -f source_file --smoothdata - print the smoothed similarity data for the file source_file.
segmentation.py -f source_file --numbers - print the unsmoothed (raw) similarity data for the file source_file.
segmentation.py -f source_file --handbreaks - print the human-annotated breaks (if any) found in the file.
segmentation.py -f source_file --detectbreaks - print the detected topic breaks.
segmentation.py -f source_file --markuppseudo - print the original XML file (a subset of it, without metadata) with pseudosentence tags added.
segmentation.py -f source_file --markuppseudo --insertbreaks=breaks_file - print the XML dialogue as above, additionally inserting topic-break tags read from the file breaks_file.
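When several of these calls are needed for one file, they can be driven from a small script. The following is a minimal sketch, assuming segmentation.py is executable and on the PATH. The mode names, segmentation_command, and run_segmentation are hypothetical helpers that only assemble the long options from the usage text.

```python
import subprocess

# Hypothetical mode names mapped to the long options from the usage text.
MODES = {
    "smoothed": "--smoothdata",
    "raw": "--numbers",
    "handbreaks": "--handbreaks",
    "detect": "--detectbreaks",
    "markup": "--markuppseudo",
}


def segmentation_command(source_file, mode, breaks_file=None, script="segmentation.py"):
    """Build the argument list for one segmentation.py call."""
    cmd = [script, "-f", source_file, MODES[mode]]
    if breaks_file is not None:
        # The text only shows --insertbreaks combined with --markuppseudo.
        cmd.append("--insertbreaks=" + breaks_file)
    return cmd


def run_segmentation(source_file, mode, breaks_file=None):
    """Run the script and return whatever it prints on standard output."""
    cmd = segmentation_command(source_file, mode, breaks_file)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, run_segmentation("source_file", "detect") would return the detected topic breaks as text.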
The R scripts may be called in the following manner:
R CMD BATCH twoplot.R
This will attempt to perform graph generation on every file with the extension .txt in the current working directory. Optional files with the same name ending with .txt.auto, .txt.breaks, or .txt.spiky will be found and used to plot detected topic breaks, hand-transcribed topic breaks, and unsmoothed similarity data respectively.
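Before running the R script it can be useful to check which inputs it will pick up. The sketch below mirrors the file-naming rule just described; it is not part of the R script, and find_plot_inputs and the COMPANIONS table are hypothetical names.

```python
from pathlib import Path

# Optional companion files and what each is used to plot.
COMPANIONS = {
    ".auto": "detected topic breaks",
    ".breaks": "hand-transcribed topic breaks",
    ".spiky": "unsmoothed similarity data",
}


def find_plot_inputs(directory="."):
    """Map every *.txt file name to the companion files that exist beside it."""
    inputs = {}
    for txt in sorted(Path(directory).glob("*.txt")):
        found = {}
        for suffix in COMPANIONS:
            # Companions carry the full .txt name plus an extra suffix.
            candidate = txt.with_name(txt.name + suffix)
            if candidate.is_file():
                found[suffix] = candidate
        inputs[txt.name] = found
    return inputs


if __name__ == "__main__":
    for name, found in find_plot_inputs().items():
        extras = ", ".join(COMPANIONS[s] for s in found) or "no optional files"
        print(name + ": " + extras)
```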
The XSLT script may be called as follows:
xsltproc newmicase-html.xsl input_file.xml
This will print well-formed HTML on standard output, which can be redirected to a file.
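The same call can be scripted when many transcripts need converting. This is a minimal sketch, assuming xsltproc is installed and newmicase-html.xsl is in the working directory. The names xslt_command and render_html are hypothetical.

```python
import subprocess


def xslt_command(xml_path, stylesheet="newmicase-html.xsl"):
    """Build the xsltproc argument list in the order shown above."""
    return ["xsltproc", stylesheet, xml_path]


def render_html(xml_path, html_path, stylesheet="newmicase-html.xsl"):
    """Run xsltproc and send its standard output to html_path."""
    with open(html_path, "wb") as out:
        subprocess.run(xslt_command(xml_path, stylesheet), stdout=out, check=True)
```

Calling render_html("input_file.xml", "input_file.html") has the same effect as redirecting the command's output to input_file.html in a shell.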
makegraph.sh must be called as follows:
makegraph.sh input_file output_name
where output_name is the base name of the output files to be created. They will be created with the extensions .txt.auto, .txt.breaks, and .txt.spiky.
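As an illustration of the naming rule, the sketch below derives the file names that one makegraph.sh call is expected to leave behind. It only restates the extensions listed above; makegraph_command and makegraph_outputs are hypothetical helpers, and the script itself is not inspected.

```python
# Extensions that makegraph.sh is described as appending to output_name.
OUTPUT_EXTENSIONS = (".txt.auto", ".txt.breaks", ".txt.spiky")


def makegraph_command(input_file, output_name):
    """Build the argument list for a makegraph.sh call."""
    return ["makegraph.sh", input_file, output_name]


def makegraph_outputs(output_name):
    """List the output file names expected for a given base name."""
    return [output_name + ext for ext in OUTPUT_EXTENSIONS]
```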
buildgraph.sh must simply be called with no arguments, with its input files in the current working directory. It processes every file with the extension .txt, leaving four EPS files with the same base name.
James Ballantine 2005-02-19