Misses

With the current settings, the system is rather conservative in its detection of potential topics. In most cases, the human annotator identified significantly more topics than the system did. This is partly a matter of trough-detection sensitivity: in some cases (such as the first topic break in figure 5.6), a higher sensitivity to troughs would clearly produce higher agreement. This, however, is not the most interesting kind of `miss'. Consider instead the fifth hand-annotated topic break in figure 5.3: it does not lie inside a trough of any description, but rests on a plateau. The textual context of this case is shown in figure 5.5.

Figure 5.5: Context of missed topic change (manual topic change 5) in figure 5.3
\begin{figure}{\tt
S1: \ldots um, and then tomorrow morning when you come
in to ...
...lly easy i'll show you.
you know you know what the web is \ldots
}\end{figure}

The implication of cases such as this is that some topic changes that human subjects identify as such do not register at all under the cosine measure. This suggests that TextTiling, while effective, does not give a complete account of the process of topic change.
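
The distinction between the two kinds of miss can be illustrated with a minimal sketch. It assumes a simplified, TextTiling-style depth score (the rise of the similarity curve on each side of a gap), which is not necessarily the exact scoring used by the system described here. The function names, the `threshold` parameter and the similarity values are all hypothetical and chosen only for illustration.

```python
def side_rises(sims, i):
    """Return (left_rise, right_rise): how far the curve climbs on each side of gap i."""
    # Climb leftwards while the curve does not fall.
    left = sims[i]
    for v in reversed(sims[:i]):
        if v < left:
            break
        left = v
    # Climb rightwards while the curve does not fall.
    right = sims[i]
    for v in sims[i + 1:]:
        if v < right:
            break
        right = v
    return left - sims[i], right - sims[i]


def detect_breaks(sims, threshold):
    """Indices that sit in a trough (curve rises on both sides) deeper than threshold."""
    breaks = []
    for i in range(len(sims)):
        left_rise, right_rise = side_rises(sims, i)
        if left_rise > 0 and right_rise > 0 and left_rise + right_rise > threshold:
            breaks.append(i)
    return breaks


# Made-up cosine similarities between adjacent blocks: a plateau at
# indices 1-4 followed by a clear trough at index 5.
sims = [0.20, 0.60, 0.60, 0.60, 0.60, 0.15, 0.55]

conservative = detect_breaks(sims, threshold=0.5)
most_sensitive = detect_breaks(sims, threshold=0.0)
plateau_rises = side_rises(sims, 2)
```

In this toy series the trough at index 5 is found at either threshold, whereas a manual break placed at index 2 has zero rise on both sides and is missed even at the most sensitive setting. Under these assumptions, lowering the threshold can recover shallow troughs but can never recover a break that rests on a plateau.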

James Ballantine 2005-02-19