Implications of false positives

When the system locates a potential topic change that the human annotator did not mark, there are two possibilities: either there is no break in topic, in which case the system is wrong, or there is indeed a change in topic, in which case the human judge failed to identify it. Given Hearst's experience [6] with lack of inter-annotator agreement, it is certainly conceivable that these experiments, which used only a single human annotator for each test document, contain human error. Hearst notes an interesting phenomenon in her own experiment: the final paragraph of her test document, marked as a new topic by the system, was a summary of the whole article, which she argues is a fair place to mark a new topic. However, only two of her seven judges marked it themselves.

If Hearst's assertion has merit, and the system was correct despite the collective weight of the human judges, then interesting questions arise about the false positives located in this research: it is quite likely that at least some of them represent valid topic changes that the human annotators did not mark.
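To make the consequence concrete, the following minimal sketch shows how precision and recall would shift if some of the reported false positives were in fact valid topic changes missed by the annotator. The function name `adjusted_scores` and the counts used are hypothetical and chosen purely for illustration; they are not results from these experiments. The sketch assumes that each reclassified false positive becomes a true positive, which also enlarges the set of reference boundaries.

```python
def adjusted_scores(tp, fp, fn, reclassified):
    """Return (precision, recall) after treating `reclassified` of the
    false positives as valid boundaries the annotator missed.

    tp, fp, fn are boundary counts measured against the single human
    annotation; all values here are illustrative assumptions.
    """
    if not 0 <= reclassified <= fp:
        raise ValueError("reclassified must lie between 0 and fp")

    new_tp = tp + reclassified   # system boundaries now judged correct
    new_fp = fp - reclassified   # system boundaries still judged wrong

    proposed = new_tp + new_fp   # total boundaries proposed (unchanged)
    reference = new_tp + fn      # reference boundaries (grows with new_tp)

    precision = new_tp / proposed if proposed else 0.0
    recall = new_tp / reference if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts: 6 agreed boundaries, 4 false positives, 2 misses.
    for k in range(0, 5):
        p, r = adjusted_scores(tp=6, fp=4, fn=2, reclassified=k)
        print(f"reclassified={k}: precision={p:.2f}, recall={r:.2f}")
```

Under these assumed counts, every reclassified false positive raises precision directly, while recall rises more slowly because the reference set grows as well. Scores computed against a single annotator are therefore better read as a lower bound on the system's performance than as an exact measure.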

James Ballantine 2005-02-19