Although most data today is still stored as text, growing storage capacity makes it increasingly practical to record and archive audio. Techniques for organising and managing data are therefore becoming more applicable to recorded speech: meeting room recordings, telephone conversation archives, broadcast news, and similar material. Audio is also harder to navigate than written text; consider the difficulty of finding the start of a chapter or scene in a book-on-tape or other spoken recording. An audio recording cannot be 'scanned' visually to locate an area of interest in the way that even unformatted text can. The ability to navigate within a large speech recording to find an area of interest would therefore greatly improve access to this kind of data.
Natural spoken dialogue can serve different purposes: a manager and an employee may talk in order to communicate instructions for a particular task, while friends may talk to synchronise their knowledge of interesting experiences and gossip. Regardless of its purpose, however, if a dialogue imparts meaning (as we would normally hope) then it covers one or more topics. In the simplest case, a dialogue addresses a single topic, and this topic can be identified by anyone following the conversation. For example, a conversation between a customer and a shopkeeper about the availability of a product may cover a single topic. A longer conversation, however, will typically cover several topics in succession: it may have a main topic to which it returns repeatedly, digressing to other topics at times; or it may follow a chain of related topics, moving away from the original subject completely.
James Ballantine 2005-02-19