Automated Analysis Techniques for Online Conversations with Application in Deception Detection

by Twitchell, Douglas P.

Abstract (Summary)
Email, chat, instant messaging, blogs, and newsgroups are now common ways for people to interact. Along with these new ways of sending, receiving, and storing messages comes the challenge of organizing, filtering, and understanding them, a task for which text mining has been shown to be useful using both content-dependent and content-independent methods. Unfortunately, computer-mediated communication (CMC) has also given criminals, terrorists, spies, and other threats to security an efficient means of communication. However, because these communications are often encoded as text, they may also make it possible to detect and track those who are deceptive. Two methods for organizing, filtering, understanding, and detecting deception in text-based CMC are presented. First, message feature mining uses message features, or cues, in CMC messages combined with machine learning techniques to classify messages according to the sender’s intent. The method couples common classification methods with linguistic analysis of messages to extract a number of content-independent input features. A study using message feature mining to classify deceptive and nondeceptive email messages attained classification accuracy between 60% and 80%. Second, speech act profiling is a method for evaluating and visualizing synchronous CMC by creating profiles of conversations and their participants using speech act theory and probabilistic classification methods. Transcripts from a large corpus of speech-act-annotated conversations are used to train language models and a modified hidden Markov model (HMM) to obtain probable speech acts for sentences, which are then aggregated for each conversation participant to create a set of speech act profiles. Three studies validating the profiles are detailed, as well as two studies showing speech act profiling’s ability to uncover uncertainty related to deception.
The methods introduced here are two content-independent methods that represent a possible new direction in text analysis. Both have possible applications outside the context of deception. In addition to aiding deception detection, these methods may also be applicable in information retrieval, technical support training, GSS facilitation support, transportation security, and information assurance.
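
The aggregation step of speech act profiling described in the abstract can be illustrated with a minimal sketch. It assumes that per-sentence speech act probabilities are already available (in the thesis these come from language models and a modified HMM, which are not reproduced here) and that a participant's profile is the average probability of each speech act over that participant's sentences. The averaging rule, the function name, and the speech act labels are illustrative assumptions, not the author's implementation.

```python
from collections import defaultdict


def speech_act_profiles(sentences):
    """Aggregate per-sentence speech act probabilities into per-participant profiles.

    `sentences` is an iterable of (participant, {speech_act: probability}) pairs.
    Averaging over each participant's sentences is an assumption made for this sketch.
    """
    totals = defaultdict(lambda: defaultdict(float))  # participant -> act -> summed probability
    counts = defaultdict(int)                         # participant -> number of sentences

    for participant, probs in sentences:
        counts[participant] += 1
        for act, p in probs.items():
            totals[participant][act] += p

    # Divide each summed probability by the participant's sentence count.
    return {
        who: {act: total / counts[who] for act, total in acts.items()}
        for who, acts in totals.items()
    }


# Hypothetical conversation: labels and probabilities are made up for illustration.
conversation = [
    ("alice", {"statement": 0.5, "question": 0.5}),
    ("alice", {"statement": 1.0}),
    ("bob", {"question": 1.0}),
]
profiles = speech_act_profiles(conversation)
print(profiles["alice"])  # {'statement': 0.75, 'question': 0.25}
```

Comparing such profiles across participants, or against an average over many conversations, is one way the visualization and evaluation mentioned in the abstract could be approached.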
Bibliographical Information:


School:The University of Arizona

School Location:USA - Arizona

Source Type:Master's Thesis



Date of Publication:

© 2009 All Rights Reserved.