Automated Analysis Techniques for Online Conversations with Application in Deception Detection
Abstract (Summary)
Email, chat, instant messaging, blogs, and newsgroups are now common ways for
people to interact. Along with these new ways for sending, receiving, and storing
messages comes the challenge of organizing, filtering, and understanding them, for
which text mining has proven useful, using both content-dependent and
content-independent methods.
Unfortunately, computer-mediated communication has also provided criminals,
terrorists, spies, and other threats to security a means of efficient communication.
However, because these communications are often encoded as text, they may also
make it possible to detect and track those who are deceptive. Two methods for
organizing, filtering, understanding, and detecting deception in text-based
computer-mediated communication (CMC) are presented.
First, message feature mining uses message features or cues in CMC messages
combined with machine learning techniques to classify messages according to the
sender’s intent. The method utilizes common classification methods coupled with
linguistic analysis of messages for extraction of a number of content-independent
input features. A study using message feature mining to classify deceptive and nondeceptive
email messages attained classification accuracy between 60% and 80%.
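As a rough illustration of the idea only, the sketch below extracts a few content-independent cues from a message and feeds them to a classifier. The cue set (word count, words per sentence, average word length, self-reference rate, modal verb rate), the word lists, and the nearest-centroid classifier are all assumptions chosen for brevity; the abstract does not name the specific features or classification methods used in the study.

```python
import re
from statistics import mean

# Hypothetical cue word lists, chosen only for illustration.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
MODALS = {"may", "might", "could", "would", "should", "can", "must"}

def extract_features(message):
    """Return content-independent cues: counts and ratios, not topic words."""
    words = re.findall(r"[A-Za-z']+", message.lower())
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    n_words = len(words) or 1  # avoid division by zero on empty messages
    return [
        float(len(words)),                                # word count
        len(words) / max(len(sentences), 1),              # words per sentence
        mean(len(w) for w in words) if words else 0.0,    # average word length
        sum(w in FIRST_PERSON for w in words) / n_words,  # self-reference rate
        sum(w in MODALS for w in words) / n_words,        # modal verb rate
    ]

def centroid(vectors):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*vectors)]

def train(labeled_messages):
    """Build one centroid per label (a stand-in for any standard classifier)."""
    by_label = {}
    for text, label in labeled_messages:
        by_label.setdefault(label, []).append(extract_features(text))
    return {label: centroid(vs) for label, vs in by_label.items()}

def classify(model, message):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    feats = extract_features(message)

    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(feats, c))

    return min(model, key=lambda label: dist(model[label]))
```

In practice the features would be scaled and the classifier evaluated on held-out messages; the sketch omits both steps.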
Second, speech act profiling is a method for evaluating and visualizing synchronous
CMC by creating profiles of conversations and their participants using speech act
theory and probabilistic classification methods. Transcripts from a large corpus of
speech act annotated conversations are used to train language models and a modified
hidden Markov model (HMM) to obtain probable speech acts for sentences, which
are aggregated for each conversation participant, creating a set of speech act profiles.
Three studies for validating the profiles are detailed as well as two studies showing
speech act profiling’s ability to uncover uncertainty related to deception.
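The aggregation step can be sketched as follows. This is a minimal illustration, assuming each sentence already has a probability distribution over speech acts (in the thesis these come from language models and a modified HMM, which are not reproduced here). The function name `build_profiles`, the input layout, and the normalization to proportions are assumptions, not details taken from the thesis.

```python
from collections import defaultdict

def build_profiles(utterances):
    """Aggregate per-sentence speech act probabilities into per-speaker profiles.

    utterances: iterable of (speaker, {speech_act: probability}) pairs, where
    the probabilities are assumed to come from an upstream classifier.
    Returns {speaker: {speech_act: share of that speaker's probability mass}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for speaker, act_probs in utterances:
        for act, p in act_probs.items():
            totals[speaker][act] += p  # expected count of this act

    profiles = {}
    for speaker, acts in totals.items():
        mass = sum(acts.values())
        # Normalize so profiles are comparable across talkative and quiet speakers.
        profiles[speaker] = (
            {act: p / mass for act, p in acts.items()} if mass else {}
        )
    return profiles
```

Comparing a participant's profile against the conversation average is one way such profiles might be examined, for example to look for an unusually high share of acts that express uncertainty.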
The two methods introduced here are content-independent and represent a possible
new direction in text analysis. Both have possible applications outside the
context of deception. In addition to aiding deception detection, these methods may
also be applicable in information retrieval, technical support training, GSS facilitation
support, transportation security, and information assurance.
Bibliographical Information:
School: The University of Arizona
School Location: USA - Arizona
Source Type: Master's Thesis