A SYSTEM FOR AUTOMATED CONTENT ORGANIZATION
The main goal of Information Retrieval (IR) is to facilitate access to information in large document collections. Starting from a user’s query, usually expressed in natural language, a classic IR system retrieves a set of items relevant to the query and displays them as a ranked list. Search engines are examples of IR systems. They are effective at finding specific items, but results for less specific queries tend to be off-target, overwhelming, and less useful. In this thesis, we report the design, prototyping, and our experience with an experimental system called ACOSys for automated organization of content using menu/folder hierarchies based on a mathematical theory called Formal Concept Analysis (FCA). ACOSys uses FCA concepts and the structure of a hierarchical menu to categorize search results into more specific groups. Users can find the items they want by quickly zeroing in on the subfolders where those items may reside, rather than browsing through thousands of off-target items. The technical contribution of this thesis consists of the design and implementation of algorithms in three related categories. First, we develop and implement a new algorithm for generating concepts and rules. We show by both theoretical and practical study that it is efficient for both sparse and dense contexts. Second, we develop an algorithm for maintaining and updating concept lattices. In contrast to other incremental algorithms, our algorithm updates not only the concept set but also the menu/folder structure, so additional items can be added incrementally without rebuilding the hierarchy. Third, instead of using simple string matching, we provide a semi-automated process for keyword selection, in which a user makes decisions based on measures such as the word-distribution statistics of a collection.
Using the indexing strategy of Berkeley DB (a database system), ACOSys constructs context-sensitive menu hierarchies in seconds, which makes it practical for large numbers of objects and attributes.
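To make the FCA terminology in the abstract concrete, the following is a minimal Python sketch of how formal concepts arise from a context of documents (objects) and keywords (attributes). It is not the thesis's algorithm: it uses naive enumeration over attribute subsets, which is exponential and only suitable for toy inputs. The context data and the function names `extent`, `intent`, and `concepts` are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy context: each document (object) maps to its keywords (attributes).
CONTEXT = {
    "doc1": {"fca", "lattice"},
    "doc2": {"fca", "retrieval"},
    "doc3": {"fca", "lattice", "retrieval"},
    "doc4": {"retrieval"},
}


def extent(context, attrs):
    """Return the objects that have every attribute in attrs."""
    return frozenset(obj for obj, have in context.items() if attrs <= have)


def intent(context, objs):
    """Return the attributes shared by every object in objs."""
    sets = [context[obj] for obj in objs]
    if not sets:
        # By convention, the empty object set shares all attributes.
        return frozenset(set().union(*context.values()))
    return frozenset(set.intersection(*sets))


def concepts(context):
    """Naively enumerate all formal concepts as (extent, intent) pairs.

    Every attribute subset is closed by taking its extent and then the
    intent of that extent; duplicates collapse in the result set.
    """
    all_attrs = sorted(set().union(*context.values()))
    found = set()
    for size in range(len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            objs = extent(context, set(combo))
            found.add((objs, intent(context, objs)))
    return found


if __name__ == "__main__":
    # Print concepts from the most general (largest extent) to the most specific.
    for objs, attrs in sorted(concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(attrs), "->", sorted(objs))
```

Under this reading, each concept's intent could serve as a folder label and its extent as the folder's contents, with more specific intents nested under more general ones. How ACOSys actually derives its menu hierarchy from the lattice is described in the thesis itself and is not reproduced here.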
School: Case Western Reserve University
School Location: USA - Ohio
Source Type: Master's Thesis
Keywords: acosys fca content organization information retrieval hierarchical menu concepts context berkeley db
Date of Publication: 01/01/2006