XML Information Retrieval and Information Extraction
Abstract (Summary)
We present a new query language for information retrieval in XML documents and discuss its combination with information extraction (IE) methods. XIRQL is an XML query language that implements IR-related features such as weighting and ranking, relevance-oriented search, datatypes with vague predicates, and structural relativism. For information extracted from texts, XIRQL can rank records based on uncertainty weights, and single conditions may be evaluated using vague predicates for fact retrieval. When IE is used for automatic XML markup of plain texts, XIRQL can take into account the uncertainty weights resulting from this process, and the markup leads to increased precision of text searches.
In: Text mining : theoretical aspects and applications ; [contributions presented to an international workshop on April 26 and 27, 2002 at the DaimlerChrysler AG Research Center in Ulm] / Jürgen Franke ... (Eds.) - Heidelberg [et al.] : Physica-Verl., 2003, pp. 21-32
Bibliographical Information:
Advisor: none
School: Universität Duisburg-Essen, Standort Essen
School Location: Germany
Source Type: Master's Thesis
Keywords: computer science, data processing, Universität Duisburg-Essen
ISBN:
Date of Publication: 07/19/2004