
XML Information Retrieval and Information Extraction

by Fuhr, Norbert

Abstract (Summary)
We present a new query language for information retrieval (IR) in XML documents and discuss its combination with information extraction (IE) methods. XIRQL is an XML query language which implements IR-related features such as weighting and ranking, relevance-oriented search, datatypes with vague predicates, and structural relativism. For information extracted from texts, XIRQL can rank records based on uncertainty weights, and single conditions may be evaluated using vague predicates for fact retrieval. When IE is used for automatic XML markup of plain texts, XIRQL is able to consider uncertainty weights resulting from this process, and the markup leads to increased precision of text searches.

In:

Text mining : theoretical aspects and applications ; [contributions presented to an international workshop on April 26 and 27, 2002 at the DaimlerChrysler AG Research Center in Ulm] / Jürgen Franke ... (Eds.) - Heidelberg [et al.] : Physica-Verl., 2003, pp. 21-32

Bibliographical Information:

Advisor: none

School: Universität Duisburg-Essen, Essen campus

School Location: Germany

Source Type: Master's Thesis

Keywords: computer science, data processing, Universität Duisburg-Essen

ISBN:

Date of Publication: 07/19/2004
