Enriching the Digital Library Experience: Innovations with Named Entity Recognition and Geographic Information System Technologies

by MacKay, Adrienne W.

Abstract (Summary)
Digital libraries are seeking innovative ways to share their resources and enhance the user experience, and numerous openly available technologies can be exploited to this end. For this project, named entity recognition (NER) technology was applied to a subset of the Documenting the American South (DocSouth) digital collections. Personal and location names were hand-annotated to create a gold standard, and GATE, a text engineering tool, was run under two conditions: a baseline using default settings and a test run that included gazetteers built from DocSouth's Colonial and State Records collection. Overall, GATE performance is promising, and numerous strategies for improvement are discussed. Next, the derived location annotations were georeferenced and stored in a geodatabase through automated processes, and a prototype for a web-based map search was developed using the Google Maps API. This project showcases innovations with automated NER coupled with geographic information system (GIS) technologies, and strongly supports further investment in applying these techniques across DocSouth and other digital libraries.
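
As an illustration of how automated annotations might be compared against a hand-annotated gold standard, the following is a minimal sketch in Python. It is not the thesis's evaluation code and does not use GATE. The exact-match criterion, the Span type, the score function, and the sample offsets are all assumptions made for this example.

```python
from typing import Dict, NamedTuple, Set


class Span(NamedTuple):
    """One annotation: character offsets plus an entity type (hypothetical layout)."""
    start: int
    end: int
    label: str


def score(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 of predicted spans against gold spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up sample data: three gold annotations, three system annotations,
# two of which match the gold standard exactly.
gold = {Span(0, 14, "Person"), Span(25, 32, "Location"), Span(40, 46, "Location")}
predicted = {Span(0, 14, "Person"), Span(25, 32, "Location"), Span(50, 55, "Location")}

result = score(gold, predicted)
print(result)
```

Under this scheme, the two run conditions (defaults versus added gazetteers) could each be scored against the same gold set and the results compared. A more lenient variant might also credit partially overlapping spans.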
Bibliographical Information:

Advisor: Hugh A. Cayless

School: University of North Carolina at Chapel Hill

School Location: USA - North Carolina

Source Type: Master's Thesis

Keywords: data mining, digital libraries, geographic information systems, retrieval, world wide web


Date of Publication: 07/21/2008
