AUTOMATIC EXTRACTION OF AUTHOR SELF CONTRIBUTED METADATA FOR ELECTRONIC THESES AND DISSERTATIONS

by Ni, Mao

Abstract (Summary)
This paper discusses the design and implementation of an automatic method for extracting metadata from PDF files during submission to an Electronic Theses and Dissertations (ETD) system. At submission time, each ETD system requires metadata about the thesis so that it can be found by metadata search once it is archived. This metadata, such as creator, title, date, abstract, subject, and publisher, complies with the Dublin Core Metadata Initiative. In most existing ETD repositories, students must type in this metadata manually, which discourages submission, especially when errors found in a thesis force a resubmission, because all the metadata has to be retyped with every submission.

By standardizing a method for capturing metadata from the original documents, our project aims to enable the digital repository that hosts the ETD collection to extract the metadata from the theses automatically, making submission much easier and more convenient for students.

Bibliographical Information:

Advisor: Bradley M. Hemminger

School: University of North Carolina at Chapel Hill

School Location: USA - North Carolina

Source Type: Master's Thesis

Keywords: metadata, automatic extraction, author self contributed, digital library, electronic theses and dissertations, etds

Date of Publication: 05/04/2004
