Romanian Journal of Information Technology and Automatic Control / Vol. 21, No. 4, 2011


Integrated System for Developing Semantically-enhanced Archive eContent

Mihaela DINȘOREANU, Ioan SALOMIE, Cristina Bianca POP

Abstract:

This paper addresses the problem of processing knowledge from historical documents held in archives. We propose an integrated solution that performs information extraction and knowledge acquisition on the one hand, and information and knowledge retrieval on the other. We present a method that adapts the Text2Onto framework to semi-automatically extract relevant information from document content through lexical and semantic text annotation. The resulting semantic annotations populate a domain ontology, which is then used for information and knowledge retrieval. We also present a method for querying the digital knowledge base of historical documents in natural-language Romanian. The method is augmented with suggestions and word-sense disambiguation. We tested and validated the integrated solution on a set of documents on the history of Transylvania.

Keywords:
knowledge acquisition, semantic annotation, knowledge retrieval, natural language query.

CITE THIS PAPER AS:
Mihaela DINȘOREANU, Ioan SALOMIE, Cristina Bianca POP, "Integrated System for Developing Semantically-enhanced Archive eContent", Romanian Journal of Information Technology and Automatic Control, ISSN 1220-1758, vol. 21(4), pp. 67-77, 2011.