NLP-Driven Semi-automatic Ontology Learning Approach for Academic Biographies
Biralatei Fawei *
Department of Computer Science, Niger Delta University, Amassoma, PMB 581, Bayelsa State, Nigeria.
Patrick Kenekayoro
Department of Computer Science, Niger Delta University, Amassoma, PMB 581, Bayelsa State, Nigeria.
*Author to whom correspondence should be addressed.
Abstract
Most biographical texts are lengthy and unstructured, and are often written by the individual authors. Moreover, research has shown the importance of analysing this textual data to extract semantic meaning and connections among researchers. This study adopted the Stanford CoreNLP natural language processing toolkit to efficiently identify and extract named entities and semantic triples, which were then mapped to academic biography concepts and relationships to build a structured biography knowledge base. The machine-readable knowledge base was evaluated with the OntOlogy Pitfall Scanner (OOPS), an online ontology evaluation tool, to check its consistency, structure, lexical patterns and general quality. The resulting output was consistent, and the knowledge base supports complex queries in the SPARQL query language that can efficiently retrieve connections between named entities. This in turn provides a useful data source for future research in informetrics and scientometrics, as well as for identifying relationships and collaborations in research.
Keywords: Biographies, ontology, SPARQL, named entity recognition