Comparative Analysis of the Main SaaS Algorithms
for Named Entity Recognition Applied for Romanian Language
The Academy of Economic Studies,
6 Piața Romană, 010374 Bucharest, Romania
Abstract: This paper proposes a comparative analysis of the main Named Entity Recognition (NER) algorithms available in the cloud, applied to texts written in Romanian. The context of this analysis is the semantic web, where the problem of identifying new entities and linking them to existing ontologies persists. Processes are defined that allow text written in Romanian to be translated into one of the languages supported by the algorithms provided by DBpedia (DBpedia Spotlight), Google (Google Cloud Natural Language API), Microsoft (the NER module from Azure Machine Learning Studio) and IBM (IBM Watson Natural Language Understanding); the F1 score is then computed in order to identify the optimal process. The article ends with a comparison between the obtained results and the performance achieved by NER algorithms that are specialized for English or are language independent.
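As an illustration of the F1 score mentioned in the abstract, the following minimal sketch computes precision, recall and F1 for one set of recognized entities against a gold annotation. It assumes exact matching on (text, type) pairs, which may differ from the matching criterion used in the paper; the function name `f1_score` and the sample entities are hypothetical and serve only as an example.

```python
def f1_score(predicted, gold):
    """Return (precision, recall, F1) for two collections of entities.

    Assumption: an entity counts as correct only if both its text and
    its type match a gold entity exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotation and hypothetical output of one NER service.
gold = {
    ("Bucharest", "LOCATION"),
    ("Romania", "LOCATION"),
    ("Ion Popescu", "PERSON"),
}
predicted = {
    ("Bucharest", "LOCATION"),
    ("Romania", "ORGANIZATION"),  # right span, wrong type: counted as an error
    ("Ion Popescu", "PERSON"),
}

precision, recall, f1 = f1_score(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Running the same function on the output of each translation-plus-NER process would give one F1 value per process, which can then be compared to pick the best-performing one.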
Keywords: Semantic web, NER, LOD, SaaS.
CITE THIS PAPER AS:
Bogdan IANCU, Comparative Analysis of the Main SaaS Algorithms for Named Entity Recognition Applied for Romanian Language, Romanian Journal of Information Technology and Automatic Control, ISSN 1220-1758, vol. 28(1), pp. 25-34, 2018.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.