Developing classification-based named entity recognizers (NER) for Sambalpuri and Odia applying support vector machines (SVM)

Authors

  • Pitambar Behera, Jawaharlal Nehru University, India
  • Sharmin Muzaffar, Aligarh Muslim University, India

DOI:

https://doi.org/10.3126/nl.v33i1.41066

Keywords:

NER, Sambalpuri, NLP, Odia, SVM, Machine Learning, Indo-Aryan languages, Information Retrieval, Natural Language Processes

Abstract

This paper presents the development of Named Entity Recognizers (NER) for Sambalpuri and Odia using Support Vector Machines (SVM). The Sambalpuri corpus comprises 112k word tokens, of which 5,887 are named entities. For Odia, the 250k-token ILCI corpus was used, of which 18,447 tokens are named entities. The Sambalpuri recognizer achieves 96.72% accuracy, and the Odia recognizer achieves 98.10%.
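
To put the corpus figures in proportion, the short sketch below computes the share of named-entity tokens in each corpus and the accuracy a trivial tagger would reach by labelling every token as a non-entity. It rests on assumptions that the abstract does not confirm: that "112k" and "250k" can be treated as exact token counts, that the named-entity counts are token counts in both corpora, and that the reported accuracy is measured per token. The function names are hypothetical and are not taken from the paper.

```python
# Corpus figures quoted in the abstract. "112k" and "250k" are rounded there,
# so treating them as exact token counts is an assumption of this sketch.
CORPORA = {
    "Sambalpuri": {"tokens": 112_000, "entities": 5_887, "reported": 96.72},
    "Odia": {"tokens": 250_000, "entities": 18_447, "reported": 98.10},
}


def entity_share(tokens: int, entities: int) -> float:
    """Percentage of tokens that are named entities."""
    return 100.0 * entities / tokens


def majority_baseline(tokens: int, entities: int) -> float:
    """Token-level accuracy (%) of labelling every token as a non-entity.

    Assumes accuracy is counted per token, which the abstract does not state.
    """
    return 100.0 * (tokens - entities) / tokens


for name, corpus in CORPORA.items():
    share = entity_share(corpus["tokens"], corpus["entities"])
    baseline = majority_baseline(corpus["tokens"], corpus["entities"])
    print(
        f"{name}: {share:.2f}% named-entity tokens, "
        f"all-non-entity baseline {baseline:.2f}%, "
        f"reported accuracy {corpus['reported']:.2f}%"
    )
```

Under these assumptions, named entities make up roughly 5.3% of the Sambalpuri tokens and 7.4% of the Odia tokens, so the reported accuracies are best compared against the corresponding all-non-entity baseline rather than against zero.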

Published

2018-11-01

How to Cite

Behera, P., & Muzaffar, S. (2018). Developing classification-based named entity recognizers (NER) for Sambalpuri and Odia applying support vector machines (SVM). Nepalese Linguistics, 33(1), 1–7. https://doi.org/10.3126/nl.v33i1.41066

Section

Articles