N-gram and K-Nearest Neighbour Based Igbo Text Classification Model

Ifeanyi-reuben Nkechi J.; Odikwa Ndubuisi; Ugwu Chidiebere



Advances in information technology have gone a long way toward promoting the use of Igbo, one of the major Nigerian languages. Some online service providers now report news, publish articles and support search in the language. This advancement is likely to generate large volumes of Igbo textual data that need to be organized, managed and classified efficiently so that end users can easily access, extract and retrieve information. This work presents an enhanced model for Igbo text classification based on the N-gram and K-Nearest Neighbour techniques. Considering the peculiarities of the Igbo language, the N-gram model was adopted for text representation: the text was represented with the unigram, bigram and trigram techniques. The represented text was then classified using the K-Nearest Neighbour technique. The model is implemented in the Python programming language together with tools from the Natural Language Toolkit (NLTK). The performance of the Igbo text classification system was evaluated by calculating recall, precision and F1-measure on the N-gram represented text. The results show that classification of bigram-represented Igbo text has the highest degree of exactness (precision), that trigram has the lowest precision, and that the three N-gram techniques yield the same level of completeness (recall). The bigram text representation technique is therefore strongly recommended for text-based systems in Igbo. The model can be adopted in text analysis, text mining, information retrieval, natural language processing and any intelligent text-based system in the language.
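The pipeline described in the abstract (N-gram representation, K-Nearest Neighbour classification, then precision, recall and F1) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure, the value of K, or whether the N-grams are word-level or character-level, so word-level N-grams, cosine similarity and k=3 are assumptions made here. The sketch uses only the Python standard library rather than NLTK, and the tiny English corpus `TOY_TRAIN` is invented placeholder data, not the paper's Igbo corpus.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word-level n-grams of a text (n=1 unigram, 2 bigram, 3 trigram)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def vectorize(text, n):
    """Represent a text as a bag of n-gram counts."""
    return Counter(word_ngrams(text, n))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, text, n=2, k=3):
    """Label a text by majority vote among its k most similar training texts."""
    query = vectorize(text, n)
    scored = sorted(
        ((cosine(query, vectorize(doc, n)), label) for doc, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented placeholder corpus (hypothetical; not taken from the paper).
TOY_TRAIN = [
    ("the team won the football match", "sports"),
    ("the football match ended in a draw", "sports"),
    ("the striker scored in the football match", "sports"),
    ("the bank raised the interest rate", "business"),
    ("the interest rate affects the loan market", "business"),
    ("the bank approved the loan", "business"),
]

if __name__ == "__main__":
    # Compare the three representations named in the abstract.
    for n in (1, 2, 3):
        label = knn_predict(TOY_TRAIN, "fans watched the football match", n=n)
        print(f"{n}-gram prediction: {label}")
```

Changing `n` switches between the unigram, bigram and trigram representations, so the same classifier and evaluation function can be reused to compare them, which mirrors the comparison the abstract reports.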
