Original Article
This paper compares optimizable machine learning classification models for predicting eight types of anemia from complete blood count (CBC) data. We used a publicly available Kaggle dataset containing 1281 observations, 14 predictors, and the diagnosis as a categorical target variable with nine categories (eight types of anemia and the healthy category). First, we examined the dataset and inspected the histograms of some of the predictors, comparing predictor values for observations with no anemia against those for observations where any type of anemia was diagnosed. Next, we used MATLAB R2024a to train and test nine optimizable machine learning classification models: Ensemble, Tree, SVM, Efficient Linear, Neural Network, Kernel, KNN, Naïve Bayes, and Discriminant. The hyperparameters of all these models were tuned with Bayesian optimization. We used 90% of the observations for training and 10% for testing. During training, 10-fold cross-validation was used to prevent overfitting. The best accuracy was reached by the Ensemble classification model using the bag ensemble method (validation accuracy: 99.22%, test accuracy: 100%). Finally, we inspected this best classification model in more detail by calculating the permutation feature importance, which measures the contribution of each predictor to the final model. The results showed six to seven important predictors, the most important being the amount of hemoglobin.
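To illustrate the permutation feature importance mentioned above, the following is a minimal sketch in Python, not the MATLAB workflow used in the study. It shuffles one predictor column at a time and records the average drop in accuracy. Everything in it is an assumption made for illustration: the synthetic two-column dataset, the column names `HGB` and `noise`, the threshold of 12.0, and the rule-based `toy_model` that stands in for the trained bagged ensemble.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    correct = sum(1 for row, label in zip(X, y) if model(row) == label)
    return correct / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only; all other columns stay in place.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def make_toy_data(n=200, seed=1):
    """Synthetic data (hypothetical, not the Kaggle dataset): [HGB, noise]."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        hgb = rng.uniform(8.0, 17.0)   # illustrative range only
        noise = rng.uniform(0.0, 1.0)  # carries no information about the label
        X.append([hgb, noise])
        # Illustrative threshold, not a clinical criterion.
        y.append("anemia" if hgb < 12.0 else "healthy")
    return X, y


def toy_model(row):
    """Stand-in for a trained classifier; uses only the HGB column."""
    return "anemia" if row[0] < 12.0 else "healthy"


X, y = make_toy_data()
importances = permutation_importance(toy_model, X, y)
for name, value in zip(["HGB", "noise"], importances):
    print(f"{name}: {value:.3f}")
```

In this toy setup, shuffling the `noise` column cannot change any prediction, so its importance is zero, while shuffling `HGB` breaks the link between predictor and label and lowers accuracy noticeably. A predictor that a real model relies on heavily would show the same kind of drop.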