
A ‘Cluster-Classify’ Approach to solving Multi-Dimensional Classification Problems




Author: Aryan Agarwal

Vasant Valley School, New Delhi, India


Abstract

Today, Multi-Dimensional Classification (MDC) problems are found across many sectors, most notably in clinical settings. Classical machine learning models tend to ‘overfit’ such datasets, leading to low accuracy on previously unseen data. The authors propose a two-step Machine Learning framework to solve MDC problems and overcome ‘overfitting’. This framework first clusters the instance features in a labelled dataset and then validates the results using classical classifiers. Clustering reduces dimensional complexity while maintaining correlations between features and labels. Classification makes the model fit for predictions, opening it to real-world applications. As a case study, this framework is implemented on two datasets of different dimensions. Promising results are recorded for the wine quality classification dataset, with eleven features, and the stroke classification dataset, with ten features. Using the elbow method, the ideal number of clusters k is found to be three for both datasets. K-Means and K-Medians are used for clustering, while Logistic Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Naïve Bayes are used for classification. Five-fold cross-validation is used to reduce bias while measuring model performance, and the results are compared to direct classification without clustering. On the wine dataset, K-Means clustering improves accuracy by 3.89%, while on the stroke dataset it improves accuracy by 8.81%. The results indicate that Cluster-Classify is an appropriate candidate for classification where there exists a need to reduce dimensions and understand inter-feature relationships. This framework may thus be used to increase model accuracy by reducing overfitting in MDC problems.
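The two-step idea described in the abstract can be illustrated with a minimal sketch using scikit-learn. Several things here are assumptions rather than the author's implementation: the abstract does not say how cluster assignments are passed to the classifiers, so this sketch simply appends one-hot K-Means membership to the scaled features; the data is a synthetic stand-in with eleven features, not the wine or stroke dataset; only Logistic Regression is shown; and K-Medians is omitted because scikit-learn does not provide it. The names `ClusterLabelAppender`, `elbow_inertias`, and `compare_accuracy` are hypothetical helpers introduced for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class ClusterLabelAppender(BaseEstimator, TransformerMixin):
    """Hypothetical helper: append one-hot K-Means membership to the features."""

    def __init__(self, n_clusters=3, random_state=0):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y=None):
        # Clustering is fitted on the training fold only, so no test data leaks in.
        self.kmeans_ = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.random_state
        ).fit(X)
        return self

    def transform(self, X):
        labels = self.kmeans_.predict(X)
        one_hot = np.eye(self.n_clusters)[labels]
        return np.hstack([X, one_hot])


def elbow_inertias(X, k_values, random_state=0):
    """Return the K-Means inertia for each k; the 'elbow' is read off this curve."""
    X_scaled = StandardScaler().fit_transform(X)
    return [
        KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X_scaled).inertia_
        for k in k_values
    ]


def compare_accuracy(X, y, n_clusters=3):
    """Mean five-fold accuracy for direct classification vs. cluster-then-classify."""
    direct = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clustered = make_pipeline(
        StandardScaler(),
        ClusterLabelAppender(n_clusters=n_clusters),
        LogisticRegression(max_iter=1000),
    )
    direct_acc = cross_val_score(direct, X, y, cv=5, scoring="accuracy").mean()
    clustered_acc = cross_val_score(clustered, X, y, cv=5, scoring="accuracy").mean()
    return direct_acc, clustered_acc


# Synthetic stand-in data (assumption): 11 features, binary label.
X, y = make_classification(
    n_samples=300, n_features=11, n_informative=5, random_state=0
)

for k, inertia in zip(range(1, 7), elbow_inertias(X, range(1, 7))):
    print(f"k={k}: inertia={inertia:.1f}")

direct_acc, clustered_acc = compare_accuracy(X, y, n_clusters=3)
print(f"direct accuracy:           {direct_acc:.3f}")
print(f"cluster-classify accuracy: {clustered_acc:.3f}")
```

Placing the clustering step inside the pipeline means it is refitted within each cross-validation fold, which keeps the comparison with direct classification fair. On synthetic data the two accuracies need not differ in the direction reported in the abstract; the sketch only shows the mechanics.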




Email ID - editor.questioz@gmail.com

Terms and Conditions

Questioz cannot be held responsible for any violation of academic integrity. The intellectual property of all contributing researchers will be respected and protected. Questioz reserves the non-exclusive right to republish submitted material with attribution to the author in any other format, including all print, electronic and online media. However, all individual contributors to Questioz retain the right to submit their work for non-exclusive publication elsewhere.
