Authors:
K. Anitha, Bharath Kumar Nagaraj, P. Paramasivan, T. Shynu
Addresses:
(1) Department of Mathematics, SRM Institute of Science and Technology, Ramapuram, Chennai, Tamil Nadu, India.
(2) Department of Artificial Intelligence, Digipulse Technologies Inc., Salt Lake City, United States of America.
(3) Department of Research and Development, Dhaanish Ahmed College of Engineering, Chennai, Tamil Nadu, India.
(4) Department of Biomedical Engineering, Agni College of Technology, Chennai, Tamil Nadu, India.
anithak1@srmist.edu.in (1), bharathkumarnlp@gmail.com (2), paramasivanchem@gmail.com (3), shynu469@gmail.com (4)
Clustering, a fundamental technique in machine learning, partitions a dataset into homogeneous groups. Traditional clustering algorithms are widely adopted but struggle with the uncertainty and imprecision found in real-world data. This research introduces the Rough Set C-Means (RSCM) algorithm, which integrates rough set theory into traditional k-means clustering so that imprecise information can be managed during the clustering process. We examine the theoretical foundations, methodology, and practical applications of RSCM. Experiments on diverse datasets show that RSCM outperforms conventional clustering algorithms: it improves clustering accuracy and remains robust to uncertainty in the data. We also discuss the algorithm's adaptability to various domains and its potential real-world applications; it is particularly effective where traditional algorithms falter because the data are vague or uncertain. These findings contribute to the evolving landscape of clustering algorithms by offering a new perspective on improving performance in the presence of imprecise data.
Keywords: Clustering Algorithms; Rough Set C-Means; Rough Set Theory; Machine Learning; Uncertainty of Data Mining; Fundamental Technique; Binary Relations; Data Analysis; Real-World Datasets.
Received on: 22/04/2023, Revised on: 15/08/2023, Accepted on: 07/10/2023, Published on: 20/12/2023
FMDB Transactions on Sustainable Computing Systems, 2023 Vol. 1 No. 4, Pages: 190-203
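The abstract describes combining rough set theory with k-means but does not give the update rules. As a rough illustration only, the following Python sketch follows the commonly cited Lingras-West style of rough c-means, which may differ from the RSCM formulation in the paper. The function name `rough_c_means` and the parameters `w_lower` and `threshold` are assumptions introduced here. Each point either belongs definitely to one cluster (its lower approximation) or, when it is nearly equidistant from several centroids, to the boundary regions of all of them.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rough_c_means(points, centroids, w_lower=0.7, threshold=1.0, iterations=20):
    """Assumed Lingras-West style rough c-means; not the paper's exact method.

    Returns (centroids, lower, boundary). The upper approximation of
    cluster i is lower[i] together with boundary[i].
    """
    k = len(centroids)
    centroids = [list(c) for c in centroids]
    w_boundary = 1.0 - w_lower
    lower, boundary = [], []
    for _ in range(iterations):
        lower = [[] for _ in range(k)]
        boundary = [[] for _ in range(k)]
        for p in points:
            dists = [math.dist(p, c) for c in centroids]
            nearest = min(range(k), key=dists.__getitem__)
            # Other centroids that are almost as close as the nearest one.
            close = [j for j in range(k)
                     if j != nearest and dists[j] - dists[nearest] <= threshold]
            if close:
                # Ambiguous point: boundary region of every candidate cluster.
                for j in [nearest] + close:
                    boundary[j].append(p)
            else:
                # Unambiguous point: lower approximation of its nearest cluster.
                lower[nearest].append(p)
        new_centroids = []
        for i in range(k):
            if lower[i] and boundary[i]:
                lo, bd = mean(lower[i]), mean(boundary[i])
                new_centroids.append(
                    [w_lower * a + w_boundary * b for a, b in zip(lo, bd)])
            elif lower[i]:
                new_centroids.append(mean(lower[i]))
            elif boundary[i]:
                new_centroids.append(mean(boundary[i]))
            else:
                new_centroids.append(centroids[i])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # assignments and centroids have stabilised
        centroids = new_centroids
    return centroids, lower, boundary


# Toy data (made up for illustration): two tight groups and one ambiguous point.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
centers, lower, boundary = rough_c_means(data, [(0, 0), (10, 10)])
print("lower approximations:", lower)
print("boundary regions:", boundary)
```

In this sketch, `threshold` controls how much uncertainty is tolerated (a larger value sends more points to boundary regions), and `w_lower` sets how strongly definite members outweigh boundary members when a centroid is recomputed. Both values here are arbitrary choices for the toy data.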