A Comparative Analysis of Supervised and Unsupervised Machine Learning Techniques for Sleep Disorder Identification using Lifestyle and Health Data

Y. Rohita

Department of IT, Sreenidhi Institute of Science and Technology, Ghatkesar, Yamnampet, Hyderabad, Telangana, India.

P. N. Siva Jyothi

Department of IT, Sreenidhi Institute of Science and Technology, Ghatkesar, Yamnampet, Hyderabad, Telangana, India.

Nelli Sreevidya

Department of IT, Sreenidhi Institute of Science and Technology, Ghatkesar, Yamnampet, Hyderabad, Telangana, India.

Veladi Rajeshwari *

Department of IT, Sreenidhi Institute of Science and Technology, Ghatkesar, Yamnampet, Hyderabad, Telangana, India.

Majji Hasanth Kumar

Department of IT, Sreenidhi Institute of Science and Technology, Ghatkesar, Yamnampet, Hyderabad, Telangana, India.

Suda Naga Chaitan Reddy

Department of IT, Sreenidhi Institute of Science and Technology, Ghatkesar, Yamnampet, Hyderabad, Telangana, India.

*Author to whom correspondence should be addressed.


Abstract

Sleep disorders such as insomnia and obstructive sleep apnea have a significant impact on physical health, mental well-being, and overall quality of life. Early and accurate identification of these disorders is essential for effective intervention; however, traditional diagnostic approaches are often time-consuming and rely heavily on expert interpretation. In recent years, machine learning techniques have emerged as efficient tools for automated health condition analysis using lifestyle and physiological data.

This paper presents a comparative study of supervised and unsupervised machine learning techniques for sleep disorder identification using the Sleep Health and Lifestyle dataset, which consists of approximately 400 records with demographic, lifestyle, and health-related attributes. Supervised learning models including Logistic Regression, Random Forest, Gradient Boosting, XGBoost, and Multilayer Perceptron (MLP) are evaluated using accuracy, precision, recall, and F1-score. Hyperparameter optimization is performed using GridSearchCV to improve model generalization. In parallel, unsupervised clustering techniques such as K-Means and DBSCAN are applied to explore hidden patterns and natural groupings in sleep behaviour, with performance evaluated using silhouette scores and noise detection analysis.
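The supervised workflow described above (grid-searched hyperparameters, then accuracy, precision, recall, and F1-score on held-out data) can be sketched as follows. This is a minimal illustration, not the authors' code: the data is a synthetic three-class stand-in produced by `make_classification`, and the parameter grid, the 80/20 split, the 5-fold setting, and macro averaging are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real dataset (assumption): ~400 rows, 3 classes.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6, n_classes=3, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical search grid; the grid used in the paper is not given in the abstract.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    cv=5,                # 5-fold cross-validation on the training split
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the refitted best estimator on the held-out test split.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)
print("best params:", search.best_params_)
print(f"cv accuracy: {search.best_score_:.3f}  test accuracy: {accuracy:.3f}")
print(f"precision: {precision:.3f}  recall: {recall:.3f}  f1: {f1:.3f}")
```

Reporting the cross-validated score alongside the held-out test score makes it easier to see whether the tuned model generalizes or has overfit the folds.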

Experimental results demonstrate that Gradient Boosting achieves the highest predictive performance, with test accuracy approaching 98% and cross-validation accuracy of 99.75%. Among unsupervised methods, DBSCAN outperforms K-Means by effectively identifying well-separated clusters and detecting outliers. The comparative analysis highlights that while unsupervised techniques are valuable for pattern discovery and exploratory analysis, supervised models provide superior predictive accuracy when labelled data is available. The proposed framework supports reliable, interpretable, and data-driven decision-making for automated sleep disorder detection in healthcare applications.
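A comparison of K-Means and DBSCAN by silhouette score and noise count might look like the sketch below. It runs on synthetic blobs rather than the study's data, and the values of `n_clusters`, `eps`, and `min_samples` are illustrative assumptions. The DBSCAN silhouette is computed over non-noise points only, which is one common convention and not necessarily the one used in the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 400 points around 3 centers.
X, _ = make_blobs(n_samples=400, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

# K-Means assigns every point to one of k clusters.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
km_silhouette = silhouette_score(X, km_labels)

# DBSCAN labels low-density points as noise (label -1).
db_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
noise_mask = db_labels == -1
n_noise = int(noise_mask.sum())
n_clusters = len(set(db_labels[~noise_mask]))

# Silhouette needs at least 2 clusters and more points than clusters.
if n_clusters >= 2 and (~noise_mask).sum() > n_clusters:
    db_silhouette = silhouette_score(X[~noise_mask], db_labels[~noise_mask])
else:
    db_silhouette = None

print(f"K-Means silhouette: {km_silhouette:.3f}")
print(f"DBSCAN clusters: {n_clusters}, noise points: {n_noise}, "
      f"silhouette (non-noise): {db_silhouette}")
```

Because excluding noise points tends to raise the DBSCAN silhouette, the two scores are not strictly like-for-like; the noise count should be reported next to the score.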

The Sleep Health and Lifestyle dataset combines demographic, lifestyle, and physiological attributes, including age, body mass index (BMI), sleep duration, physical activity level, heart rate, and stress indicators. These features capture behavioural and health-related factors that influence sleep quality. Each individual is labelled with one of three sleep conditions (Normal, Insomnia, or Sleep Apnea), which supports both the training and evaluation of predictive models and the analysis of patterns in the data.
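Before modelling, the three class labels need to be mapped to integers and the numeric attributes put on a common scale. The sketch below shows one way to do this with scikit-learn. The four rows are invented for illustration and are not taken from the dataset, and the choice of `LabelEncoder` and `StandardScaler` is an assumption about preprocessing, which the abstract does not describe.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Invented example rows (assumption): age, sleep duration (h), heart rate (bpm).
features = np.array([
    [28, 7.5, 68],
    [45, 5.9, 80],
    [52, 6.1, 84],
    [33, 7.8, 70],
], dtype=float)
labels = ["Normal", "Insomnia", "Sleep Apnea", "Normal"]

# Map the three sleep-condition labels to integers (classes sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(labels)
print(list(encoder.classes_))  # ['Insomnia', 'Normal', 'Sleep Apnea']
print(y.tolist())              # [1, 0, 2, 1]

# Standardize each feature column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(features)
print(X_scaled.shape)          # (4, 3)
```

Scaling matters most for the distance-based methods (K-Means, DBSCAN) and for Logistic Regression and the MLP; tree ensembles such as Gradient Boosting are largely insensitive to it.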

The work is implemented in Python in the Jupyter Notebook environment on a standard computer (Intel Core processor, 8 GB RAM, Windows OS). The Pandas, NumPy, Scikit-learn, and Matplotlib libraries are used for data processing, model development, and visualization. Future work will focus on larger datasets and wearable sensor data to improve model robustness and applicability.

Keywords: Sleep disorder identification, machine learning, supervised learning, unsupervised learning, healthcare analytics


How to Cite

Rohita, Y., P. N. Siva Jyothi, Nelli Sreevidya, Veladi Rajeshwari, Majji Hasanth Kumar, and Suda Naga Chaitan Reddy. 2026. “A Comparative Analysis of Supervised and Unsupervised Machine Learning Techniques for Sleep Disorder Identification Using Lifestyle and Health Data”. Asian Journal of Research in Computer Science 19 (3):76-91. https://doi.org/10.9734/ajrcos/2026/v19i3837.
