Emotion-Aware Anime Recommendation System Using NLP-Based Dialogue Analysis
M. A. Navodi Dhananjana *
Department of Information and Communication Technology, Faculty of Humanities and Social Sciences, University of Sri Jayewardenepura, Sri Lanka.
Isiri Indurangala
Department of Information and Communication Technology, Faculty of Humanities and Social Sciences, University of Sri Jayewardenepura, Sri Lanka.
*Author to whom correspondence should be addressed.
Abstract
Aims/Objectives: Conventional anime recommendation systems rely on viewing history, user ratings, and genre-based filtering, which often fail to capture the complex emotional immersion inherent in the medium. Given that emotional resonance is a primary driver of audience engagement, incorporating affective data can significantly enhance recommendation accuracy.
Study Design: Quantitative experimental research with hybrid machine learning framework.
Methodology: This study proposes an emotion-aware hybrid anime recommendation system that extracts emotional patterns from dialogue using Natural Language Processing (NLP). The methodology involves preprocessing over 1.5 million anime subtitle lines sourced from a large-scale English-translated anime subtitle corpus, covering thousands of episodes across diverse genres, to eliminate noise and segment dialogue for fine-grained analysis. A transformer-based emotion detection model (j-hartmann/emotion-english-distilroberta-base) is applied to each dialogue line to generate probability distributions across seven emotion categories. These predictions are aggregated to produce episode-level emotion vectors, which are further augmented with three engineered "anchor features" — Total Intensity, Emotion Breadth, and a Positivity Index — expanding the input representation from a 6-dimensional (6D) baseline to a 9-dimensional (9D) feature space.
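The aggregation step described above can be sketched in a few lines of NumPy. Note the assumptions: the exact formulas for the three anchor features are not given in the abstract, so the definitions below (summed non-neutral mass, a thresholded emotion count, and a positive-to-total affect ratio) are plausible illustrative choices, not the paper's implementation; likewise, the sketch assumes the 6D baseline is obtained by dropping the model's `neutral` class from its seven outputs. The function name `episode_vector_9d` and the `breadth_threshold` parameter are hypothetical.

```python
import numpy as np

# Label order of j-hartmann/emotion-english-distilroberta-base (7 classes).
EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]
POSITIVE = {"joy", "surprise"}
NEGATIVE = {"anger", "disgust", "fear", "sadness"}

def episode_vector_9d(line_probs, breadth_threshold=0.1):
    """Aggregate per-line 7-way emotion distributions into a 9D episode vector.

    line_probs: (n_lines, 7) array of per-line softmax outputs from the
    emotion classifier. Returns the 6 non-neutral mean scores (assumed 6D
    baseline) plus three engineered anchor features.
    """
    probs = np.asarray(line_probs, dtype=float)
    mean = probs.mean(axis=0)                       # episode-level distribution
    base6 = np.delete(mean, EMOTIONS.index("neutral"))  # assumed 6D baseline

    total_intensity = base6.sum()                   # anchor 1 (illustrative)
    emotion_breadth = float((base6 > breadth_threshold).sum())  # anchor 2
    pos = sum(mean[EMOTIONS.index(e)] for e in POSITIVE)
    neg = sum(mean[EMOTIONS.index(e)] for e in NEGATIVE)
    positivity_index = pos / (pos + neg + 1e-9)     # anchor 3, in [0, 1]

    return np.concatenate(
        [base6, [total_intensity, emotion_breadth, positivity_index]]
    )
```

In a full pipeline, `line_probs` would come from running the transformer classifier over each preprocessed subtitle line of an episode; the 9D vectors then serve as regression inputs.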
Results: Six hybrid regression models were benchmarked against both the 6D baseline and 9D feature configurations under Gaussian noise injection (0–10%), simulating real-world variability in user emotional self-reporting. The 9D anchor feature integration produced a 33% to 65% improvement in Noise Resilience Score (NRS) across all architectures compared to the 6D baseline, with attention-based models (TabPFN, TabNet) recovering from near-complete failure (negative R² at 10% noise in the 6D system) to robust performance (R² = 0.7227–0.7792). XGBoost emerged as the optimal backbone with the highest NRS of 0.8980 and R² = 0.7999 under maximum noise, a finding independently validated by the AutoML framework across 141 search iterations. The hybrid scoring pipeline, combining the 9D XGBoost regression model with a Sentence-Transformer semantic layer (all-MiniLM-L6-v2), demonstrated approximately 80% inter-model agreement across architectures for identical user inputs.
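The noise-injection protocol can be illustrated with a minimal, self-contained sketch. The paper's exact Noise Resilience Score formula is not reproduced here; the summary value below simply averages R² across noise levels as an illustrative stand-in, and a closed-form ordinary-least-squares model stands in for the benchmarked backbones (XGBoost, TabPFN, TabNet, etc.). Function names and the noise-level grid are hypothetical.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def ols_fit(X, y):
    # Least-squares fit with an intercept column.
    Xb = np.c_[np.ones(len(X)), X]
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def ols_predict(w, X):
    return np.c_[np.ones(len(X)), X] @ w

def noise_resilience(fit, predict, X, y,
                     levels=(0.0, 0.02, 0.05, 0.10), seed=0):
    """Fit on clean training data, then score R² on test features with
    Gaussian noise injected at each level (expressed as a fraction of the
    per-feature standard deviation). Returns per-level R² and their mean
    as an illustrative resilience summary."""
    rng = np.random.default_rng(seed)
    split = int(0.8 * len(X))
    Xtr, Xte, ytr, yte = X[:split], X[split:], y[:split], y[split:]
    params = fit(Xtr, ytr)
    std = X.std(axis=0)
    scores = {}
    for lvl in levels:
        noisy = Xte + rng.normal(0.0, lvl * std, size=Xte.shape)
        scores[lvl] = r2_score(yte, predict(params, noisy))
    return scores, float(np.mean(list(scores.values())))
```

Under this protocol, a model whose R² decays slowly as the noise level rises toward 10% is considered more resilient, which is the property the anchor features are reported to improve.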
Conclusion: This approach yields suggestions that are more context-aware, psychologically grounded, and aligned with the emotional expectations of viewers, establishing a technically robust foundation for future multimodal affective recommendation systems.
Keywords: Emotion-aware recommendation systems, NLP, subtitle-based emotion analysis, transformer-based detection, episode-level emotion modeling, hybrid recommendation framework, noise resilience, anchor features