Speech Emotion Recognition

This project addresses the critical issue of mental health by providing an AI-based speech emotion recognition system that recognizes and tracks emotional distress through ongoing voice signal analysis. Using everyday devices (smartphones, call-center microphones, or inexpensive recorders), the system records speech and extracts acoustic indicators (MFCCs, pitch, spectral centroids, and Mel spectrograms) that are interpreted by machine learning and deep learning models (SVM, Random Forest, XGBoost, and CNNs) to recognize patterns indicative of depression, anxiety, and stress.

The solution enables earlier detection of concerning emotional patterns and provides personalized, non-invasive tracking, a particular benefit in regions with limited mental health services. It gives individuals insight into their own emotional state while providing clinicians and community workers with ongoing, objective information for prompt intervention. Key benefits include improved early detection, more compassionate remote care, and broader access to emotional tracking for under-served populations.

Key obstacles (noisy recordings, cultural and linguistic variation, and dataset imbalance) were addressed with sophisticated pre-processing, targeted data augmentation, multi-dataset collection, and user-focused model tuning. Low equipment requirements keep the system sustainable, and it aligns with SDG 3 (Good Health & Well-being), SDG 9 (Industry, Innovation & Infrastructure), and SDG 10 (Reduced Inequalities).
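To give a flavor of the acoustic indicators mentioned above, here is a minimal NumPy-only sketch of two simple ones: the spectral centroid (the "center of mass" of the spectrum, related to perceived brightness) and the zero-crossing rate. This is an illustrative simplification, not the project's actual pipeline; the full system's MFCC and Mel-spectrogram extraction and the SVM/Random Forest/XGBoost/CNN models are not reproduced here, and the 440 Hz test tone merely stands in for a real speech recording.

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency of the signal's spectrum, in Hz."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return float(np.sum(freqs * mag) / np.sum(mag))

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose sign differs (crude pitch/noisiness cue)."""
    return float(np.mean(np.abs(np.diff(np.sign(signal))) > 0))

# Synthetic one-second "recording": a pure 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

print(spectral_centroid(tone, sr))   # close to 440 Hz for a pure tone
print(zero_crossing_rate(tone))      # roughly 2 * 440 / 16000
```

In a real system, frame-level features like these (together with MFCCs and pitch estimates) are stacked into a feature vector per utterance and fed to the classifiers listed above.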