
School of Professional Studies
DV Emotion Net: A Sub-Study on Emotion Detection Model Performance Across P100, T4, and TPU VM V3-8
Document Type
Conference Proceeding
Abstract
This study evaluates the performance of three hardware configurations (GPU P100, GPU T4, and TPU VM v3-8) for emotion detection using the DV EmotionNet framework. Building on prior research that integrates audio and video modalities for emotion recognition, the analysis explores how each hardware setup influences model efficiency and accuracy. Audio features were extracted using energy, zero-crossing rate, and Mel-Frequency Cepstral Coefficients (MFCC), while video features were obtained through spatio-temporal Gaussian kernels and Gaussian-weighted functions applied to the second-moment matrix. The Multimodal Feature Aggregation (MFA) method was employed to fuse the audio and video features into a combined dataset. The evaluation used the Fusion of Emotion Recognition Convolutional Neural Network (FERCNN) model, focusing on the impact of accelerators on performance metrics. Recent approaches often face high computational costs, scalability issues, and sensitivity to noisy data. This study addresses these challenges by systematically evaluating the trade-offs between computational efficiency and accuracy across the different hardware accelerators. Results on the RAVDESS and CREMA-D datasets revealed notable differences in accuracy, with the P100 demonstrating superior performance on simpler tasks, while the TPU VM v3-8 excelled in more complex scenarios. These findings highlight the significance of hardware choice in optimizing multimodal emotion recognition systems and the role of adequate computational resources in applications such as human-computer interaction, healthcare, and entertainment. © 2025 IEEE.
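As a rough illustration of two of the audio features named in the abstract, the following minimal sketch computes short-time energy and zero-crossing rate per frame. It is not the authors' implementation: the function names, frame length, hop size, and synthetic test tone are all assumptions chosen for illustration, and MFCC extraction is omitted because it would typically rely on a dedicated audio library.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a 1-D sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical input: 0.1 s of a 440 Hz sine sampled at 8 kHz (assumed values).
sr = 8000
signal = [math.sin(2 * math.pi * 440 * n / sr) for n in range(800)]

# Assumed framing: 200-sample frames with a 100-sample hop.
frames = frame_signal(signal, frame_len=200, hop=100)
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]
```

Each entry of `features` is an (energy, zero-crossing rate) pair for one frame; in a multimodal pipeline such per-frame values would be combined with other audio and video descriptors before fusion.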
Publication Title
4th International Conference on Sentiment Analysis and Deep Learning, ICSADL 2025 - Proceedings
Publication Date
2-2025
First Page
826
Last Page
832
ISBN
9798331523923
DOI
10.1109/ICSADL65848.2025.10933197
Keywords
emotion recognition, emotional context understanding, fusion techniques, intelligent systems, multimodal system
Repository Citation
Dommeti, Dhiren; Nallapati, Siva Ramakrishna; and Alfaris, Rand, "DV Emotion Net: A Sub-Study on Emotion Detection Model Performance Across P100, T4, and TPU VM V3-8" (2025). School of Professional Studies. 11.
https://commons.clarku.edu/sops_fac/11
Cross Post Location
Student Publications