International Journal of Innovative Research in Computer and Communication Engineering

ISSN Approved Journal | Impact factor: 8.771 | ESTD: 2013 | Follows UGC CARE Journal Norms and Guidelines



TITLE Design and Development of an Advanced Speaker Recognition System using MFCC and Neural Network
ABSTRACT Speech is one of the most natural and efficient forms of communication among humans, and it is becoming increasingly integral to human-computer interaction in modern applications such as virtual assistants, smart devices, and automated customer service systems. Among the technologies that leverage speech, speaker recognition stands out as a biometric method that identifies or verifies individuals based on the unique characteristics of their voice. It has gained significant traction due to its non-intrusive nature, ease of integration, and applicability across domains such as security, forensics, and personalised user experiences. In this paper, we introduce a novel speaker recognition system designed to identify speakers from utterances in Marathi, a linguistically rich and widely spoken regional language of India. The system uses Mel-Frequency Cepstral Coefficients (MFCCs) to extract distinguishing vocal features that closely mimic human auditory perception. MFCCs are particularly effective at capturing the phonetic and acoustic properties of speech, making them a preferred choice in speech and speaker recognition tasks. Vector Quantisation (VQ) is applied to optimise the feature set and reduce computational complexity: it compresses the high-dimensional MFCC feature vectors into representative clusters without significantly compromising accuracy, thereby improving system efficiency. The quantised features are then processed using Hidden Markov Models (HMMs), which are well suited to modelling temporal sequences and dynamic variations in speech. HMMs provide a statistical framework that effectively captures the sequential nature of speech patterns, leading to more reliable speaker modelling and recognition.
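The MFCC-plus-VQ front end described in the abstract can be sketched in plain NumPy. The code below is an illustrative sketch, not the authors' implementation: the frame size, hop, filter count, number of cepstral coefficients, and the codebook size k=8 are assumed defaults, and a plain k-means routine stands in for whatever VQ training procedure the paper actually uses.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale (mimics auditory perception)
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    hz_points = inv_mel(np.linspace(mel(0), mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * hz_points / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fbank[i - 1, k] = (k - l) / max(c - l, 1)   # rising slope
        for k in range(c, r):
            fbank[i - 1, k] = (r - k) / max(r - c, 1)   # falling slope
    return fbank

def mfcc(signal, sr, frame_len=400, hop=160, n_fft=512, n_filters=26, n_ceps=13):
    # Frame, window, power spectrum, mel filterbank energies, then DCT-II
    frames = [signal[i:i + frame_len] * np.hamming(frame_len)
              for i in range(0, len(signal) - frame_len, hop)]
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filters)))
    return log_energy @ dct.T            # shape: (n_frames, n_ceps)

def vq_codebook(features, k=8, iters=20, seed=0):
    # Plain k-means: compress MFCC frames into k representative code vectors
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None] - codebook[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = features[labels == j].mean(axis=0)
    return codebook

# Synthetic 1-second signal in place of a real Marathi utterance
sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 220 * t) + 0.1 * np.random.default_rng(0).standard_normal(sr)
feats = mfcc(sig, sr)
cb = vq_codebook(feats, k=8)
print(feats.shape, cb.shape)
```

In a full system along the lines the abstract describes, the per-speaker codebooks (or the quantised frame sequences) would then be used to train one HMM per speaker, and recognition would pick the speaker whose model assigns the highest likelihood to a test utterance.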



AUTHOR PAWAN S. KAMBLE, PROF. DR. RAMESH R. MANZA, SUSHIL NAMDEO GAWHALE
Pawan S. Kamble, Research Student, Department of Computer Science and IT, Dr. B.A.M. University, Chh. Sambhajinagar, India
Prof. Dr. Ramesh R. Manza, Professor and Head, Department of Computer Science and IT, Dr. B.A.M. University, Chh. Sambhajinagar, India
Sushil Namdeo Gawhale, Research Student, Department of Computer Science and IT, Dr. B.A.M. University, Chh. Sambhajinagar, India
VOLUME 181
DOI 10.15680/IJIRCCE.2026.1402012
PDF pdf/12_Design and Development of an Advanced Speaker Recognition System using MFCC and Neural Network.pdf
KEYWORDS
References 1. K. K. Nawas et al., “Recurrence plot embeddings as short segment nonlinear features for multimodal speaker identification using air, bone and throat microphones,” Scientific Reports, vol. 14, no. 1, May 2024, doi: 10.1038/s41598-024-62406-3.
2. X. Liu, M. Sahidullah, and T. Kinnunen, “A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings,” in Interspeech 2020, Oct. 2020, p. 3221. doi: 10.21437/interspeech.2020-1765.
3. S. Srivastava, I. Gupta, A. Prakash, J. Kuriakose, and H. A. Murthy, “Fast and small footprint Hybrid HMM-HiFiGAN-based system for speech synthesis in Indian languages,” arXiv (Cornell University), Jan. 2023, doi: 10.48550/arxiv.2302.06227.
4. Kanervisto, V. Hautamäki, T. Kinnunen, and J. Yamagishi, “Optimising Tandem Speaker Verification and Anti-Spoofing Systems,” IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 30, p. 477, Dec. 2021, doi: 10.1109/taslp.2021.3138681.
5. S. Dey, M. Sahidullah, and G. Saha, “An Overview of Indian Spoken Language Recognition from a Machine Learning Perspective,” ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 21, no. 6, p. 1, Mar. 2022, doi: 10.1145/3523179.
6. R. Hasan and S. M. M. Rahman, “Speaker Identification Using Mel Frequency Cepstral Coefficients,” Jan. 2004, Accessed: Aug. 2025. [Online]. Available: https://www.buet.ac.bd/icece/pub2004/P141.pdf
7. Garg, K. Agrawal, and Mrs P. Akilandeshwari, “Vocal Data Assessment to Envision Distinctive Features of an Individual,” International Journal of Innovative Technology and Exploring Engineering, vol. 9, no. 6, p. 1335, Apr. 2020, doi: 10.35940/ijitee.f3771.049620.
8. S. V. Sathe, A. G. Dhepe, and S. S. Nimbhore, “A Comparative Study of Sentiment Analysis Techniques for Hindi Text: Machine Learning vs Deep Learning.”
9. N. Bassan and V. Kadyan, “An Experimental Study of Continuous Automatic Speech Recognition System Using MFCC with Reference to Punjabi Language,” in Advances in Intelligent Systems and Computing, Springer Nature, 2018, p. 267. doi: 10.1007/978-981-10-8639-7_28.
10. M. K. Singh, A. Singh, and N. Singh, “Acoustic comparison of electronics disguised voice using Different semitones,” International Journal of Engineering & Technology, vol. 7, p. 98, Apr. 2018, doi: 10.14419/ijet.v7i2.16.11502.
11. Shahin and A. B. Nassif, “Emirati-Accented Speaker Identification in Stressful Talking Conditions,” in 2017 International Conference on Electrical and Computing Technologies and Applications (ICECTA), Nov. 2019, p. 1. doi: 10.1109/icecta48151.2019.8959731.
12. W. Huang, P. J. Martin, and H. Zhuang, “Machine-learning phase prediction of high-entropy alloys,” Acta Materialia, vol. 169, p. 225, Mar. 2019, doi: 10.1016/j.actamat.2019.03.012.
13. “International Journal of Recent Technology and Engineering (IJRTE),” International Journal of Recent Technology and Engineering (IJRTE), Aug. 2019, doi: 10.35940/ijrte.2277-3878.
Copyright © IJIRCCE 2020. All rights reserved.