International Journal of Innovative Research in Computer and Communication Engineering
| TITLE | Breaking the Language Barrier: A Comprehensive Study of Transformer Models in Multilingual Sentiment Analysis |
|---|---|
| ABSTRACT | The exponential growth of user-generated content across digital platforms in numerous languages presents a monumental challenge and opportunity for sentiment analysis. Traditional sentiment analysis models, often monolingual and reliant on lexical resources, struggle with the nuances of code-switching, cultural context, and data scarcity for low-resource languages. The advent of transformer-based models, pre-trained on massive multilingual corpora, has heralded a paradigm shift, promising zero-shot and few-shot cross-lingual capabilities. This research paper provides a comprehensive investigation into the application of transformer models for Multilingual Sentiment Analysis (MSA). We begin with a detailed literature survey tracing the evolution from lexicon-based and classical machine learning methods to the current state-of-the-art transformers. The core of this study is an empirical evaluation comparing the performance of several prominent multilingual transformer models (mBERT, XLM-RoBERTa, and DistilBERT) against a traditional baseline, a Multinomial Naïve Bayes classifier with TF-IDF features. Using a carefully curated dataset comprising product reviews in English, Spanish, French, and German, we trained and evaluated models in both zero-shot and fine-tuned settings. Our methodology details the experimental setup, data preprocessing pipeline, and model training procedures. The results demonstrate that fine-tuned transformer models, particularly XLM-RoBERTa, significantly outperform traditional methods, achieving an average F1-score of 0.92 across languages. Furthermore, we analyze the models' zero-shot capabilities, where a model trained on English data is directly applied to other languages, revealing substantial performance gaps that highlight the importance of language-specific fine-tuning. We also investigate computational efficiency, providing a trade-off analysis between model performance and resource requirements. The findings underscore that while transformers are powerful, their effective deployment requires careful consideration of the target language's resource availability and the specific task requirements. This study concludes that transformer models are the cornerstone of modern MSA, but challenges in computational cost, bias, and true cross-lingual generalization remain fertile ground for future research. An illustrative sketch of the evaluation pipeline described here follows the record below. |
| AUTHOR | Dr. Deshmukh Sushant Diwakarrao, Assistant Professor, Department of Computer Applications, M.S. Bidve Engineering College, Barshi Road, Latur, India |
| VOLUME | 177 |
| DOI | 10.15680/IJIRCCE.2025.1312090 |
| PDF | pdf/90_Breaking the Language Barrier A Comprehensive Study of Transformer Models in Multilingual Sentiment Analysis.pdf |
| KEYWORDS | |
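The paper's own implementation is not included in this record, but the pipeline the abstract describes, a TF-IDF plus Multinomial Naïve Bayes baseline compared against fine-tuned multilingual transformers such as XLM-RoBERTa with zero-shot transfer from English, can be sketched roughly as below. This is a minimal illustration using scikit-learn and the Hugging Face `transformers` Trainer; all function names, dataset column names, and hyperparameters are assumptions for illustration, not the authors' released code.

```python
# Illustrative sketch only: function names, columns, and hyperparameters are assumed,
# not taken from the paper's implementation.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def train_nb_baseline(train_texts, train_labels):
    """Traditional baseline: TF-IDF features feeding a Multinomial Naive Bayes classifier."""
    pipeline = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),
        MultinomialNB(),
    )
    pipeline.fit(train_texts, train_labels)
    return pipeline


def macro_f1(model, texts, labels):
    """Macro-averaged F1, computed per language to expose cross-lingual gaps."""
    return f1_score(labels, model.predict(texts), average="macro")


# Multilingual transformer side, using the Hugging Face `transformers` Trainer.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "xlm-roberta-base"  # mBERT or multilingual DistilBERT can be swapped in


def fine_tune_transformer(train_ds, eval_ds, num_labels=2):
    """Fine-tune a multilingual transformer on sentiment-labelled reviews.

    `train_ds` / `eval_ds` are Hugging Face `datasets.Dataset` objects with
    'text' and 'label' columns (illustrative column names).
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=num_labels
    )

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=128)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="msa-finetuned",
            num_train_epochs=3,
            per_device_train_batch_size=16,
            learning_rate=2e-5,
        ),
        train_dataset=train_ds.map(tokenize, batched=True),
        eval_dataset=eval_ds.map(tokenize, batched=True),
        tokenizer=tokenizer,  # enables dynamic padding via the default collator
    )
    trainer.train()
    return trainer
```

Under these assumptions, the zero-shot setting corresponds to fine-tuning on English data only and evaluating the same trainer on the Spanish, French, and German test splits without further training (for example, `trainer.evaluate(eval_dataset=tokenized_spanish_test)`), while the fine-tuned setting repeats `fine_tune_transformer` with language-specific training data.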