Benchmarking IndoBERT and Transformer Models for Sentiment Classification on Indonesian E-Government Service Reviews

Authors

  • Dhendra, Universitas Muhammadiyah Semarang
  • Victor Gayuh Utomo, Universitas Semarang

DOI:

https://doi.org/10.26623/transformatika.v23i1.12095

Abstract

The rapid adoption of e-government services in Indonesia has made it increasingly important to understand public sentiment toward digital platforms. This study compares five models (IndoBERT, mBERT, XLM-R, CNN, and BiLSTM) for sentiment classification of user reviews of NEWSAKPOLE, a public service application for vehicle tax and licensing. A custom dataset of more than 11,000 reviews was scraped from the Google Play Store and labeled with a hybrid approach that combines rating-based labels with manual validation. Each model was evaluated on accuracy, precision, recall, and F1-score. IndoBERT achieved the highest performance, with an F1-score of 0.882, outperforming both the multilingual transformers (mBERT, XLM-R) and the non-transformer deep learning models (CNN, BiLSTM). Confusion matrix analysis showed that the transformer-based models detected neutral and mixed sentiments more effectively, whereas CNN and BiLSTM misclassified these reviews more often. The results highlight IndoBERT's robustness in low-resource sentiment analysis and its potential to support public service monitoring and policy feedback mechanisms in Indonesian digital governance.
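As an illustration of the evaluation described in the abstract, the following is a minimal sketch of how rating-based labels and the four reported metrics could be computed. It is not the authors' code. The star-rating thresholds in `label_from_rating`, the three-class label set, and the use of macro averaging are all assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter

# Assumed label set; the abstract does not list the exact classes.
LABELS = ("negative", "neutral", "positive")


def label_from_rating(stars: int) -> str:
    """Hypothetical rating-based labeling rule (thresholds are assumed)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count each (true label, predicted label) pair: a sparse confusion matrix.
    pairs = Counter(zip(y_true, y_pred))

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    k = len(labels)
    return {
        "accuracy": sum(pairs[(c, c)] for c in labels) / len(y_true),
        "macro_precision": sum(precisions) / k,
        "macro_recall": sum(recalls) / k,
        "macro_f1": sum(f1s) / k,
    }


if __name__ == "__main__":
    # Toy data only; these are not reviews or results from the study.
    y_true = [label_from_rating(s) for s in [1, 2, 3, 4, 5, 5]]
    y_pred = ["negative", "neutral", "neutral", "positive", "positive", "negative"]
    print(evaluate(y_true, y_pred))
```

Macro averaging weights every class equally, which makes weak performance on a minority class such as neutral visible in the summary score. If the study used weighted or micro averaging instead, the aggregation step would differ.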

Published

2025-07-16

Section

Article

How to Cite

Dhendra, & Gayuh Utomo, V. (2025). Benchmarking IndoBERT and Transformer Models for Sentiment Classification on Indonesian E-Government Service Reviews. Jurnal Transformatika, 23(1), 86-95. https://doi.org/10.26623/transformatika.v23i1.12095