From Text to Action: AI-Driven Classification of Public Service Complaints in Karanganyar, Indonesia
Keywords:
Public Complaint Classification, Logistic Regression, Text Mining, TF-IDF, E-Government
Abstract
Efficient classification of public complaints is crucial for transparent and responsive governance in the digital age. However, the volume and free-text nature of complaint data make manual categorization difficult, particularly within local government systems. This study develops an automatic classification model for public complaints using Logistic Regression with TF-IDF vectorization. The dataset comprises complaints submitted to the Karanganyar Regency Government from January to June 2025; the texts were preprocessed with standard natural language processing techniques and converted into numerical features using TF-IDF. Logistic Regression was chosen for its simplicity, interpretability, and effectiveness on sparse text data. Class weighting and stratified sampling were used to address class imbalance. The model achieved an overall accuracy of 61%, surpassing the Naive Bayes baseline. Confusion matrix analysis showed strong performance on the dominant categories, while minority classes remained difficult to classify. The results suggest that Logistic Regression offers a practical and explainable solution for early-stage complaint classification systems, especially in public sector contexts. This study lays a foundation for the future development of intelligent e-government platforms capable of real-time complaint handling.
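The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name, and uses a small hypothetical set of English complaint texts and category labels in place of the Karanganyar dataset; the parameter values are illustrative and are not the study's settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy complaints and categories (not the study's data).
texts = [
    "the road near the market is full of potholes",
    "street lights on the main road are broken",
    "the bridge road is damaged after the flood",
    "asphalt on the village road is cracked",
    "tap water has been cut off for three days",
    "the water from the pipe is dirty and smells",
    "water pressure is very low in our neighborhood",
    "the water pipe is leaking in front of my house",
    "the clinic queue is too long and slow",
    "no doctor was available at the health center",
    "the health center ran out of medicine",
    "the clinic staff refused to serve patients",
]
labels = ["roads"] * 4 + ["water"] * 4 + ["health"] * 4

# Stratified split: each category keeps the same share in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# TF-IDF features feed a Logistic Regression whose class weights are
# inversely proportional to class frequency (class_weight="balanced").
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
class_names = sorted(set(labels))
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=class_names)

print("accuracy:", accuracy)
print("classes:", class_names)
print(cm)  # rows are true classes, columns are predicted classes
```

On real complaint data, the rows of the confusion matrix show which minority categories are being absorbed by the dominant ones, which is the kind of per-class analysis the abstract reports.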
License
Copyright (c) 2025 Journal of Computing and Smart Ecosystems

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.