Analysis of Clean Water Consumption Segmentation And Classification Using K-Means Clustering And Random Forest Algorithms

Authors

  • Ika Melinia Sapitri Fitriyanti, Universitas Putra Indonesia YPTK Padang
  • Sarjon Defit, Universitas Putra Indonesia YPTK Padang
  • Rini Sovia, Universitas Putra Indonesia YPTK Padang

Keywords:

Customer Segmentation, Water Consumption Patterns, Principal Component Analysis (PCA), K-Means Clustering, Random Forest

Abstract

The administrative grouping of customers at PERUMDA Air Minum Kota Padang does not yet accurately represent their actual water consumption patterns. This makes it difficult for the company to formulate service policies, manage customers, and make sound data-driven decisions. This study analyzes and maps customer water consumption patterns to produce a more representative customer segmentation as a basis for decision making. The method is a data mining approach that applies Principal Component Analysis (PCA) for dimensionality reduction, K-Means Clustering for customer segmentation, and Random Forest for customer classification, using primary data from the company's Customer Meter Reading Report, with 371 initial records. The clustering process formed three customer segments: premium customers with high consumption bills, regular customers with moderate and stable consumption, and new customers with low consumption. The Random Forest model reached an accuracy of 68.85% on the training data and 67.69% on the testing data, with an average precision above 0.84 and an average F1-score of about 0.68. The small gap between training and testing accuracy (about 1.2 percentage points) indicates that the model generalizes fairly well and does not overfit.
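
The three-stage pipeline named in the abstract (PCA, then K-Means, then Random Forest) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the three feature columns (usage, bill, tenure) are hypothetical, and the abstract does not say which labels the classifier was trained on, so the sketch assumes the K-Means segment labels serve as the classification target. It uses scikit-learn, and its scores will not match those reported in the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the meter-reading records (NOT the study's data).
# Hypothetical features: monthly usage, bill amount, months as a customer.
rng = np.random.default_rng(42)
n_customers = 371  # same size as the initial record count in the abstract
usage = rng.gamma(shape=2.0, scale=10.0, size=n_customers)
bill = usage * 5.0 + rng.normal(0.0, 5.0, size=n_customers)
tenure = rng.integers(1, 120, size=n_customers).astype(float)
X = np.column_stack([usage, bill, tenure])

# 1. Standardize the features, then reduce them to two principal components.
X_scaled = StandardScaler().fit_transform(X)
X_pca = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

# 2. Segment customers into three clusters with K-Means.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
segments = kmeans.fit_predict(X_pca)

# 3. Train a Random Forest to predict the segment from the original features
#    (assumption: cluster labels are the classification target).
X_train, X_test, y_train, y_test = train_test_split(
    X, segments, test_size=0.2, random_state=0
)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)


def evaluate(features, labels):
    """Return accuracy, macro precision, and macro F1 for one data split."""
    predicted = forest.predict(features)
    return {
        "accuracy": accuracy_score(labels, predicted),
        "precision": precision_score(
            labels, predicted, average="macro", zero_division=0
        ),
        "f1": f1_score(labels, predicted, average="macro", zero_division=0),
    }


# 4. Compare training and testing scores; a small gap suggests no overfitting.
train_scores = evaluate(X_train, y_train)
test_scores = evaluate(X_test, y_test)
print("train:", train_scores)
print("test: ", test_scores)
print("accuracy gap:", train_scores["accuracy"] - test_scores["accuracy"])
```

Comparing the two score dictionaries mirrors the check described in the abstract, where similar training and testing accuracy is read as evidence of generalization.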

Published

2026-03-30

How to Cite

Fitriyanti, I. M. S., Defit, S., & Sovia, R. (2026). Analysis of Clean Water Consumption Segmentation And Classification Using K-Means Clustering And Random Forest Algorithms. Jurnal KomtekInfo, 13(1), 11–17. Retrieved from https://jkomtekinfo.org/ojs/index.php/komtekinfo/article/view/674

Section

Articles