Efficient Determination of Water Quality using Incremental Support Vector Machine Learning
Neil Rayala, Edson Alves
Support Vector Machine, Incremental Support Vector Machine, Multilayer Perceptron Neural Network, Ultraviolet-Visible Spectrophotometry, Overfitting, Water Quality Index
The water quality index (WQI) is a major metric for determining overall water quality. Unfortunately, measuring WQI typically requires expensive and time-consuming methods. WQI can instead be determined using only ultraviolet-visible (UV-Vis) spectrophotometry, an affordable and relatively common technique. However, this determination on its own is neither accurate nor efficient enough for high-throughput analysis of large water sample collections. Here, we describe a powerful combination of incremental support vector machine (I-SVM) learning and UV-Vis spectrophotometry for WQI classification, along with Monte Carlo methods for inexpensive verification. The combined approach greatly improves both testing accuracy and training efficiency. By modeling the UV-Vis spectral analysis data, Monte Carlo methods precisely simulated a total of 115,000 water samples to test for model overfitting and for long-run accuracy and efficiency. In the I-SVM, the classifier quickly updates its hyperparameters based on new batches of training samples without substantially increasing the training time. Consequently, the I-SVM achieved a lower average training time (1.17152 seconds) than the standard support vector machine (27,734.16493 seconds) and a higher classification accuracy (91.61043%) than the multilayer perceptron neural network (83.05739%). The sustained high accuracy and efficiency of the I-SVM allow water organizations such as the U.S. Geological Survey to compute the WQI at many water sites simultaneously.
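The batch-wise update described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the I-SVM update rule, so this example substitutes a Pegasos-style stochastic subgradient step on the hinge loss for a linear SVM, which is one common way to train an SVM incrementally and is not necessarily the authors' method. The simulated samples are hypothetical two-class Gaussian features rather than UV-Vis spectra, and the names `simulate_batch`, `IncrementalLinearSVM`, and `partial_fit` are illustrative assumptions.

```python
import random


def simulate_batch(rng, n, n_features=5):
    """Draw n labeled samples from two Gaussian classes (hypothetical stand-in data)."""
    batch = []
    for _ in range(n):
        label = rng.choice((-1, 1))
        # Each feature is centered at +1 or -1 depending on the class, with unit noise.
        features = [rng.gauss(float(label), 1.0) for _ in range(n_features)]
        batch.append((features, label))
    return batch


class IncrementalLinearSVM:
    """Linear SVM updated one batch at a time (Pegasos-style; an assumed stand-in).

    No bias term is used because the simulated classes are symmetric about the origin.
    """

    def __init__(self, n_features, lam=0.01):
        self.w = [0.0] * n_features
        self.lam = lam  # L2 regularization strength
        self.t = 0      # total number of samples seen so far

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def partial_fit(self, batch):
        """Update the weights from the new batch only; earlier batches are not revisited."""
        for x, y in batch:
            self.t += 1
            eta = 1.0 / (self.lam * self.t)  # decaying step size
            violates_margin = y * self.decision(x) < 1.0
            shrink = 1.0 - eta * self.lam
            self.w = [shrink * wi for wi in self.w]
            if violates_margin:
                # Hinge-loss subgradient step for a sample inside the margin.
                self.w = [wi + eta * y * xi for wi, xi in zip(self.w, x)]

    def accuracy(self, samples):
        correct = sum(
            1 for x, y in samples if (1 if self.decision(x) >= 0 else -1) == y
        )
        return correct / len(samples)


rng = random.Random(0)
model = IncrementalLinearSVM(n_features=5)
for _ in range(20):
    # Each new batch costs time proportional to its own size, not to all data seen.
    model.partial_fit(simulate_batch(rng, 100))

held_out = simulate_batch(rng, 1000)
acc = model.accuracy(held_out)
print(f"samples seen: {model.t}, held-out accuracy: {acc:.3f}")
```

In this sketch the cost of absorbing a batch depends only on that batch, whereas a conventional SVM would be refit on the full accumulated training set each time new samples arrive. Evaluating on freshly simulated held-out samples mirrors, in a much simplified form, the use of Monte Carlo simulation to check for overfitting.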