Geography

A new framework to deal with the class imbalance problem in urban gain modeling based on clustering and ensemble models

Document Type

Article

Abstract

The classes in the data used for urban gain modeling are often imbalanced, which reduces the accuracy of standard data mining and machine learning models. This study presents a new framework, built on clustering-based modeling and ensemble models, to deal with the class imbalance problem in urban gain modeling. The random forest (RF), artificial neural network (ANN), and support vector machine (SVM) models served as the base models for generating and evaluating results within this framework. Changes in the urban land-use pattern of Isfahan, Iran, over two time intervals (1994-2004 and 2004-2014) were modeled. The findings showed that, for all three models, the proposed sampling strategy yields higher Hits and Correct Rejections rates than the strategies applied in previous studies. In the second part of the proposed framework (ensemble models), there was no substantial difference in the confusion matrix entries.
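As a rough illustration of why class imbalance matters for the Hits and Correct Rejections rates mentioned above, the following minimal Python sketch counts the four confusion-matrix entries for a binary change map. It is not the authors' implementation. The function names, the toy data, and the rate definitions (Hits divided by all observed change, Correct Rejections divided by all observed persistence) are assumptions made for illustration only.

```python
def confusion_counts(observed, predicted):
    """Count confusion-matrix entries for binary change maps.

    Assumed encoding: 1 = cell changed to urban, 0 = cell persisted.
    """
    hits = misses = false_alarms = correct_rejections = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            hits += 1                  # change observed and predicted
        elif obs == 1 and pred == 0:
            misses += 1                # change observed, not predicted
        elif obs == 0 and pred == 1:
            false_alarms += 1          # change predicted, not observed
        else:
            correct_rejections += 1    # persistence observed and predicted
    return hits, misses, false_alarms, correct_rejections


def rates(hits, misses, false_alarms, correct_rejections):
    """Return (hit rate, correct-rejection rate, overall accuracy)."""
    changed = hits + misses
    persisted = false_alarms + correct_rejections
    total = changed + persisted
    hit_rate = hits / changed if changed else 0.0
    cr_rate = correct_rejections / persisted if persisted else 0.0
    accuracy = (hits + correct_rejections) / total if total else 0.0
    return hit_rate, cr_rate, accuracy


# Hypothetical imbalanced sample: 5 changed cells, 95 persistent cells.
observed = [1] * 5 + [0] * 95
# A trivial model that predicts "no change" everywhere.
predicted = [0] * 100

counts = confusion_counts(observed, predicted)
hit_rate, cr_rate, accuracy = rates(*counts)
print(counts, hit_rate, cr_rate, accuracy)
```

In this toy case the overall accuracy looks high even though no changed cell is detected, which is why Hits and Correct Rejections are worth reporting separately when the classes are imbalanced.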

Publication Title

Geocarto International

Publication Date

1-1-2022

Volume

37

Issue

19

First Page

5669

Last Page

5692

ISSN

1010-6049

DOI

10.1080/10106049.2021.1923826

Keywords

artificial neural network, imbalance datasets, land-use change modeling, random forest, support vector machine, under-sampling
