Computer Science

L2-norm transformation for improving k-means clustering: Finding a suitable model by range transformation for novel data analysis

Document Type

Article

Abstract

In the age of increasingly pervasive sensing applications, measurement of unknown pattern phenomena resulting in novel data presents a challenge to the selection of appropriate modeling tools. Because there is no rich history of domain knowledge, one can easily make early commitments to poor modeling choices. Data transformation, a solution that attempts to modify the data’s geometry, can make important regularities clearer. However, the wrong transformation can damage the very pattern information one seeks to identify. In contrast to data transformation, we contribute an alternative method, range transformation, which focuses on altering the measurement tool. As a function, a model maps data inputs to a range. Focusing on transformations of the model’s range, we can find a generally applicable way to alter the model’s properties to best suit the data. Every modification to a function class, something we call editing the function, results in a change to the original function’s range. This work contributes a method for modifying a broad class of models to suit novel data through range transformation. We investigate range transformation for a class of information theoretic transformations and evaluate its impact on classification and clustering. We also develop an optimization-based framework employing range transformation based on desired geometric properties and use it to improve a widely used model, k-means clustering.

Publication Title

International Journal of Data Science and Analytics

Publication Date

2017

Volume

3

Issue

4

First Page

247

Last Page

266

ISSN

2364-415X

DOI

10.1007/s41060-017-0054-1

Keywords

Chisini–Jensen–Shannon divergence, CJSD kernel, k-Means clustering, L2-norm, LIBS, range transformation

APA Citation

Sharma, P. K., & Holness, G. (2017). L2-norm transformation for improving k-means clustering: Finding a suitable model by range transformation for novel data analysis. International Journal of Data Science and Analytics, 3, 247-266.
