Computer Science
L2-norm transformation for improving k-means clustering: Finding a suitable model by range transformation for novel data analysis
Document Type
Article
Abstract
In the age of increasingly pervasive sensing applications, measurement of unknown pattern phenomena resulting in novel data presents a challenge to the selection of appropriate modeling tools. Because there is no rich history of domain knowledge, one can easily make early commitments to poor modeling choices. Data transformation, which attempts to modify the data’s geometry, can make important regularities clearer. However, the wrong transformation can damage the very pattern information one seeks to identify. In contrast to data transformation, we contribute an alternative method, range transformation, which focuses on altering the measurement tool. As a function, a model maps data inputs to a range. Focusing on transformations of the model’s range, we can find a generally applicable way to alter the model’s properties to best suit the data. Every modification to a function class, something we call editing the function, results in a change to the original function’s range. This work contributes a method for modifying a broad class of models to suit novel data through range transformation. We investigate range transformation for a class of information-theoretic transformations and evaluate the impact on classification and clustering. We also develop an optimization-based framework employing range transformation based on desired geometric properties and use it to improve a widely used model, k-means clustering.
Publication Title
International Journal of Data Science and Analytics
Publication Date
2017
Volume
3
Issue
4
First Page
247
Last Page
266
ISSN
2364-415X
DOI
10.1007/s41060-017-0054-1
Keywords
Chisini–Jensen–Shannon divergence, CJSD kernel, k-means clustering, L2-norm, LIBS, range transformation
Repository Citation
Sharma, Piyush Kumar and Holness, Gary, "L2-norm transformation for improving k-means clustering: Finding a suitable model by range transformation for novel data analysis" (2017). Computer Science. 204.
https://commons.clarku.edu/faculty_computer_sciences/204
APA Citation
Sharma, P. K., & Holness, G. (2017). L2-norm transformation for improving k-means clustering: Finding a suitable model by range transformation for novel data analysis. International Journal of Data Science and Analytics, 3, 247-266.