Incorporating visualisation quality measures to curvilinear component analysis

Jigang Sun, Malcolm Crowe, Colin Fyfe

Research output: Contribution to journal › Article

Abstract

Curvilinear Component Analysis (CCA) is a useful data visualisation method. CCA has the technical property that its optimisation surface, as defined by its stress function, changes during the optimisation according to a decreasing parameter. CCA uses a variant of the stochastic gradient descent method to create a mapping of data. In the optimisation method of CCA, the stress function is only a general guide towards an acceptable mapping. In other multidimensional scaling methods such as Sammon's mapping, the best mapping among multiple runs from different initialisations can be chosen by selecting the mapping with the lowest stress, whereas in CCA the embedding is simply the result of a single run, even though multiple starts are equally possible. Because there is no objective function that can serve as a selection criterion, embeddings produced by CCA can be poorly optimised. In this paper we present a new way of improving the optimisation of CCA by integrating non-stress data visualisation quality measures into the existing algorithm. We first use data visualisation quality measures to select the best mapping from multiple runs of a standard stochastic gradient descent implementation; we then tune the parameters involved to achieve further improvement. A brief comparison with other dimensionality reduction methods is included.
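The selection step described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the standard CCA stochastic update with a step-function neighbourhood, decay schedules chosen only for illustration, and a k-nearest-neighbour preservation score as a stand-in quality measure (the abstract does not name the measures used). All function names and parameter defaults are hypothetical.

```python
import numpy as np


def pairwise_distances(Z):
    """Euclidean distance matrix between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def cca_embed(X, dim=2, epochs=30, alpha0=0.5, seed=0):
    """One CCA run with stochastic updates (illustrative schedules)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    DX = pairwise_distances(X)
    Y = rng.normal(size=(n, dim))          # random initial mapping
    lam0 = DX.max()                        # initial neighbourhood radius
    total = epochs * n
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total
            alpha = alpha0 * (1.0 - frac)  # assumed: linear learning-rate decay
            lam = lam0 * 0.01 ** frac      # assumed: exponential radius decay
            d = Y - Y[i]                   # y_j - y_i for every j
            dy = np.maximum(np.linalg.norm(d, axis=1), 1e-12)
            w = (dy <= lam).astype(float)  # step-function weight on output distance
            w[i] = 0.0                     # the pivot point itself does not move
            Y += (alpha * w * (DX[i] - dy) / dy)[:, None] * d
            step += 1
    return Y


def knn_preservation(X, Y, k=5):
    """Fraction of each point's k nearest neighbours kept by the mapping."""
    nx = np.argsort(pairwise_distances(X), axis=1)[:, 1:k + 1]
    ny = np.argsort(pairwise_distances(Y), axis=1)[:, 1:k + 1]
    overlap = [len(set(a) & set(b)) for a, b in zip(nx, ny)]
    return float(np.mean(overlap)) / k


def best_of_runs(X, n_runs=5, k=5, **cca_kwargs):
    """Run CCA from several seeds and keep the mapping with the best score."""
    runs = []
    for seed in range(n_runs):
        Y = cca_embed(X, seed=seed, **cca_kwargs)
        runs.append((knn_preservation(X, Y, k=k), Y))
    scores = [score for score, _ in runs]
    best = int(np.argmax(scores))
    return runs[best][1], scores[best], scores
```

Calling `best_of_runs(X)` on an `(n, d)` array returns the selected mapping, its score, and the scores of all runs. Because the quality measure is computed independently of the stress function, it stays comparable across runs even though the stress surface changes as the neighbourhood radius shrinks.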
Original language: English
Pages (from-to): 75-101
Journal: Information Sciences
Volume: 223
DOI: 10.1016/j.ins.2012.09.047
Publication status: Published - 20 Feb 2013

Keywords

  • Curvilinear Component Analysis (CCA)
  • Stochastic gradient descent
  • Parameter learning
  • Sammon's mapping
  • Dimensionality reduction

Cite this

Sun, Jigang; Crowe, Malcolm; Fyfe, Colin. Incorporating visualisation quality measures to curvilinear component analysis. In: Information Sciences. 2013; Vol. 223. pp. 75-101.
@article{7621a8fcef144a6091dfd6e5657e4704,
title = "Incorporating visualisation quality measures to curvilinear component analysis",
keywords = "Curvilinear Component Analysis (CCA), Stochastic gradient descent, Parameter learning, Sammon's mapping, Dimensionality reduction",
author = "Jigang Sun and Malcolm Crowe and Colin Fyfe",
year = "2013",
month = "2",
day = "20",
doi = "10.1016/j.ins.2012.09.047",
language = "English",
volume = "223",
pages = "75--101",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier B.V."
}

