A new dimensionality reduction technique based on the Wavelet Transform for cancer classification

Lisardo Fernández, Mariano Pérez, Juan M. Orduña*, José M. Alcaraz

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Problem
DNA methylation and hydroxymethylation have become important epigenetic markers for the early detection of cancer. In recent years, both the number of studies on this topic and the number and size of labeled databases for different types of cancer have grown significantly. Although methylation microarrays such as the HumanMethylation450 platform have greatly reduced the dimensionality of the problem, from billions of positions to 450K, the data are still too large to be processed by machine learning algorithms for cancer prediction and classification.

Aim
In the particular case of methylation, an efficient dimensionality reduction technique should also preserve the spatial information of the original data in order to properly predict and classify cancer.

Method
This work proposes a new dimensionality reduction technique based on the Discrete Wavelet Transform (DWT), which preserves spatial information. We evaluated the proposed technique on a dataset collected from the cancer databases that are most important in terms of social impact, and compared it with five well-known dimensionality reduction techniques: PCA, ReliefF, Isomap, LLE and UMAP.
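
As an illustration of how a DWT can shrink a methylation profile while keeping neighbouring positions together, the following is a minimal sketch and not the authors' implementation. The abstract does not state the wavelet family, the decomposition level, or how boundaries are handled, so the Haar wavelet, the `levels` parameter, the edge padding, and the function name `haar_approximation` are all assumptions made for this example.

```python
import numpy as np


def haar_approximation(x, levels):
    """Return the Haar DWT approximation coefficients after `levels` steps.

    Assumption: Haar wavelet and edge padding are illustrative choices;
    the abstract does not specify them.
    """
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if a.shape[-1] % 2:
            # Odd length: repeat the last value so positions pair up.
            a = np.concatenate([a, a[..., -1:]], axis=-1)
        # Orthonormal Haar low-pass step: scaled sum of adjacent pairs.
        # Each output coefficient summarises two neighbouring inputs,
        # so the ordering along the genome is preserved.
        a = (a[..., 0::2] + a[..., 1::2]) / np.sqrt(2.0)
    return a


# Hypothetical data: 3 samples x 4096 ordered positions with values in [0, 1].
rng = np.random.default_rng(0)
profiles = rng.random((3, 4096))
reduced = haar_approximation(profiles, levels=5)
print(reduced.shape)  # (3, 128): each level halves the number of features
```

Each decomposition level halves the number of features, and every retained coefficient corresponds to a contiguous window of the original positions. This is the sense in which a wavelet-based reduction keeps spatial information, in contrast to projections such as PCA that mix all positions together.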

Results
The performance evaluation shows that the proposed technique significantly reduces both the computational resources and the execution time required for dimensionality reduction. In addition, when the reduced dataset produced by each technique is used as input to a support vector machine, the proposed technique significantly improves the classification accuracy.

Conclusions
The proposed DWT-based approach can be considered an efficient alternative for cases where dimensionality reduction must preserve spatial information.

Original language: English
Article number: 9
Number of pages: 23
Journal: Journal of Big Data
Volume: 12
Publication status: Published - 21 Jan 2025

Keywords

  • dimensionality reduction
  • cancer classification
  • DNA methylation analysis
  • Wavelet Transform
  • machine learning classification
