Enhanced approach for latent semantic indexing using wavelet transform

T. Jaber, A. Amira, P. Milligan

Research output: Contribution to journal › Article

Abstract

Latent semantic indexing (LSI) is a technique used for intelligent information retrieval (IR). It can be used as an alternative to traditional keyword matching IR and is attractive in this respect because of its ability to overcome problems with synonymy and polysemy. This study investigates various aspects of LSI: the effect of the Haar wavelet transform (HWT) as a preprocessing step for the singular value decomposition (SVD) in the key stage of the LSI process; and the effect of different threshold types in the HWT on the search results. The developed method allows the visualisation and processing of the term document matrix, generated in the LSI process, using HWT. The results have shown that precision can be increased by applying the HWT as a preprocessing step, with better results for hard thresholding than soft thresholding, whereas standard SVD-based LSI remains the most effective way of searching in terms of recall value.
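The pipeline summarised in the abstract (Haar transform of the term-document matrix, thresholding of the coefficients, then SVD-based LSI) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single-level orthonormal 2D Haar transform, assumes both matrix dimensions are even, thresholds only the detail coefficients, and picks an arbitrary threshold and rank. All function names (`haar_2d`, `threshold`, `hwt_preprocess`, `lsi_scores`) and the toy matrix are hypothetical.

```python
import numpy as np


def haar_1d(x):
    """Single-level orthonormal Haar transform along the last axis (even length)."""
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return np.concatenate([approx, detail], axis=-1)


def inverse_haar_1d(c):
    """Inverse of haar_1d along the last axis."""
    half = c.shape[-1] // 2
    approx, detail = c[..., :half], c[..., half:]
    x = np.empty(c.shape, dtype=float)
    x[..., 0::2] = (approx + detail) / np.sqrt(2.0)
    x[..., 1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def haar_2d(a):
    """Apply the 1D transform to the rows, then to the columns."""
    return haar_1d(haar_1d(a).T).T


def inverse_haar_2d(c):
    """Undo haar_2d: columns first, then rows."""
    return inverse_haar_1d(inverse_haar_1d(c.T).T)


def threshold(c, t, kind="hard"):
    """Hard: zero coefficients with |c| <= t. Soft: also shrink survivors by t."""
    if kind == "hard":
        return np.where(np.abs(c) > t, c, 0.0)
    if kind == "soft":
        return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)
    raise ValueError("kind must be 'hard' or 'soft'")


def hwt_preprocess(a, t, kind="hard"):
    """Threshold the Haar detail coefficients of a term-document matrix.

    Assumption: the approximation quadrant is left untouched.
    """
    c = haar_2d(np.asarray(a, dtype=float))
    rows, cols = c.shape[0] // 2, c.shape[1] // 2
    approx = c[:rows, :cols].copy()
    c = threshold(c, t, kind)
    c[:rows, :cols] = approx
    return inverse_haar_2d(c)


def lsi_scores(a, query, k):
    """Cosine similarity between a query and each document in a rank-k LSI space."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    u_k, s_k, vt_k = u[:, :k], s[:k], vt[:k, :]
    docs = (s_k[:, None] * vt_k).T  # one row per document, k latent dimensions
    q = query @ u_k                 # query folded into the same latent space
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q) + 1e-12
    return docs @ q / norms


# Toy term-document matrix: 4 terms (rows) by 4 documents (columns).
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0, 1.0],
              [2.0, 0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0, 2.0]])
query = np.array([1.0, 0.0, 1.0, 0.0])  # query containing terms 0 and 2

for kind in ("hard", "soft"):
    scores = lsi_scores(hwt_preprocess(A, t=0.5, kind=kind), query, k=2)
    print(kind, np.round(scores, 3))
```

Comparing the ranked scores against `lsi_scores(A, query, k=2)` on the unprocessed matrix gives a small-scale analogue of the precision and recall comparison the abstract reports, although a toy matrix like this one says nothing about the published results.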
Original language: English
Pages (from-to): 1236-1245
Number of pages: 10
Journal: IET Image Processing
Volume: 6
Issue number: 9
DOI: 10.1049/iet-ipr.2011.0498
Publication status: Published - Dec 2012
Externally published: Yes

Keywords

  • Wavelet transforms
  • Haar transforms
  • Information retrieval
  • Singular value decomposition

Cite this

Jaber, T. ; Amira, A. ; Milligan, P. / Enhanced approach for latent semantic indexing using wavelet transform. In: IET Image Processing. 2012 ; Vol. 6, No. 9. pp. 1236-1245.
@article{d8bfed8d21354262ab5cb767db6dc6cc,
title = "Enhanced approach for latent semantic indexing using wavelet transform",
abstract = "Latent semantic indexing (LSI) is a technique used for intelligent information retrieval (IR). It can be used as an alternative to traditional keyword matching IR and is attractive in this respect because of its ability to overcome problems with synonymy and polysemy. This study investigates various aspects of LSI: the effect of the Haar wavelet transform (HWT) as a preprocessing step for the singular value decomposition (SVD) in the key stage of the LSI process; and the effect of different threshold types in the HWT on the search results. The developed method allows the visualisation and processing of the term document matrix, generated in the LSI process, using HWT. The results have shown that precision can be increased by applying the HWT as a preprocessing step, with better results for hard thresholding than soft thresholding, whereas standard SVD-based LSI remains the most effective way of searching in terms of recall value.",
keywords = "Wavelet transforms, Haar transforms, Information retrieval, Singular value decomposition",
author = "T. Jaber and A. Amira and P. Milligan",
year = "2012",
month = "12",
doi = "10.1049/iet-ipr.2011.0498",
language = "English",
volume = "6",
pages = "1236--1245",
journal = "IET Image Processing",
issn = "1751-9659",
publisher = "Institution of Engineering and Technology",
number = "9"
}

TY - JOUR

T1 - Enhanced approach for latent semantic indexing using wavelet transform

AU - Jaber, T.

AU - Amira, A.

AU - Milligan, P.

PY - 2012/12

Y1 - 2012/12

AB - Latent semantic indexing (LSI) is a technique used for intelligent information retrieval (IR). It can be used as an alternative to traditional keyword matching IR and is attractive in this respect because of its ability to overcome problems with synonymy and polysemy. This study investigates various aspects of LSI: the effect of the Haar wavelet transform (HWT) as a preprocessing step for the singular value decomposition (SVD) in the key stage of the LSI process; and the effect of different threshold types in the HWT on the search results. The developed method allows the visualisation and processing of the term document matrix, generated in the LSI process, using HWT. The results have shown that precision can be increased by applying the HWT as a preprocessing step, with better results for hard thresholding than soft thresholding, whereas standard SVD-based LSI remains the most effective way of searching in terms of recall value.

KW - Wavelet transforms

KW - Haar transforms

KW - Information retrieval

KW - Singular value decomposition

U2 - 10.1049/iet-ipr.2011.0498

DO - 10.1049/iet-ipr.2011.0498

M3 - Article

VL - 6

SP - 1236

EP - 1245

JO - IET Image Processing

JF - IET Image Processing

SN - 1751-9659

IS - 9

ER -