Advances in computing have encouraged the adoption of computer systems in numerous applications. In the health domain, computer systems enable better and more reliable services and reduce human error. Data in computer systems is generally stored in coded format. In health databases, however, certain data cannot be coded, such as doctors' comments, and is therefore stored as free text. The available literature has demonstrated that this free text contains invaluable information, but extracting it is challenging because of the complexity of the stored data. Latent Semantic Indexing (LSI) is an Information Retrieval (IR) technique that has proved effective in extracting information from health databases, because it can identify the semantics of terms within and across the documents in a database. Its major limitation is inefficiency when extracting information from large-scale document collections. In this chapter, two enhancements of the LSI method are proposed and evaluated in order to overcome this limitation. The proposed Distributed LSI and Parallel LSI methods were applied to an artificial electronic health records database (EMRbots) and evaluated in terms of time complexity, recall, and precision.
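For orientation, the textbook formulation of LSI and of the two retrieval metrics named above can be sketched as follows. This is the standard rank-k formulation, not necessarily the exact variant used in the chapter. The symbols (A for the term-document matrix, k for the number of retained dimensions, q for a query vector, Rel and Ret for the sets of relevant and retrieved documents) are assumptions introduced here for illustration.

$$
A \approx A_k = U_k \Sigma_k V_k^{\top}, \qquad A \in \mathbb{R}^{m \times n},\; U_k \in \mathbb{R}^{m \times k},\; \Sigma_k \in \mathbb{R}^{k \times k},\; V_k \in \mathbb{R}^{n \times k}
$$

$$
\hat{q} = \Sigma_k^{-1} U_k^{\top} q, \qquad \operatorname{sim}(\hat{q}, \hat{d}_j) = \frac{\hat{q}^{\top} \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}
$$

$$
\text{precision} = \frac{\lvert \mathrm{Rel} \cap \mathrm{Ret} \rvert}{\lvert \mathrm{Ret} \rvert}, \qquad \text{recall} = \frac{\lvert \mathrm{Rel} \cap \mathrm{Ret} \rvert}{\lvert \mathrm{Rel} \rvert}
$$

Here the reduced representation of document j, written \hat{d}_j, is the j-th row of V_k. Computing the truncated singular value decomposition of a large A is the costly step, which is consistent with the scalability limitation the abstract describes.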
|Title of host publication||Engineering and Technology for Healthcare|
|Editors||Muhammad A. Imran, Rami Ghannam, Qammer H. Abbasi|
|Number of pages||11|
|Publication status||Published - 27 Nov 2020|