Abstract
We consider how to extract information from two data streams simultaneously when each stream contains information about the other, i.e., the streams are redundant and we wish to identify what they have in common. The standard statistical method for this is canonical correlation analysis, and we consider two extensions of it. In the first, we use Bregman divergences to create methods of extracting information from the dual data streams which are optimal when the data has a distribution other than the Gaussian. In the second, we use reservoir computing to extract non-linear relationships. Finally, we combine the two methods and illustrate them on a database of student marks.
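For orientation, the baseline method named in the abstract, classical linear canonical correlation analysis, can be sketched as follows. This is a minimal illustration and not the paper's method: it covers neither the Bregman-divergence nor the reservoir-computing extension. The function name `cca_first_pair`, the small ridge term `reg`, and the synthetic two-stream data are all assumptions made for this example.

```python
import numpy as np


def cca_first_pair(X, Y, reg=1e-8):
    """Return the first canonical correlation and the weight vectors (a, b).

    X and Y are (n_samples, n_features) arrays holding the two data streams.
    `reg` is a small ridge term (an assumption of this sketch) that keeps the
    covariance matrices positive definite.
    """
    n = X.shape[0]
    # Centre each stream.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Sample covariance and cross-covariance matrices.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # K = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    U, s, Vt = np.linalg.svd(K)
    # Map the leading singular vectors back to the original coordinates.
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return s[0], a, b


# Synthetic example (hypothetical data): both streams share a latent signal z.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.column_stack([z + 0.1 * rng.normal(size=500), rng.normal(size=500)])
Y = np.column_stack([rng.normal(size=500), -z + 0.1 * rng.normal(size=500)])

rho, a, b = cca_first_pair(X, Y)
# Correlation of the two projected streams; it should match rho closely.
projected_corr = np.corrcoef(X @ a, Y @ b)[0, 1]
```

Whitening followed by a singular value decomposition is one common way to compute CCA. It corresponds to the Gaussian case that the abstract takes as its starting point.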
Original language | English |
---|---|
Pages (from-to) | 188-202 |
Journal | International Journal of Data Mining, Modelling and Management |
Volume | 4 |
Issue number | 2 |
Publication status | Published - 2012 |
Keywords
- canonical correlation analysis
- CCA
- Bregman divergence
- reservoir computing
- dual streams
- data exploration
- data streams
- information extraction
- data redundancy
- commonality
- Gaussian distribution
- Carl Friedrich Gauss
- non-linear relationships
- databases
- student marks
- data mining
- data modelling
- data management
- intelligent data analysis