Independent component analysis P24


24 Other Applications

Independent Component Analysis. Aapo Hyvärinen, Juha Karhunen, Erkki Oja. Copyright © 2001 John Wiley & Sons, Inc. ISBNs: 0-471-40540-X (Hardback); 0-471-22131-7 (Electronic)

In this chapter, we consider some further applications of independent component analysis (ICA), including analysis of financial time series and audio signal separation.

24.1 FINANCIAL APPLICATIONS

24.1.1 Finding hidden factors in financial data

It is tempting to try ICA on financial data. There are many situations in which parallel financial time series are available, such as currency exchange rates or daily returns of stocks, that may have some common underlying factors. ICA might reveal some driving mechanisms that otherwise remain hidden. In a study of a stock portfolio [22], it was found that ICA is a complementary tool to principal component analysis (PCA), allowing the underlying structure of the data to be more readily observed. If one could find the maximally independent mixtures of the original stocks, i.e., portfolios, this might help in minimizing the risk in the investment strategy.

In [245], we applied ICA to a different problem: the cash flow of several stores belonging to the same retail chain, trying to find the fundamental factors common to all stores that affect the cash flow. Thus, the effect of the factors specific to any particular store, i.e., the effect of the managerial actions taken at the individual store and in its local environment, could be analyzed.

In this case, the mixtures in the ICA model are parallel financial time series $x_i(t)$, with $i$ indexing the individual time series, $i = 1, \ldots, m$, and $t$ denoting discrete time. We assume the instantaneous ICA model

$$x_i(t) = \sum_j a_{ij} s_j(t) \qquad (24.1)$$

for each time series $x_i(t)$. Thus the effect of each time-varying underlying factor or independent component $s_j(t)$ on the measured time series is approximately linear.
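In vector form, the model (24.1) can be written compactly. The sketch below assumes $n$ independent components; the symbol $n$ and the stacked vectors are notation introduced here for illustration, while $\mathbf{A}$ is the mixing matrix that appears later in the chapter.

```latex
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t), \qquad
\mathbf{x}(t) = [x_1(t), \ldots, x_m(t)]^T, \quad
\mathbf{s}(t) = [s_1(t), \ldots, s_n(t)]^T, \quad
\mathbf{A} = (a_{ij}) \in \mathbb{R}^{m \times n}
```

Row $i$ of $\mathbf{A}$ holds the weights with which the common factors enter the $i$th measured series, so comparing rows amounts to comparing how strongly each series responds to each factor.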
The assumption of having some underlying independent components in this specific application may not be unrealistic. For example, factors like seasonal variations due to holidays and annual variations, and factors having a sudden effect on the purchasing power of the customers, like price changes of various commodities, can be expected to have an effect on all the retail stores, and such factors can be assumed to be roughly independent of each other. Yet, depending on the policy and skills of the individual manager, e.g., advertising efforts, the effect of the factors on the cash flow of specific retail outlets is slightly different. By ICA, it is possible to isolate both the underlying factors and the effect weights, thus also making it possible to group the stores on the basis of their managerial policies using only the cash flow time series data.

The data consisted of the weekly cash flow in 40 stores that belong to the same retail chain, covering a time span of 140 weeks. Some examples of the original data $x_i(t)$ are shown in Fig. 24.1. The weeks of a year are shown on the horizontal axis, starting from the first week in January. Thus, for example, the heightened Christmas sales are visible in each time series before and during week 51 in both of the full years shown.

The data were first prewhitened using PCA. The original 40-dimensional signal vectors were projected to the subspace spanned by four principal components, and the variances were normalized to 1. Thus the dimension of the signal space was strongly decreased from 40. A problem in this kind of real-world application is that there is no prior knowledge of the number of independent components. Sometimes the eigenvalue spectrum of the data covariance matrix can be used, as shown in Chapter 6, but in this case the eigenvalues decreased rather smoothly without indicating any clear signal subspace dimension. Then the only way is to try different dimensions.
If the independent components that are found using different dimensions for the whitened data are the same or very similar, we can trust that they are not just artifacts produced by the compression, but truly indicate some underlying factors in the data.

Using the FastICA algorithm, four independent components (ICs) $s_j(t)$, $j = 1, \ldots, 4$, were estimated. As depicted in Fig. 24.2, the FastICA algorithm has found several clearly different fundamental factors hidden in the original data.

The factors have different interpretations. The topmost factor follows the sudden changes that are caused by holidays etc.; the most prominent example is Christmas time. The factor in the bottom row, on the other hand, reflects the slower seasonal variation, with the effect of the summer holidays clearly visible. The factor in the third row could represent a still slower variation, something resembling a trend. The last factor, in the second row, is different from the others; it might be that this factor follows mostly the relative competitive position of the retail chain with respect to its competitors, but other interpretations are also possible.

Fig. 24.1 Five samples of the 40 original cashflow time series (mean removed, normalized to unit standard deviation). Horizontal axis: time in weeks over 140 weeks. (Adapted from [245].)

If five ICs are estimated instead of four, then three of the found components stay virtually the same, while the fourth one separates into two new components. Using the found mixing coefficients $a_{ij}$, it is also possible to analyze the original time series and cluster them in groups. More details on the experiments and their interpretation can be found in [245].
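The processing chain described in this section (PCA prewhitening to a low-dimensional subspace, then FastICA) can be sketched as follows. This is a minimal illustration and not the code used in [245]: the data are synthetic stand-ins for the 40 cash flow series, the four source shapes are hypothetical, and the FastICA variant shown (symmetric orthogonalization with a tanh nonlinearity) is one common choice assumed here. All function names are introduced for this sketch.

```python
import numpy as np

def pca_whiten(X, n_components):
    """Project X (n_series, n_samples) onto its leading principal
    components and scale each projection to unit variance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    V = (eigvecs[:, order] / np.sqrt(eigvals[order])).T  # whitening matrix
    return V @ Xc, V

def sym_decorrelate(W):
    """W <- (W W^T)^(-1/2) W, which makes the rows of W orthonormal."""
    d, E = np.linalg.eigh(W @ W.T)
    return E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ W

def fastica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with g = tanh on whitened data Z (k, n_samples)."""
    k, T = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = np.tanh(W @ Z)
        # Fixed-point update: E{z g(w^T z)} - E{g'(w^T z)} w, for every row
        W_new = Y @ Z.T / T - np.diag((1.0 - Y**2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        delta = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ Z, W

# Synthetic stand-in data: 40 series, 140 weekly samples, 4 hidden factors.
rng = np.random.default_rng(1)
T, m, k = 140, 40, 4
t = np.arange(T)
S = np.vstack([
    np.sin(2 * np.pi * t / 52),     # slow seasonal cycle (hypothetical)
    (t % 52 >= 50).astype(float),   # year-end spikes (hypothetical)
    t / T,                          # trend (hypothetical)
    rng.laplace(size=T),            # irregular factor (hypothetical)
])
A = rng.standard_normal((m, k))
X = A @ S + 0.1 * rng.standard_normal((m, T))

Z, V = pca_whiten(X, k)             # 40 dimensions reduced to 4, unit variance
S_hat, W = fastica(Z)               # estimated independent components
A_hat = np.linalg.pinv(W @ V)       # estimated mixing coefficients a_ij
```

Rerunning with a different value of `k` and comparing the resulting components is one way to carry out the stability check described above. The rows of `A_hat` could then serve as feature vectors for clustering the series.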
Fig. 24.2 Four independent components or fundamental factors found from the cashflow data. (Adapted from [245].)

24.1.2 Time series prediction by ICA

As noted in Chapter 18, the ICA transformation tends to produce component signals, $s_j(t)$, that can be compressed with fewer bits than the original signals, $x_i(t)$. They are thus more structured and regular. This gives motivation to try to predict the signals $x_i(t)$ by first going to the ICA space, doing the prediction there, and then transforming back to the original time series, as suggested by [362]. The prediction can be done separately and with a different method for each component, depending on its time structure. Hence, some interaction from the user may be needed in the overall prediction procedure. Another possibility would be to formulate the ICA contrast function in the first place so that it includes the prediction errors; some work along these lines has been reported by [437].

In [289], we suggested the following basic procedure:

1. After subtracting the mean of each time series and prewhitening (after which each time series has zero mean and unit variance), the independent components $s_j(t)$, and the mixing matrix, $\mathbf{A}$, are estimated using the FastICA algorithm. The number of ICs can be variable.

2. For each component $s_j(t)$, a suitable nonlinear filtering is applied to reduce the effects of noise: smoothing for components that contain very low frequencies (trend, slow cyclical variations), and high-pass filtering for components containing high frequencies and/or sudden shocks. The nonlinear smoothing is done by applying smoothing functions $f_j$ on the source signals $s_j(t)$,

$$s_j^s(t) = f_j[s_j(t+r), \ldots, s_j(t), \ldots, s_j(t-k)]. \qquad (24.2)$$

3. Each smoothed independent component is predicted separately, for instance using some method of autoregressive (AR) modeling [455].
The prediction is done for a number of steps into the future. This is done by applying prediction functions, $g_j$, on the smoothed source signals, $s_j^s(t)$:

$$s_j^p(t+1) = g_j[s_j^s(t), s_j^s(t-1), \ldots, s_j^s(t-q)] \qquad (24.3)$$

The next time steps are predicted by sliding the window of length $q$ over the measured and predicted values of the smoothed signal.

4. The predictions for each independent component are combined by weighting them with the mixing coefficients, $a_{ij}$, thus obtaining the predictions, $x_i^p(t)$, for the original time series, $x_i(t)$:

$$\mathbf{x}^p(t+1) = \mathbf{A}\mathbf{s}^p(t+1) \qquad (24.4)$$

and similarly for $t+2, t+3, \ldots$.

Fig. 24.3 Prediction of real-world financial data: the upper figure represents the actual future outcome of one of the original mixtures and the lower one the forecast obtained using ICA prediction for an interval of 50 values.

To test the method, we applied our algorithm to a set of 10 foreign exchange rate time series. Again, we suppose that there are some independent factors that affect the time evolution of such time series. Economic indicators, interest rates, and psychological factors can be the underlying factors of exchange rates, as they are closely tied to the evolution of the currencies. Even without prediction, some of the ICs may be useful in analyzing the impact of different external phenomena on the foreign exchange rates [22].

The results were promising, as the ICA prediction performed better than direct prediction. Figure 24.3 shows an example of prediction using our method. The upper figure represents one of the original time series (mixtures) and the lower one the forecast obtained using ICA prediction for a future interval of 50 time steps. The algorithm seemed to predict the turning points especially well. Table 24.1 compares the errors obtained by applying classic AR prediction directly to the original time series with those of our method outlined above.
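Steps 2 to 4 of the procedure above can be sketched in a few lines, taking the independent components and the mixing matrix from step 1 as given. This is an illustration under stated assumptions, not the implementation of [289]: a running median stands in for the smoothing functions $f_j$, a least-squares AR model stands in for the prediction functions $g_j$, the same settings are used for every component, and all function names and the toy data are hypothetical.

```python
import numpy as np

def median_smooth(s, half_width=2):
    """Nonlinear smoothing by a centred running median (stand-in for f_j)."""
    padded = np.pad(s, half_width, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, 2 * half_width + 1)
    return np.median(windows, axis=1)

def fit_ar(s, order):
    """Least-squares AR coefficients c_1..c_order for
    s(t) ~ c_1 s(t-1) + ... + c_order s(t-order)."""
    rows = np.array([s[t - order:t][::-1] for t in range(order, len(s))])
    coef, *_ = np.linalg.lstsq(rows, s[order:], rcond=None)
    return coef

def predict_ar(s, coef, steps):
    """Multi-step forecast: the window slides over measured values first,
    then over values that were themselves predicted."""
    history = list(s)
    order = len(coef)
    for _ in range(steps):
        window = np.array(history[-order:][::-1])  # most recent value first
        history.append(float(coef @ window))
    return np.array(history[len(s):])

def ica_forecast(S, A, mean, steps, order=4):
    """Smooth each IC, predict it separately, then mix back as in (24.4)."""
    S_pred = np.vstack([
        predict_ar(sm, fit_ar(sm, order), steps)
        for sm in (median_smooth(s) for s in S)
    ])
    return A @ S_pred + mean[:, None]   # add back the means removed in step 1

# Toy example: two hypothetical components mixed into three series.
t = np.arange(120)
S = np.vstack([np.sin(0.3 * t), np.cos(0.05 * t)])
A = np.array([[1.0, 0.5], [0.2, 1.0], [0.7, 0.7]])
X_pred = ica_forecast(S, A, mean=np.zeros(3), steps=5)
```

In practice the smoothing width and the AR order would be chosen per component, which is where the user interaction mentioned in the text comes in.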
The right-most column shows the magnitude of the errors when no smoothing is applied to the currencies.

While ICA and AR prediction are linear techniques, the smoothing was nonlinear. Using nonlinear smoothing, optimized for each independent component time series separately, the prediction of the ICs is more accurately performed and the results also are different from the direct prediction of the original time series. The noise in the time series is strongly reduced, allowing a better prediction of the underlying factors. The model is flexible and allows various smoothing tolerances and different orders in the classic AR prediction method for each independent component.

In reality, especially in real-world time series analysis, the data are distorted by delays, noise, and nonlinearities. Some of these could be handled by extensions of the basic ICA algorithms, as reported in Part III of this book.

Table 24.1 The prediction errors (in units of 0.001) obtained with our method and the classic AR method. Ten currency time series were considered and five independent components were used. The amount of smoothing in classic AR prediction was varied.

Smoothing in AR prediction    2      0.5    0.1    0.08   0.06   0.05   0
Errors, ICA prediction        2.3    2.3    2.3    2.3    2.3    2.3    2.3
Errors, AR prediction         9.7    9.1    4.7    3.9    3.4    3.1    4.2

24.2 AUDIO SEPARATION

One of the original motivations for ICA research was the cocktail-party problem, as reviewed in the beginning of Chapter 7. The idea is that there are $n$ sound sources recorded by a number of microphones, and we want to separate just one of the sources. In fact, often there is just one interesting signal, for example, a person speaking to the microphone, and all the other sources can be considered as noise; in this case, we have a problem of noise canceling. A typical example of a situation where we want to separate noise (or interference) from a speech signal is a person talking to a mobile phone in a noisy car.
If there is just one microphone, one can attempt to cancel the noise by ordinary noise canceling methods: linear filtering, or perhaps more sophisticated techniques like wavelet and sparse code shrinkage (Section 15.6). Such noise canceling can be rather unsatisfactory, however. It works only if the noise has spectral characteristics that are clearly different from those of the speech signal. One might wish to remove the noise more effectively by collecting more data using several microphones. Since in real-life situations the positions of the microphones with respect to the sources can be rather arbitrary, the mixing process is not known, and it has to be estimated blindly. In this case, we find the ICA model, and the problem is one of blind source separation.

Blind separation of audio signals is, however, much more difficult than one might expect. This is because the basic ICA model is a very crude approximation of the real mixing process. In fact, here we encounter almost all the complications that we have discussed in Part III:

• The mixing is not instantaneous. Audio signals propagate rather slowly, and thus they arrive at the microphones at different times. Moreover, there are echoes, especially if the recording is made in a room. Thus the problem is more adequately modeled by a convolutive version of the ICA model (Chapter 19). The situation is thus much more complicated than with the separation of magnetoencephalographic (MEG) signals, which propagate fast, or with feature extraction, where no time delays are possible even in theory. In fact, even the basic convolutive ICA model may not be enough because the time delays may be fractional and may not be adequately modeled as integer multiples of the time interval between two samples.

• Typically, the recordings are made with two microphones only.
However, the number of source signals is probably much larger than 2 in most cases, since the noise sources may not form just one well-defined source. Thus we have the problem of overcomplete bases (Chapter 16).

• The nonstationarity of the mixing is another important problem. The mixing matrix may change rather quickly, due to changes in the constellation of the speaker and the microphones. For example, one of these may be moving with respect to the other, or the speaker may simply turn his head. This implies that the mixing matrix must be reestimated quickly in a limited time frame, which also means a limited amount of data. Adaptive estimation methods may alleviate this problem somewhat, but this is still a serious problem due to the convolutive nature of the mixing. In the convolutive mixing, the number of parameters can be very large: for example, the convolution may be modeled by filters of the length of 1000 time points, which effectively multiplies the number of parameters in the model by 1000. Since the number of data points should grow with the number of parameters to obtain satisfactory estimates, it may be next to impossible to estimate the model with the small number of data points that one has time to collect before the mixing matrix has changed too much.

• Noise may be considerable. There may be strong sensor noise, which means that we should use the noisy ICA model (Chapter 15). The noise complicates the estimation of the ICA model quite considerably, even in the basic case where noise is assumed gaussian. On the other hand, the effect of overcomplete bases could be modeled as noise as well. This noise may not be very gaussian, however, making the problem even more difficult.

Due to these complications, it may be that the prior information, namely independence and nongaussianity of the source signals, is not enough.
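The difference between instantaneous and convolutive mixing, and the resulting growth in the number of parameters, can be made concrete with a short sketch. It assumes causal FIR mixing filters stored in an array `H[i, j, k]` (microphone `i`, source `j`, delay `k`); the array layout, the function name, and the synthetic signals and filters are all hypothetical choices made for this illustration.

```python
import numpy as np

def convolutive_mix(S, H):
    """Convolutive mixing: x_i(t) = sum_j sum_k H[i, j, k] * s_j(t - k).

    S has shape (n_sources, n_samples); H has shape
    (n_mics, n_sources, filter_length).
    """
    n_mics, n_src, _ = H.shape
    T = S.shape[1]
    X = np.zeros((n_mics, T))
    for i in range(n_mics):
        for j in range(n_src):
            # Full convolution, truncated to the recording length
            X[i] += np.convolve(S[j], H[i, j])[:T]
    return X

rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 8000))                  # two nongaussian sources
decay = np.exp(-np.arange(1000) / 200.0)         # crude decaying echo profile
H = rng.standard_normal((2, 2, 1000)) * decay    # filters of length 1000
X = convolutive_mix(S, H)

n_instantaneous = H.shape[0] * H.shape[1]        # entries of a 2 x 2 matrix
n_convolutive = H.size                           # 1000 times as many
```

With a filter length of 1 the function reduces to the instantaneous model, `H[:, :, 0] @ S`, which is the basic ICA mixing. With length 1000 the same two-microphone, two-source setup has 4000 unknown coefficients instead of 4, which is why short observation windows become a problem.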
To estimate the convolutive ICA model with a large number of parameters, and a rapidly changing mixing matrix, may require more information on the signals and the matrix. First, one may need to combine the assumption of nongaussianity with the different time-structure assumptions in Chapter 18. Speech signals have autocorrelations and nonstationarities, so this information could be used [267, 216]. Second, one may need to use some information on the mixing. For example, sparse priors (Section 20.1.3) could be used.

It is also possible that real-life speech separation requires sophisticated modeling of speech signals. Speech signals are highly structured, autocorrelations and nonstationarity being just the very simplest aspects of their time structure. Such approaches were proposed in [54, 15].

Because of these complications, audio separation is a largely unsolved problem. For a recent review on the subject, see [429]. One of the main theoretical problems, estimation of the convolutive ICA model, was described in Chapter 19.

24.3 FURTHER APPLICATIONS

Among further applications, let us mention

• Text document analysis [219, 229, 251]
• Radiocommunications [110, 77]
• Rotating machine monitoring [475]
• Seismic monitoring [161]
• Reflection canceling [127]
• Nuclear magnetic resonance spectroscopy [321]
• Selective transmission, which is a dual problem of blind source separation. A set of independent source signals are adaptively premixed prior to a nondispersive physical mixing process so that each source can be independently monitored in the far field [117].

Further applications can be found in the proceedings of the ICA'99 and ICA2000 workshops [70, 348].

Date posted: 07/11/2013, 09:15
