Independent component analysis P23


Independent Component Analysis. Aapo Hyvärinen, Juha Karhunen, Erkki Oja. Copyright © 2001 John Wiley & Sons, Inc. ISBNs: 0-471-40540-X (Hardback); 0-471-22131-7 (Electronic)

23 Telecommunications

This chapter deals with applications of independent component analysis (ICA) and blind source separation (BSS) methods to telecommunications. In the following, we concentrate on code division multiple access (CDMA) techniques, because this specific branch of telecommunications provides several possibilities for applying ICA and BSS in a meaningful way. After an introduction to multiuser detection and CDMA communications, we present mathematically the CDMA signal model and show that it can be cast in the form of a noisy matrix ICA model. Then we discuss in more detail three particular applications of ICA or BSS techniques to CDMA data. These are a simplified complexity minimization approach for estimating fading channels, blind separation of convolutive mixtures using an extension of the natural gradient algorithm, and improvement of the performance of conventional CDMA receivers using complex-valued ICA. The ultimate goal in these applications is to detect the desired user's symbols, but to achieve this, intermediate quantities such as the fading channel or the delays must usually be estimated first. At the end of the chapter, we give references to other communications applications of ICA and related blind techniques used in communications.

23.1 MULTIUSER DETECTION AND CDMA COMMUNICATIONS

In wireless communication systems, such as mobile phones, an essential issue is the division of the common transmission medium among several users. This calls for a multiple access communication scheme. A primary goal in designing multiple access systems is to enable each user of the system to communicate despite the fact that the other users occupy the same resources, possibly simultaneously. As the number of users in the system grows, it becomes necessary to use the common resources as efficiently as possible. These two requirements have given rise to a number of multiple access schemes.

[Fig. 23.1: A schematic diagram of the multiple access schemes FDMA, TDMA, and CDMA [410, 382]. Panels: FDMA, TDMA, CDMA; axes: time, frequency, code.]

Figure 23.1 illustrates the most common multiple access schemes [378, 410, 444]. In frequency division multiple access (FDMA), each user is given a nonoverlapping frequency slot in which one and only one user is allowed to operate. This prevents interference from other users. In time division multiple access (TDMA), a similar idea is realized in the time domain, where each user is given a unique time period (or periods). One user can thus transmit and receive data only during his or her predetermined time interval, while the others are silent.

In CDMA [287, 378, 410, 444], there is no disjoint division in frequency and time; instead, all users occupy the same frequency band simultaneously. The users are identified by their codes, which are unique to each user. Roughly speaking, each user applies his unique code to his information signal (data symbols) before transmitting it through a common medium. In transmission, different users' signals become mixed, because the same frequencies are used at the same time. Each user's transmitted signal can be identified from the mixture by applying his unique code at the receiver.

In its simplest form, the code is a pseudorandom sequence of ±1s, also called a chip sequence or spreading code. In this case we speak about direct sequence (DS) modulation [378], and call the multiple access method DS-CDMA. In DS-CDMA, each user's narrow-band data symbols (information bits) are spread in frequency before actual transmission via a common medium.
The spreading is carried out by multiplying each user's data symbols (information bits) by his unique wide-band chip sequence (spreading code). The chip sequence varies much faster than the information bit sequence. In the frequency domain, this leads to spreading of the power spectrum of the transmitted signal. Such spread spectrum techniques are useful because they make the transmission more robust against disturbances caused by other signals transmitted simultaneously [444].

[Fig. 23.2: Construction of a CDMA signal [382]. Top: Binary user's symbols to be transmitted. Middle: User's specific spreading code (chip sequence). Bottom: Modulated CDMA signal, obtained by multiplying user's symbols by the spreading code. Horizontal axis: time, with the symbol period $T$ and the chip period $T_c$ marked.]

Example 23.1 Figure 23.2 shows an example of the formation of a CDMA signal. In the topmost subfigure, there are 4 user's symbols (information bits) $-1, +1, -1, +1$ to be transmitted. The middle subfigure shows the chip sequence (spreading code), which is now $-1, +1, -1, -1, +1$. Each symbol is multiplied by the chip sequence in a similar manner. This yields the modulated CDMA signal on the bottom row of Fig. 23.2, which is then transmitted. The bits in the spreading code change in this case 5 times faster than the symbols.

Let us denote the $m$th data symbol (information bit) by $b_m$, and the chip sequence by $s(t)$. The time period of the chip sequence is $T$ (see Fig. 23.2), so that $s(t) \in \{-1, +1\}$ when $t \in [0, T)$, and $s(t) = 0$ when $t \notin [0, T)$. The length of the chip sequence is $C$ chips, and the time duration of each chip is $T_c = T/C$. The number of bits in the observation interval is denoted by $N$. In Fig. 23.2, the observation interval contains $N = 4$ symbols, and the length of the chip sequence is $C = 5$.
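The construction in Example 23.1 can be reproduced in a few lines. The following is a minimal sketch rather than anything taken from the book: the function name `spread` is hypothetical, and the symbol and chip values assume the signs $-1, +1, -1, +1$ and $-1, +1, -1, -1, +1$ for the sequences of the example (the minus signs are not legible in every copy of the text).

```python
def spread(symbols, chips):
    """Direct-sequence spreading: multiply every symbol by the whole chip sequence."""
    return [b * c for b in symbols for c in chips]


symbols = [-1, +1, -1, +1]      # N = 4 information bits (assumed signs)
chips = [-1, +1, -1, -1, +1]    # C = 5 chips of the spreading code (assumed signs)

signal = spread(symbols, chips)

print(len(signal))   # N * C chip-rate samples
print(signal[:5])    # first symbol times the code
print(signal[5:10])  # second symbol times the code
```

Each block of $C = 5$ consecutive samples is the chip sequence with the sign of one information bit, which is why the modulated signal changes 5 times faster than the symbols.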
Using these notations, the CDMA signal $r(t)$ at time $t$ arising in this simple example can be written

$$r(t) = \sum_{m=1}^{N} b_m\, s(t - mT) \tag{23.1}$$

In the reception of the DS-CDMA signal, the final objective is to estimate the transmitted symbols. However, both code timing and channel estimation are often prerequisite tasks. Detection of the desired user's symbols is more complicated in CDMA systems than in the simpler TDMA and FDMA systems used previously in mobile communications. This is because the spreading code sequences of different users are typically nonorthogonal, and because several users are transmitting their symbols at the same time using the same frequency band.

However, CDMA systems offer several advantages over more traditional techniques [444, 382]. Their capacity is larger, and it degrades gradually with an increasing number of simultaneous users, who can be asynchronous [444]. CDMA technology is therefore a strong candidate for future global wireless communications systems. For example, it has already been chosen as the transmission technique for the European third generation mobile system UMTS [334, 182], which will provide useful new services, especially multimedia and high-bit-rate packet data.

In mobile communications systems, the signal processing required in the base station (uplink) differs from that required in the mobile phone (downlink). In the base station, all the signals sent by different users must be detected, but there is also much more signal processing capacity available. The codes of all the users are known, but their time delays are unknown. For delay estimation, one can use for example the simple matched filter [378, 444], subspace approaches [44, 413], or the optimal but computationally highly demanding maximum likelihood method [378, 444]. When the delays have been estimated, one can estimate the other parameters such as the fading process and symbols [444].
In downlink (mobile phone) signal processing, each user knows only its own code, while the codes of the other users are unknown. There is less processing power than in the base station. The mathematical model of the signals also differs slightly, since users share the same channel in downlink communications. The first two features of downlink processing in particular call for new, efficient, and simple solutions. ICA and BSS techniques provide a promising new approach to downlink signal processing in DS-CDMA systems that use short spreading codes.

Figure 23.3 shows a typical CDMA transmission situation in an urban environment. Signal 1 arrives directly from the base station to the mobile phone in the car. It has the smallest time delay and is the strongest signal, because it is not attenuated by the reflection coefficients of the obstacles in the path. Due to multipath propagation, the user in the car in Fig. 23.3 also receives the weaker signals 2 and 3, which have longer time delays. The existence of multipath propagation allows the signal to interfere with itself. This phenomenon is known as intersymbol interference (ISI). Using spreading codes and suitable processing methods, multipath interference can be mitigated [444].

[Fig. 23.3: An example of multipath propagation in urban environment. Axes of the inset plot: time delay, magnitude.]

There are several other problems that complicate CDMA reception. One of the most serious ones is multiple access interference (MAI), which arises from the fact that the same frequency band is occupied simultaneously. MAI can be alleviated by increasing the length of the spreading code, but at a fixed chip rate, this decreases the data rate. In addition, the near–far problem arises when signals from near and far are received at the same time.
If the received powers from different users become too different, a stronger user will seriously interfere with the weaker ones, even if there is only a small correlation between the users' spreading codes. In FDMA and TDMA systems, the near–far problem does not arise, because different users have nonoverlapping frequency or time slots.

The near–far problem in the base station can be mitigated by power control, or by multiuser detection. Efficient multiuser detection requires knowledge or estimation of many system parameters such as propagation delay, carrier frequency, and received power level. This is usually not possible in the downlink. In that case, however, blind multiuser detection techniques can be applied, provided that the spreading codes are short enough [382].

Still other problems appearing in CDMA systems are power control, synchronization, and fading of channels, which is present in all mobile communications systems. Fading means variation of the signal power in mobile transmission, caused for example by buildings and changing terrain. See [378, 444, 382] for more information on these topics.

23.2 CDMA SIGNAL MODEL AND ICA

In this section, we represent mathematically the CDMA signal model which is studied in slightly varying forms in this chapter. Models of this type and the formation of the observed data in them are discussed in detail in [444, 287, 382].

It is straightforward to generalize the simple model (23.1) for $K$ users. The $m$th symbol of the $k$th user is denoted by $b_{km}$, and $s_k(\cdot)$ is the $k$th user's binary chip sequence (spreading code). For each user $k$, the spreading code is defined in the same way as in Example 23.1. The combined signal of $K$ simultaneous users then becomes

$$r(t) = \sum_{m=1}^{N} \sum_{k=1}^{K} b_{km}\, s_k(t - mT) + n(t) \tag{23.2}$$

where $n(t)$ denotes additive noise corrupting the observed signal.
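To make the $K$-user model (23.2) concrete, the sketch below mixes two synchronous users at chip rate and recovers each user's bits with a plain correlator (matched filter), which the chapter names as the simplest detector. Everything specific here is an assumption for illustration: the helper names `cdma_mix` and `matched_filter`, the two orthogonal length-4 codes, the bit patterns, and the small deterministic disturbance standing in for $n(t)$.

```python
def cdma_mix(bits_per_user, codes, noise):
    """Chip-rate samples of Eq. (23.2) for synchronous users (no delays, no fading)."""
    n_symbols = len(bits_per_user[0])
    n_chips = len(codes[0])
    r = []
    for m in range(n_symbols):
        for c in range(n_chips):
            r.append(sum(bits[m] * code[c] for bits, code in zip(bits_per_user, codes)))
    return [x + e for x, e in zip(r, noise)]


def matched_filter(r, code):
    """Correlate each symbol-length block with the code and take the sign."""
    n_chips = len(code)
    decisions = []
    for m in range(len(r) // n_chips):
        corr = sum(r[m * n_chips + c] * code[c] for c in range(n_chips))
        decisions.append(1 if corr >= 0 else -1)
    return decisions


codes = [[1, 1, -1, -1], [1, -1, 1, -1]]       # hypothetical orthogonal codes, C = 4
bits = [[1, -1, -1, 1], [-1, -1, 1, 1]]        # b_km for K = 2 users, N = 4 symbols
noise = [0.3 * (-1) ** n for n in range(16)]   # deterministic stand-in for n(t)

r = cdma_mix(bits, codes, noise)
detected = [matched_filter(r, code) for code in codes]
print(detected)
```

With orthogonal codes and equal powers the correlator separates the users cleanly. The difficulties discussed in this section (nonorthogonal codes, unequal powers, delays) are exactly the cases in which this simple detector breaks down.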
The signal model (23.2) is not yet quite realistic, because it does not take into account the effect of multipath propagation and fading channels. Including these factors in (23.2) yields our desired downlink CDMA signal model for the observed data $r(t)$ at time $t$:

$$r(t) = \sum_{m=1}^{N} \sum_{k=1}^{K} b_{km} \sum_{l=1}^{L} a_{lm}\, s_k(t - mT - d_l) + n(t) \tag{23.3}$$

Here the index $m$ refers to the symbol, $k$ to the user, and $l$ to the path. The term $d_l$ denotes the delay of the $l$th path, which is assumed to be constant during the observation interval of $N$ symbol bits. Each of the $K$ simultaneous users has $L$ independent transmission paths. The term $a_{lm}$ is the fading factor of the $l$th path corresponding to the $m$th symbol.

In general, the fading coefficients $a_{lm}$ are complex-valued. However, we can apply standard real-valued ICA methods to the data (23.3) by using only its real part. This is the case in the first two approaches to be discussed in the next two sections, while the last method in Section 23.5 directly uses complex data.

The continuous time data (23.3) is first sampled using the chip rate, so that $C$ equispaced samples per symbol are taken. From subsequent discretized equispaced data samples $r[n]$, $C$-length data vectors are then collected:

$$\mathbf{r}_m = (r[mC], r[mC+1], \ldots, r[(m+1)C - 1])^T \tag{23.4}$$

Then the model (23.3) can be written in vector form as [44]

$$\mathbf{r}_m = \sum_{k=1}^{K} \sum_{l=1}^{L} \left[ a_{l,m-1} b_{k,m-1}\, \underline{\mathbf{g}}_{kl} + a_{lm} b_{km}\, \bar{\mathbf{g}}_{kl} \right] + \mathbf{n}_m \tag{23.5}$$

where $\mathbf{n}_m$ denotes the noise vector consisting of the $C$ most recent samples of the noise $n(t)$. The vector $\underline{\mathbf{g}}_{kl}$ denotes the "early" part of the code vector, and $\bar{\mathbf{g}}_{kl}$ the "late" part, respectively. These vectors are given by

$$\underline{\mathbf{g}}_{kl} = [\, s_k[C - d_l + 1], \ldots, s_k[C], \mathbf{0}^T_{C - d_l} \,]^T \tag{23.6}$$

$$\bar{\mathbf{g}}_{kl} = [\, \mathbf{0}^T_{d_l}, s_k[1], \ldots, s_k[C - d_l] \,]^T \tag{23.7}$$

Here $d_l$ is the discretized index representing the time delay, $d_l \in \{0, \ldots, (C-1)/2\}$, and $\mathbf{0}^T_{d_l}$ is a row vector having $d_l$ zeros as its elements. Correspondingly, $\mathbf{0}^T_{C - d_l}$ has $C - d_l$ zeros, so that both vectors have length $C$.
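The early and late code vectors of (23.6) and (23.7) are easy to build and to check against a directly delayed chip stream. The sketch below does this for one user and one path, without fading or noise. The helper names `early_late` and `delayed_stream`, the example code, bits, and delay are all assumptions for illustration. It uses 0-based indexing and reads the early vector as the last $d$ chips followed by $C - d$ zeros, so that both vectors have length $C$.

```python
def early_late(code, d):
    """Early and late parts of a length-C code vector for a delay of d chips."""
    n_chips = len(code)
    early = code[n_chips - d:] + [0] * (n_chips - d)   # last d chips, then C - d zeros
    late = [0] * d + code[:n_chips - d]                # d zeros, then first C - d chips
    return early, late


def delayed_stream(bits, code, d):
    """Chip-rate samples of sum_m b_m s(t - mT - d), noise-free and unfaded."""
    n_chips = len(code)
    x = [0] * ((len(bits) + 1) * n_chips)
    for m, b in enumerate(bits):
        for c, chip in enumerate(code):
            x[m * n_chips + d + c] += b * chip
    return x


code = [-1, 1, -1, -1, 1]   # hypothetical spreading code, C = 5
bits = [1, -1, 1]           # hypothetical symbols
d = 2                       # delay in chips

early, late = early_late(code, d)
x = delayed_stream(bits, code, d)
n_chips = len(code)

# Every C-sample window equals b_{m-1} * early + b_m * late, as in Eq. (23.5).
windows_match = True
for m in range(len(bits)):
    previous = bits[m - 1] if m > 0 else 0
    window = x[m * n_chips:(m + 1) * n_chips]
    model = [previous * e + bits[m] * g for e, g in zip(early, late)]
    windows_match = windows_match and (window == model)

print(early, late, windows_match)
```

The check makes the role of the delay visible: each observation window contains the tail of the previous symbol's code (early part) and the head of the current symbol's code (late part).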
The early and late parts of the code vector arise because of the time delay $d_l$, which means that the chip sequence generally does not coincide with the time interval of a single user's symbol, but extends over two subsequent bits $b_{k,m-1}$ and $b_{km}$. This effect of the time delay can be easily observed by shifting the spreading code to the right in Fig. 23.2.

The vector model (23.5) can be expressed in compact form as a matrix model. Define the data matrix

$$\mathbf{R} = [\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N] \tag{23.8}$$

consisting of $N$ subsequent data vectors $\mathbf{r}_i$. Then $\mathbf{R}$ can be represented as

$$\mathbf{R} = \mathbf{G}\mathbf{F} + \mathbf{N} \tag{23.9}$$

where the $C \times 2KL$ matrix $\mathbf{G}$ contains all the $KL$ early and late code vectors

$$\mathbf{G} = [\underline{\mathbf{g}}_{11}, \bar{\mathbf{g}}_{11}, \ldots, \underline{\mathbf{g}}_{KL}, \bar{\mathbf{g}}_{KL}] \tag{23.10}$$

and the $2KL \times N$ matrix $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_N]$ contains the symbols and fading terms

$$\mathbf{f}_m = [\, a_{1,m-1} b_{1,m-1},\; a_{1m} b_{1m},\; \ldots,\; a_{L,m-1} b_{K,m-1},\; a_{Lm} b_{Km} \,]^T \tag{23.11}$$

The vector $\mathbf{f}_m$ represents the $2KL$ symbols and fading terms of all the users and paths corresponding to the $m$th pair of early and late code vectors.

From the physical situation, it follows that each path and user are at least approximately independent of each other [382]. Hence every product $a_{i,m-1} b_{i,m-1}$ or $a_{im} b_{im}$ of a symbol and the respective fading term can be regarded as an independent source signal. Because each user's subsequent transmitted symbols are assumed to be independent, these products are also independent for a given user $i$. Denote the independent sources $a_{1,m-1} b_{1,m-1}, \ldots, a_{Lm} b_{Km}$ by $y_i(m)$, $i = 1, \ldots, 2KL$. Each user corresponds to $2L$ of these sources, where the factor 2 follows from the presence of the early and late parts.

To see the correspondence of (23.9) to ICA, let us write the noisy linear ICA model $\mathbf{x} = \mathbf{A}\mathbf{s} + \mathbf{n}$ in the matrix form as

$$\mathbf{X} = \mathbf{A}\mathbf{S} + \mathbf{N} \tag{23.12}$$

The data matrix $\mathbf{X}$ has as its columns the data vectors $\mathbf{x}(1), \mathbf{x}(2), \ldots$, and $\mathbf{S}$ and $\mathbf{N}$ are similarly compiled source and noise matrices whose columns consist of the source and noise vectors $\mathbf{s}(t)$ and $\mathbf{n}(t)$, respectively.
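A small numerical instance may help fix the shapes in the factorization $\mathbf{R} = \mathbf{G}\mathbf{F}$ of (23.9). The sketch below is noise-free and uses $K = 2$ users sharing a single path ($L = 1$). The codes, bits, real-valued path gains, delay, and helper names (`early_late`, `matmul`, `source_column`) are all hypothetical, and the symbol preceding the observation interval is taken to be zero.

```python
def early_late(code, d):
    """Early and late parts of a length-C code vector for a delay of d chips."""
    n_chips = len(code)
    return code[n_chips - d:] + [0] * (n_chips - d), [0] * d + code[:n_chips - d]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


codes = [[1, 1, -1, -1, 1], [1, -1, 1, -1, -1]]   # K = 2 hypothetical codes, C = 5
bits = [[1, -1, 1], [-1, -1, 1]]                  # b_km, N = 3 symbols per user
fade = [1.0, 0.8, 0.6]                            # a_{1m}: gain of the single path
d = 1                                             # path delay in chips

# G (C x 2KL): columns are early_1, late_1, early_2, late_2, as in Eq. (23.10).
columns = []
for code in codes:
    columns.extend(early_late(code, d))
G = [list(row) for row in zip(*columns)]


def source_column(m):
    """f_m of Eq. (23.11): previous and current symbol-times-fading per user."""
    col = []
    for user_bits in bits:
        previous = fade[m - 1] * user_bits[m - 1] if m > 0 else 0.0
        col += [previous, fade[m] * user_bits[m]]
    return col


n_symbols = len(bits[0])
F = [list(row) for row in zip(*[source_column(m) for m in range(n_symbols)])]  # 2KL x N
R = matmul(G, F)                                                               # C x N

print(len(G), len(G[0]), len(F), len(F[0]), len(R), len(R[0]))
```

Here $\mathbf{G}$ plays the role of the mixing matrix and the rows of $\mathbf{F}$ are the source signals, so only $\mathbf{R}$ would be available to a blind receiver.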
Comparing the matrix CDMA signal model (23.9) with (23.12) shows that it has the same form as the noisy linear ICA model. Clearly, in the CDMA model (23.9) $\mathbf{F}$ is the matrix of source signals, $\mathbf{R}$ is the observed data matrix, and $\mathbf{G}$ is the unknown mixing matrix.

For estimating the desired user's parameters and symbols, several techniques are available [287, 444]. The matched filter (correlator) [378, 444] is the simplest estimator, but it performs well only if different users' chip sequences are orthogonal or the users have equal powers. The matched filter suffers greatly from the near–far problem, rendering it unsuitable for CDMA reception without strict power control. The so-called RAKE detector [378] is a somewhat improved version of the basic matched filter which takes advantage of multiple propagation paths. The maximum likelihood (ML) method [378, 444] would be optimal, but it has a very high computational load, and requires knowledge of all the users' codes. However, in downlink reception, only the desired user's code is known. To remedy this problem while preserving acceptable performance, subspace approaches have been proposed, for example in [44]. However, they are sensitive to noise, and fail when the signal subspace dimension exceeds the processing gain. This easily occurs even with moderate system load due to the multipath propagation. Some other semiblind methods proposed for the CDMA problem, such as the minimum mean-square estimator (MMSE), are discussed later in this chapter and in [287, 382, 444].

It should be noted that the CDMA estimation problem is not completely blind, because there is some prior information available. In particular, the transmitted symbols are binary (more generally from a finite alphabet), and the spreading code (chip sequence) is known.
On the other hand, multipath propagation, possibly fading channels, and time delays make separation of the desired user's symbols a very challenging estimation problem which is more complicated than the standard ICA problem.

23.3 ESTIMATING FADING CHANNELS

23.3.1 Minimization of complexity

Pajunen [342] has recently introduced a complexity minimization approach as a true generalization of standard ICA. In his method, temporal information contained in the source signals is also taken into account in addition to the spatial independence utilized by standard ICA. The goal is to optimally exploit all the available information in blind source separation. In the special case where the sources are temporally white (uncorrelated), complexity minimization reduces to standard ICA [342]. Complexity minimization has been discussed in more detail in Section 18.3.

Regrettably, the original method for minimizing the Kolmogoroff complexity measure is computationally highly demanding except for small scale problems. But if the source signals are assumed to be gaussian and nonwhite with significant time correlations, the minimization task becomes much simpler [344]. Complexity minimization then reduces to principal component analysis of temporal correlation matrices. This method is actually just another example of blind source separation approaches based on second-order temporal statistics (see, for example, [424, 43]), which were discussed earlier in Chapter 18.

In the following, we apply this simplified method to the estimation of the fading channel coefficients of the desired user in a CDMA system. Simulations with downlink data, propagated through a Rayleigh fading channel, show noticeable performance gains compared with blind minimum mean-square error channel estimation, which is currently a standard method for solving this problem. The material in this section is based on the original paper [98].
We thus assume that the fading process is gaussian and complex-valued. Then the amplitude of the fading process is Rayleigh distributed; this case is called Rayleigh fading (see [444, 378]). We also assume that a training sequence or a preamble is available for the desired user, although this may not always be the case in practice. Under these conditions, only the desired user's contribution in the sampled data is time correlated, which is then utilized. The proposed method has the advantage that it estimates code timing only implicitly, and hence it does not degrade the accuracy of channel estimation.

A standard method for separating the unknown source signals is based on minimization of the mutual information (see Chapter 10 and [197, 344]) of the separated signals $\mathbf{f}_m = [y_1(m), \ldots, y_{2KL}(m)]^T = \mathbf{y}$:

$$J(\mathbf{y}) = \sum_i H(y_i) + \log |\det \mathbf{G}| \tag{23.13}$$

where $H(y_i)$ is the entropy of $y_i$ (see Chapter 5). But entropy has the interpretation that it represents the optimum averaged code length of a random variable. Hence mutual information can be expressed by using algorithmic complexity as [344]

$$J(\mathbf{y}) = \sum_i K(y_i) + \log |\det \mathbf{G}| \tag{23.14}$$

where $K(\cdot)$ is the per-symbol Kolmogoroff complexity, given by the number of bits needed to describe $y_i$. By using prior information about the signals, the coding costs can be explicitly approximated. For instance, if the signals are gaussian, independence becomes equivalent to uncorrelatedness. Then the Kolmogoroff complexity can be replaced by the per-symbol differential entropy, which in this case depends on second-order statistics only.

For Rayleigh type fading transmission channels, the prior information can be formulated by considering that the mutually independent source signals $y_i(m)$ have zero-mean gaussian distributions. Suppose we want to estimate the channel coefficients of the transmission paths by sending a constant symbol sequence $b_{1m} = 1$ of given length to the desired user.
We consider the signals $y_i(m)$, $i = 1, \ldots, 2L$, with $i$ representing the indexes of the $2L$ sources corresponding to the first user. Then $y_i(m)$ will actually represent the channel coefficients of all the first user's paths. Since we assume that the channel is Rayleigh fading, these signals are gaussian and time correlated. In this case, blind separation of the sources can be achieved by using only second-order statistics. In fact, we can express the Kolmogoroff complexity by coding these signals using principal component analysis [344].

23.3.2 Channel estimation *

Let $\mathbf{y}_i(m) = [y_i(m), \ldots, y_i(m - D + 1)]$ denote the vector consisting of the $D$ last samples of every such source signal $y_i(m)$, $i = 1, \ldots, 2L$. Here $D$ is the number of delayed terms, showing the range of time correlations taken into account when estimating the current symbol. The information contained in any of these sources can be approximated by the code length needed for representing the $D$ principal components, which have variances given by the eigenvalues of the $D \times D$ temporal correlation matrix $\mathbf{C}_i = E[\mathbf{y}_i^T(m)\, \mathbf{y}_i(m)]$ [344].

Since we assume that the transmission paths are mutually independent, the overall entropy of the source is given by summing up the entropies of the principal components. Using the result that the entropy of a gaussian random variable is given, up to a scale factor and an additive constant, by the logarithm of its variance, we get for the entropy of each source signal

$$H(y_i) = \frac{1}{2L} \sum_k \log \sigma_k^2 = \frac{1}{2L} \log \det \mathbf{C}_i \tag{23.15}$$

Inserting this into the cost function (23.13) yields

$$J(\mathbf{y}) = \sum_i \frac{1}{2L} \log \det \mathbf{C}_i - \log |\det \mathbf{W}| \tag{23.16}$$

where $\mathbf{W} = \mathbf{G}^{-1}$ is the separating matrix.
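The reason the $\log \det \mathbf{C}_i$ term in (23.15) and (23.16) favors time-correlated gaussian sources can be seen numerically. The sketch below compares a white gaussian sequence with a strongly correlated one of the same variance, using a window of $D = 2$ samples. The first-order autoregressive model, the value 0.9, the sequence length, the seed, and the function names are illustrative assumptions, not the fading model of the chapter, and the scale factor in front of the logarithm is omitted.

```python
import math
import random


def ar1(n, rho, rng):
    """Unit-variance gaussian AR(1) sequence; rho = 0 gives white noise."""
    y = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        y.append(rho * y[-1] + scale * rng.gauss(0.0, 1.0))
    return y


def logdet_temporal_corr(y):
    """log det of the sample 2 x 2 correlation matrix of [y(m), y(m-1)]."""
    pairs = list(zip(y[1:], y[:-1]))
    n = len(pairs)
    c00 = sum(a * a for a, _ in pairs) / n
    c11 = sum(b * b for _, b in pairs) / n
    c01 = sum(a * b for a, b in pairs) / n
    return math.log(c00 * c11 - c01 * c01)


rng = random.Random(0)
white = ar1(5000, 0.0, rng)    # temporally uncorrelated source
smooth = ar1(5000, 0.9, rng)   # slowly varying source, same variance

ld_white = logdet_temporal_corr(white)
ld_smooth = logdet_temporal_corr(smooth)
print(ld_white, ld_smooth)
```

For unit variance and lag-one correlation $\rho$, the determinant is approximately $1 - \rho^2$, so the correlated sequence has a clearly smaller $\log \det$. In coding terms it is cheaper to describe, which is what the cost function exploits.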
The separating matrix $\mathbf{W}$ can be estimated by using a gradient descent approach for minimizing the cost function (23.16), leading to the update rule [344]

$$\Delta \mathbf{W} = -\mu \frac{\partial \log J(\mathbf{y})}{\partial \mathbf{W}} + \alpha\, \Delta \mathbf{W} \tag{23.17}$$

where $\mu$ is the learning rate and $\alpha$ is the momentum term [172], which can be introduced to avoid getting trapped in a local minimum corresponding to a secondary path.

Let $\mathbf{w}_i^T$ denote the $i$th row vector of the separating matrix $\mathbf{W}$. Since only the correlation matrix $\mathbf{C}_i$ of the $i$th source depends on $\mathbf{w}_i$, we can express the gradient of the cost function by computing the partial derivatives

$$\frac{\partial \log \det \mathbf{C}_i}{\partial w_{ik}}$$

with respect to the scalar elements of the vector $\mathbf{w}_i^T = [w_{i1}, \ldots, w_{iC}]$. For these partial derivatives, one can derive the formula [344]

$$\frac{\partial \log \det \mathbf{C}_i}{\partial w_{ik}} = 2\, \mathrm{trace}\left( \mathbf{C}_i^{-1}\, E\left[ \mathbf{y}_i^T \frac{\partial \mathbf{y}_i}{\partial w_{ik}} \right] \right) \tag{23.18}$$

Since $y_i(m) = \mathbf{w}_i^T \mathbf{r}_m$, we get

$$\frac{\partial \mathbf{y}_i}{\partial w_{ik}} = [\, r_{k,m}, \ldots, r_{k,m-L+1} \,] \tag{23.19}$$

[...]

... separated, and the different independent components can be found one by one, by taking into account the previously estimated components, contained in the subspace spanned by the columns of the matrix i. Since our principal interest lies in the transmission path having the largest power, corresponding usually to the desired user, it is sufficient to estimate the first such independent component. In this case, ... two last samples are taken into account, so that the delay $D = 2$. First, second-order correlations are removed from the data by whitening. This can be done easily in terms of standard principal component analysis, as explained in Chapter 6. After whitening, the subsequent separating matrix will be orthogonal, and thus the second term in Eq. (23.16) disappears, yielding the cost function $J(\mathbf{y})$ with ...
... mixtures. Such methods have been discussed earlier in Chapter 19. The model consists of a linear transformation of both the independent variables (the transmitted symbols) and their delayed version, where the delay is one time unit. For separating mixtures of delayed and convolved independent sources, we use an extension of the natural gradient method based on the information maximization principle [79, ...

... (23.44)

The $3K$-dimensional symbol vector

$$\mathbf{b}_m = [\, b_{1,m-1}, b_{1m}, b_{1,m+1}, \ldots, b_{K,m-1}, b_{Km}, b_{K,m+1} \,]^T \tag{23.45}$$

contains the symbols, and corresponds to the vector $\mathbf{s}$ of independent (or roughly independent) sources. Note that both the code matrix $\mathbf{G}$ and the symbol vector $\mathbf{b}_m$ consist of subsequent triplets corresponding to the early, middle, and late parts.

23.5.2 ICA based receivers

In ... iterations. The complex FastICA algorithm discussed earlier in Section 20.3 and in [47] is a natural choice for the ICA postprocessing method. It can deal with complex-valued data, and extracts one independent component at a time, which suffices in this application. Alternatively, known or estimated symbols can be used for initializing the ICA iteration. This follows from the uncorrelatedness of the symbols, ...

... is a small learning parameter.

4. Compute a new estimate of the symbol vector $\mathbf{b}_m$ from Eq. (23.34).
5. If the matrices $\mathbf{H}_0$ and $\mathbf{H}_1$ have not converged, return to step 2.
6. Apply the sign nonlinearity to each component of the final estimate of the symbol vector $\mathbf{b}_m$. This quantizes the estimated symbols to the bits $+1$ or $-1$.
7. Identify the desired user's symbol sequence which best fits the training sequence. If some ...
... (users' codes) and symbol sequences are unknown makes the separation problem blind. One method for solving this convolutive BSS problem is to consider a feedback architecture. Assuming that the users are independent of each other, we can apply to the convolutive BSS problem the principle of entropy maximization discussed earlier in Section 9.3. The weights of the network can be optimized using the natural ...

$$\ldots\, \mathbf{H}_1)\, \mathbf{q}_m \mathbf{b}_{m-1} \tag{23.36}$$

Here $\mathbf{I}$ is a $K \times K$ unit matrix, and $\mathbf{q}_m = f(\mathbf{b}_m)$ is a nonlinearly transformed symbol vector $\mathbf{b}_m$. The nonlinear function $f$ is typically a sigmoidal or cubic nonlinearity, and it is applied componentwise to the elements of $\mathbf{b}_m$.

1. Initialize randomly the matrices $\mathbf{H}_0$ and ...

[Fig. 23.6: A feedback network for a convolutive CDMA signal model. Blocks: whitening, $z^{-1}$ delay, $\mathbf{H}_0$, $\mathbf{H}_1$; signals: $\mathbf{r}_m$, $\mathbf{v}_m$, $\mathbf{b}_m$.]

... $\ldots, (C - 1)/2\}$. The delays and the path gains were assumed to be known. The signal-to-noise ratio (in the chip matched filter output) varied with respect to the desired user from 5 dB to 35 dB, and 10000 independent trials were made. A constant = 3 was used in the ICA iteration. Figure 23.8 shows the achieved bit-error-rates (BERs) for the methods as a function of the SNR. The performance of RAKE is quite ...

Date posted: 07/11/2013, 09:15
