Hindawi Publishing Corporation
EURASIP Journal on Wireless Communications and Networking
Volume 2008, Article ID 817272, 14 pages
doi:10.1155/2008/817272

Research Article
A Unified Approach to List-Based Multiuser Detection in Overloaded Receivers

Michael Krause, Desmond P. Taylor, and Philippa A. Martin

Department of Electrical and Computer Engineering, University of Canterbury, Private Bag, 4800 Christchurch, New Zealand

Correspondence should be addressed to Michael Krause, michael.krause@elec.canterbury.ac.nz

Received 31 August 2007; Revised 13 December 2007; Accepted 25 February 2008

Recommended by Huaiyu Dai

A wireless communication system is overloaded when the number of transmitted signals exceeds the number of receive antennas. The presence of the resulting cochannel interference (CCI) under overload causes linear detection techniques to perform poorly. We develop a unified approach to the separation and detection of the user signals for an overloaded system using a novel iterative list-based multiuser detector. It combines a linear preprocessor with a nonlinear list detector and approximates optimum joint maximum-likelihood detection at lower complexity. Complexity savings are achieved first by exploiting the spatial separation of the users to mitigate CCI in the preprocessor stage and second by estimating residual CCI in the following list detection stage. The proposed list detection algorithm is applied to receivers with either a uniform circular array or a uniform linear array. The preprocessor is implemented using either a special purpose spatial filter to mitigate the CCI or maximum ratio diversity combining to achieve diversity gain. Simulation results and a complexity analysis indicate that the approach is suitable for practical application.

Copyright © 2008 Michael Krause et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

The use of multiple receive antennas allows significant increases in capacity and reliability of wireless data transfer by exploiting spatial diversity [1–4]. Space-time processing for the detection of the signals from multiple users is now receiving considerable attention. Wireless systems where the number of signals to be resolved exceeds the number of receive antennas are referred to as overloaded systems [5]. Severe cochannel interference (CCI) occurs in such systems. Under overload, the receive antenna array's number of degrees of freedom is exceeded. This causes linear detection techniques to perform poorly [2, 6]. Multiuser detection (MUD) of the user signals is then difficult.

Comprehensive fundamental work on MUD is available in [7]. Here, we restrict ourselves to reviewing literature specifically focused on MUD in the overloaded case. Signal separation and detection in overloaded environments has been shown to be possible by exploiting the response differences among the users' received cochannel signals [4]. In [8, 9], maximum likelihood approaches to blind MUD in nonoverloaded receivers with antenna arrays were studied. This work was extended to the overloaded case in [5, 10], which showed that under overload, linear detection algorithms suffer severe degradation and that joint maximum likelihood (JML) detection is optimum. JML requires an exhaustive search over all possible symbol combinations. Due to the search complexity, JML is not feasible for most applications. Therefore, reduced complexity algorithms that achieve near JML performance are of significant interest. This is particularly important under overloaded conditions.

Several reduced complexity algorithms have been developed. In [6, 10–14], a high-altitude receiver with symbol-synchronous signals impinging on a circular antenna array is considered. This is often referred to as the “base station in the sky” model. For this model it has been shown that a preprocessor at the receiver can improve performance of reduced complexity detection [5, 15]. The work of [6, 11–14] employs a spatial filter as a preprocessor to mitigate CCI. It achieves no diversity gain since it employs beam forming. The detectors in [11–13] use either successive or parallel interference cancellation following preprocessing. Compared to JML, complexity is low but the performance is poor if the user signals have similar energies.

In contrast, spatially reduced search joint detection (SRSJD) [6], when used with a circular array, achieves near JML performance. It employs a beam former as a preprocessor and reduces complexity by searching a reduced-state search trellis, constructed over the subset of signals with “dominant” energy in each beam. (The term “dominant” refers to a user signal that has significantly more energy than other signals.) The search relies on delayed-decision feedback sequence estimation (DDFSE) [16] and is efficiently done using the Viterbi algorithm [17]. SRSJD requires the users' overall channel matrix as seen at the receiver to have a “trellis-oriented” form, which is achieved by only a few array geometries such as circular arrays. (A matrix is said to be “trellis-oriented” if it has a diagonal banded structure.)

Recently, we have developed two iterative list-based parallel detection algorithms for use under overloaded conditions. These employ list feedback of the best estimates [14, 18]. One, known as parallel symbol detection with reduced complexity interference estimation (PSD-RCIE) [14], uses the linear beam former of [6] as its preprocessor. The second, known as parallel symbol detection with parallel interference cancellation (PSD-PIC) [18], uses maximum ratio combining (MRC) in the preprocessing stage. A linear spatial beam former employed by a receiver with an M-element array can at most cancel M − 1 interfering signals [19] and provides no diversity gain. On the other hand, MRC maximizes the instantaneous signal-to-noise ratio (SNR) at the combiner output [20] but fails to eliminate CCI under overload. The residual CCI level increases in both cases with the receiver overload factor.

In the detection stage, PSD-RCIE explicitly estimates the residual CCI based on a trellis representation and is hence restricted to trellis-oriented array geometries. PSD-PIC does not have this limitation. Following MRC, it performs iterative parallel interference cancellation (PIC) coupled with joint list-based detection of the user symbols. Both algorithms use estimates of the residual CCI to cancel interference. In both instances, a list of the most likely symbols in each interval is obtained by searching over the signal symbols with “dominant” energy. This is done for each received signal and creates a list for each. These per-signal lists are combined into a global list which is fed back to obtain improved symbol estimates. After several iterations, the global list is output by the detector. The iterative approach has the advantage that, even with inaccurate estimates of the residual CCI, symbol detection is possible.

In this paper, we develop a unified list-based, iterative approach to MUD in overloaded receivers that includes the PSD-RCIE and PSD-PIC approaches we proposed in [14, 18] as special cases. The algorithm is here applied to receivers with either a uniform circular array (UCA) or a uniform linear array (ULA) but can easily be extended to an arbitrary geometry. Both a linear spatial prefilter and an MRC-based diversity combiner are considered as preprocessors. Performance is evaluated using Monte Carlo simulation. The results show that our MUD approach outperforms existing reduced complexity algorithms and approximates JML at lower complexity, especially under heavy overload.

In Section 2, the system model and the receiver structure are introduced. Spatial filtering and diversity combining are discussed in Section 3. Symbol detection is described in Section 4 and performance is evaluated in Section 5. Complexity is analyzed in Section 6. Conclusions are drawn in Section 7.

2. SYSTEM MODEL AND RECEIVER STRUCTURE

Consider a single-input multiple-output (SIMO) communication system with an $M$-element arbitrary receive array and $D$ single-antenna users. The receiver load factor is $f = D/M$, where $f > 1$ under overload. The $D$ users are assumed to transmit QAM signals which are incident on all receive antennas. For simplicity, we consider symbol-synchronous signals with no intersymbol interference present in the channel. (The extension to the symbol nonsynchronous case is straightforward.)

Figure 1 shows a model of the proposed receiver. At each antenna, the received signal is passed through a filter matched to the transmitted pulse shape and then sampled at symbol rate to give the $M \times 1$ received signal vector

$$\mathbf{x} = \mathbf{A}\mathbf{s} + \mathbf{z}, \tag{1}$$

where $\mathbf{s} = [s_1\; s_2\; \cdots\; s_D]^T$ is the $D \times 1$ symbol vector containing the user symbols $s_d$. Each user symbol $s_d$ is independent and uniformly drawn from an alphabet $\mathcal{A}$. The vector $\mathbf{s}$ is multiplied by the $M \times D$ composite array response matrix $\mathbf{A} = [\mathbf{a}[1]\; \mathbf{a}[2]\; \cdots\; \mathbf{a}[D]]$, with $\mathbf{a}[d]$ being the $M \times 1$ array steering vector for the $d$th user. (In a more complex channel, the matrix $\mathbf{A}$ also includes the channel response.)
We assume that $\mathbf{A}$ is computed by a channel estimator which estimates the direction of arrival for each of the $D$ signals. The quantity $\mathbf{z}$ is an $M \times 1$ temporally uncorrelated noise vector with zero mean and autocorrelation $\boldsymbol{\Phi}_{zz} = E[\mathbf{z}\mathbf{z}^H]$, where $E[\cdot]$ denotes expectation. For spatially uncorrelated noise, $\boldsymbol{\Phi}_{zz} = \sigma_z^2\mathbf{I}$, where $\sigma_z^2$ denotes the noise variance and $\mathbf{I}$ is the $M \times M$ identity matrix. Throughout this paper, any time dependence in equations is dropped for convenience.

Figure 1: Receiver structure.

Figure 2: System model for a ULA and a UCA with $M = 4$ elements and $D > M$ single-antenna users.

2.1. Uniform circular array

The UCA has isotropic antenna elements equispaced on a circle with radius $R$ as shown in Figure 2. Following [21], the array steering vector for each of the $D$ signals is denoted $\mathbf{a}(\theta_d) = [a_1\; a_2\; \cdots\; a_M]^T$ with components given by

$$a_m = \exp\left\{-j\,\frac{2\pi R}{\lambda}\cos\left(\frac{\pi}{2} - \theta_d - \phi_m\right)\sin\epsilon_d\right\}, \quad d = 1, 2, \ldots, D, \tag{2}$$

where $\theta_d$ is the estimated azimuthal angle of arrival (AOA), $\epsilon_d$ is the elevation (or depression) angle, $\lambda$ is the wavelength at the carrier frequency, and $\phi_m = 2\pi(m-1)/M$ is the angle of the $m$th element in azimuth [22]. For simplicity, only azimuth is considered ($\epsilon_d = 90^\circ$). However, the results can easily be extended to three dimensions.

2.2. Uniform linear array

In the ULA configuration, isotropic antenna elements are located in a straight line with equal spacing $B$ between the elements, as in Figure 2 [23]. The array steering vector for each signal is again denoted $\mathbf{a}(\theta_d) = [a_1\; a_2\; \cdots\; a_M]^T$, but with components given by [21]

$$a_m = \exp\left\{-j\,\frac{2\pi B(m-1)\sin\theta_d}{\lambda}\right\}, \quad d = 1, 2, \ldots, D. \tag{3}$$

3. PREPROCESSOR

The estimated array response matrix $\mathbf{A}$ and the received signal vector $\mathbf{x}$, following matched filtering, are input to a preprocessor as shown in Figure 1. It exploits the spatial separation of the users to mitigate CCI effects so as to enable complexity reduction in the subsequent MUD stage. We will consider two approaches, but we first find an alternate form of the JML criterion that lends itself to suboptimal approximation. If no intersymbol interference is present, JML leads to the symbol-by-symbol detector given by [10]

$$\hat{\mathbf{s}} = \arg\min_{\mathbf{s}\in\mathcal{A}^D} (\mathbf{x} - \mathbf{A}\mathbf{s})^H \boldsymbol{\Phi}_{zz}^{-1} (\mathbf{x} - \mathbf{A}\mathbf{s}), \tag{4}$$

where $(\cdot)^H$ denotes Hermitian transpose. The minimization requires a search over all $|\mathcal{A}|^D$ possible transmit symbol combinations. The resulting complexity mandates approximation.

The key to approximating (4) is to find a transform that maps the $M \times 1$ received vector $\mathbf{x}$ into the $D \times 1$ vector $\mathbf{y} = [y[1], y[2], \ldots, y[D]]^T$ and the $M \times D$ array response matrix $\mathbf{A}$ into a $D \times D$ square matrix $\mathbf{H} = [\mathbf{h}[1], \mathbf{h}[2], \ldots, \mathbf{h}[D]]^T$, where $y[d]$ is the $d$th component of $\mathbf{y}$ and $\mathbf{h}[d] = [h_{d1}, h_{d2}, \ldots, h_{dD}]$ is the corresponding $1 \times D$ row vector of $\mathbf{H}$ with elements $h_{du}$. We seek a transform that maps

$$\mathbf{x}_{(M\times 1)} \longrightarrow \mathbf{y}_{(D\times 1)}, \qquad \mathbf{A}_{(M\times D)} \longrightarrow \mathbf{H}_{(D\times D)}. \tag{5}$$

We call $\mathbf{y}$ the transformed receive vector and $\mathbf{H}$ the user channel matrix. There are two interpretations possible for the transform of (5), either spatial filtering or diversity combining. (Note that both are essentially projection operations.) In each case, the solution is a $D \times M$ complex weight matrix $\mathbf{W}$ such that

$$\mathbf{y} = \mathbf{W}\mathbf{x}. \tag{6}$$

3.1. Spatial filtering

A spatial filter exploits the fact that user signals incident on the antenna array with greater spread in AOA interfere with each other less than signals that are closely spaced in AOA. CCI from users reasonably widely spaced in AOA can thus be effectively reduced. This is essentially a beam forming operation.

The matrix $\mathbf{W}$ can be derived from the JML criterion of (4) by choosing $\mathbf{y}$ and $\mathbf{H}$ such that [6]

$$\mathbf{H}^H\mathbf{H} = \mathbf{A}^H\boldsymbol{\Phi}_{zz}^{-1}\mathbf{A}, \qquad \mathbf{H}^H\mathbf{y} = \mathbf{A}^H\boldsymbol{\Phi}_{zz}^{-1}\mathbf{x}. \tag{7}$$

This satisfies the mapping of (5) and yields the JML detector in the form

$$\hat{\mathbf{s}} = \arg\min_{\mathbf{s}\in\mathcal{A}^D} \|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2 = \arg\min_{\mathbf{s}\in\mathcal{A}^D} \sum_{d=1}^{D} \big|y[d] - \mathbf{h}[d]\mathbf{s}\big|^2 = \arg\min_{\mathbf{s}\in\mathcal{A}^D} \sum_{d=1}^{D} \Big|y[d] - \sum_{u=1}^{D} h_{du}s_u\Big|^2. \tag{8}$$

From (7), we find $\mathbf{W} = (\mathbf{H}^H)^{\dagger}\mathbf{A}^H\boldsymbol{\Phi}_{zz}^{-1}$, where $(\cdot)^{\dagger}$ denotes the pseudoinverse. The matrix $\mathbf{W}$ is a trellis-oriented multiple-input multiple-output (MIMO) beam former since each row places a beam in the direction of only one transmitted signal [6]. It increases the number of observation samples and acts as a noise whitening interference rejection filter. The elements of $\mathbf{y}$ denote the received signal in each of the $D$ beams and each row of $\mathbf{H}$ shows the energy contribution to the received signal in the $d$th beam.

Figure 3(a) shows the form of $\mathbf{H}$ for a receiver employing a spatial filter as a preprocessor. The receiver has an $M = 5$-element UCA front end with radius $R = 0.2\lambda$. Data is received from $D = 6$ equal energy users uniformly spaced in AOA. We see that most of the energy is concentrated on or near the main diagonal of $\mathbf{H}$, resulting in a banded structure, where in each row only a few elements contain most of the energy.

3.2. Diversity combining

In contrast to (7), if we consider (5) from the viewpoint of diversity combining, we seek to combine the multiple replicas of the received information-bearing signal in an advantageous way. MRC is the classical and optimal [24] diversity combining technique. The combiner output is a weighted linear combination of the signal replicas. For MRC with perfect channel estimation, the optimum weight matrix in (6) is $\mathbf{W} = \mathbf{A}^H$ [24]. MRC tries to map the receive vector $\mathbf{x}$ into $\mathbf{y}$ such that each user has maximum SNR in one of the components of $\mathbf{y}$. Defining the channel matrix $\mathbf{H}$ such that

$$\mathbf{H} = \mathbf{A}^H\mathbf{A} \tag{9}$$

allows us to write the JML detector as in (8), with the difference being the definitions of $\mathbf{W}$ and $\mathbf{H}$ in the two cases. The row elements of $\mathbf{H}$ denote the energy contribution from the $D$ users to the received signal in which the SNR of the corresponding user is maximized.

In Figure 3(c), the form of $\mathbf{H}$ is illustrated for a receiver using MRC as a preprocessor. The antenna array is an $M = 5$-element ULA. Again $D = 6$ users transmit equal energy signals. The users are uniformly spaced within the array's view angle defined as $\theta_{\max} = \pm 60^\circ$. Hence the users' azimuth AOAs are $\theta_d = \{\pm 60^\circ, \pm 36^\circ, \pm 12^\circ\}$ with $d = 1, 2, \ldots, 6$. The antenna elements are spaced at distance $B = 3\lambda$ apart. In contrast to Figure 3(a), the energy is not uniformly concentrated along the main diagonal of $\mathbf{H}$, as there are elements with “high” energy further away from the main diagonal. (At this stage, the term “high” refers to an intuitive definition of matrix elements with significant energy. The mathematical definition is given later.) Thus $\mathbf{H}$ does not have a banded structure and is not trellis-oriented.

3.3. Spatial filtering versus diversity combining

The beam forming spatial filter works best if relatively closely spaced antenna elements are available to form beam patterns. To ensure sufficient correlation, the element spacing should be within half a wavelength at the carrier frequency. This follows from the Nyquist sampling theorem [25]. We note that a linear spatial filter cannot cancel more than $D = M - 1$ interfering cochannel users (see, e.g., [19]). In overloaded receivers, the advantage of beam forming tends to be lost as there will still be significant CCI.

In contrast, diversity combining requires little or no cross-correlation between the antenna elements. If a signal at one element goes through a deep fade, it is then unlikely that the other elements encounter a deep fade for the same signal at the same time. Hence combining the signals from different elements can improve receive performance, as there is nearly always good reception at one of them. Antenna spacing is usually on the order of several carrier frequency wavelengths and does not satisfy the Nyquist sampling theorem. As a result, spatial aliasing and grating lobes occur [26] when the array properties are considered. This is offset by the diversity gain attained. We will see that our unified MUD algorithm works well with both types of preprocessors.

3.4. Sparsity pattern

The two examples of the channel matrix $\mathbf{H}$ in Figures 3(a) and 3(c) show that only a few elements in each row contain most of the signal energy. Therefore, we can derive a sparsity matrix, $\mathbf{P}$, that contains unity entries for elements with “high” energy and zeros for elements with “low” energy [6]. (We describe the selection of matrix elements with “high” and “low” energy later. Here it is only an intuitive definition.)
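
To make the ULA model of (3), the MRC mapping $\mathbf{W} = \mathbf{A}^H$, $\mathbf{H} = \mathbf{A}^H\mathbf{A}$ of (9), and the exhaustive JML search of (4) concrete, the following is a minimal sketch in plain Python. It assumes spatially white noise, so that (4) reduces to minimizing $\|\mathbf{x} - \mathbf{A}\mathbf{s}\|^2$, and it uses a noise-free BPSK example with the $M = 5$, $D = 6$, $B = 3\lambda$ geometry of Figure 3(c). All function names are illustrative and are not taken from the paper.

```python
import cmath
import itertools
import math


def ula_steering(theta_deg, num_elements, spacing_wl):
    """Steering vector of (3); spacing_wl is B / lambda."""
    phase_step = 2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    # The loop index m plays the role of (m - 1) in (3).
    return [cmath.exp(-1j * phase_step * m) for m in range(num_elements)]


def array_matrix(aoas_deg, num_elements, spacing_wl):
    """M x D matrix A whose d-th column is the steering vector of user d."""
    cols = [ula_steering(t, num_elements, spacing_wl) for t in aoas_deg]
    return [[col[m] for col in cols] for m in range(num_elements)]


def hermitian(mat):
    """Conjugate transpose of a list-of-lists matrix."""
    return [[mat[r][c].conjugate() for r in range(len(mat))]
            for c in range(len(mat[0]))]


def matvec(mat, vec):
    return [sum(a * v for a, v in zip(row, vec)) for row in mat]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mrc_preprocess(A, x):
    """MRC: W = A^H, so y = A^H x and H = A^H A as in (6) and (9)."""
    W = hermitian(A)
    return matvec(W, x), matmul(W, A)


def jml_detect(A, x, alphabet):
    """Exhaustive search of (4) for white noise: |alphabet|**D candidates."""
    best, best_cost = None, float("inf")
    for s in itertools.product(alphabet, repeat=len(A[0])):
        cost = sum(abs(xi - ri) ** 2 for xi, ri in zip(x, matvec(A, s)))
        if cost < best_cost:
            best, best_cost = s, cost
    return best


# Overloaded example: D = 6 BPSK users, M = 5 antennas, no noise added.
aoas = [-60, -36, -12, 12, 36, 60]
A = array_matrix(aoas, num_elements=5, spacing_wl=3.0)
s_true = (1, -1, -1, 1, 1, -1)
x = matvec(A, s_true)
y, H = mrc_preprocess(A, x)
s_hat = jml_detect(A, x, alphabet=(-1, 1))
```

Even in this small case the search visits $2^6 = 64$ candidates, and the count grows as $|\mathcal{A}|^D$, which is the complexity the list-based detector aims to avoid.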
The sparsity matrix is a $D \times D$ matrix, $\mathbf{P} = [\mathbf{p}[1], \mathbf{p}[2], \ldots, \mathbf{p}[D]]^T$, where each element $p_{du}$ corresponds to the element $h_{du}$ in $\mathbf{H}$ for $d, u = 1, \ldots, D$. Its use allows reduced complexity approximations to the JML detector of (4). The sparsity matrices for Figures 3(a) and 3(c) are shown in Figures 3(b) and 3(d), respectively.

We first define enumeration sets, $U_e[d]$, which contain the column indices of the unity elements in each row $\mathbf{p}[d] \in \mathbf{P}$. (As in [6], the term enumeration set is used because the detection algorithm enumerates over all combinations of user symbols $\{s_u \mid u \in U_e[d]\}$.) The indices in $U_e[d]$ indicate users with “high” energy. For example, in the first row of $\mathbf{H}$ in Figure 3(a), $U_e[1] = \{6, 1, 2\}$ and $\bar{U}_e[1] = \{3, 4, 5\}$ are the column user indices of elements with “high” and “low” energy, respectively. Hence the corresponding sparsity pattern in Figure 3(b) is $\mathbf{p}[1] = [1, 1, 0, 0, 0, 1]$.

Figure 3: (a) Spectral square root $(\mathbf{H}^H\mathbf{H})^{1/2}$ of $\mathbf{H}$ and (b) sparsity matrix $\mathbf{P}$ for a 5-element UCA. The users are uniformly spaced in AOA. (c) Spectral square root $(\mathbf{H}^H\mathbf{H})^{1/2}$ of $\mathbf{H}$ and (d) sparsity matrix $\mathbf{P}$ for a 5-element ULA. The user AOAs are uniform within $\theta_{\max} = \pm 60^\circ$. There are $D = 6$ equal energy users. Elements with “1” in $\mathbf{P}$ are obtained by using the SEAIR and SSSER criteria with thresholds $T_1$ = […] and $T_2 = 0.1$, respectively. The per-row SEIR values (dB) listed beside the sparsity matrices are 25.5, 19.7, 19.7, 25.5, 19.7, 19.7 in (b) and 11.5, 10.2, 13, 13, 10.2, 11.5 in (d).

The quality of the sparsity matrix found depends on the criterion used to choose its elements. A so-called desired energy to interference ratio (DEIR) criterion was used in [6, 14]. In [18], the strongest energy to interference ratio (SEIR) was used. (Note that the SEIR [18] and DEIR [6] criteria are equivalent if, in the $d$th row $\mathbf{h}[d] \in \mathbf{H}$, the diagonal element $h_{dd}$ has the most signal power, $|h_{dd}|^2 = \max_{1\le u\le D}|h_{du}|^2$.) Both use a threshold which, if chosen poorly, erroneously treats signals with low energy as high-energy signals and results in higher detection complexity than necessary for a given level of performance. A poor choice can also lead to considering strong signals as low-energy signals, which results in lower complexity at the cost of poorer overall performance.

Here we present a different approach to the construction of $\mathbf{P}$ that appears robust over a wider range of cochannel users than the DEIR and SEIR criteria. It is based on two empirically chosen thresholds $T_1$ and $T_2$ and determines the complexity-performance tradeoff of subsequent MUD. Because this approach considers energy separation of the preprocessed user signals, it is limited to scenarios where sufficient separation can be achieved, meaning that it tends to perform poorly if, after preprocessing, the user signals have too similar energies. As a result, either too few or too many signals with high energy would be selected. This can occur under extreme overload when using a linear preprocessor. The optimum choice of $T_1$ and $T_2$ is an open research topic. In general, the choice depends on the desired complexity/performance tradeoff, the receive antenna geometry, the type of preprocessor, the number of receive antennas $M$, and the number of cochannel users $D$.

We first compute the signal energy to average interference ratio (SEAIR) and use the empirical threshold $T_1$ to ensure sufficient separation between high-energy and low-energy signals. The SEAIR is defined as

$$\mathrm{SEAIR}[d,u] = \frac{E\big[|h_{du}s_u|^2\big]}{\dfrac{1}{|\bar{U}_e[d]|}\,E\Big[\sum_{v\in\bar{U}_e[d]}|h_{dv}s_v|^2\Big]} = \frac{|h_{du}|^2}{\dfrac{1}{|\bar{U}_e[d]|}\sum_{v\in\bar{U}_e[d]}|h_{dv}|^2}, \tag{10}$$

where the numerator represents a high-energy signal and the denominator is the average interference energy, with $|\bar{U}_e[d]| = D - |U_e[d]|$ denoting the number of signals outside the enumeration set $U_e[d]$. The quantities $s_u$ and $s_v$ are the user symbols corresponding to $h_{du}$ and $h_{dv}$, respectively. We find the column indices $u \in U_e[d]$ by choosing

$$U_e[d] = \Big\{\arg\max_{1\le u\le D}{}^{(\xi)}\,|h_{du}|^2\Big\}, \quad 1 \le \xi \le \rho[d], \tag{11}$$

where $\max^{(\xi)}$ denotes the $\xi$th greatest value and $\rho[d]$ is the number of column indices considered in the $d$th row. Computation stops if the $\xi$th SEAIR value is below the predefined threshold $T_1$, or when all $\xi = \rho[d]$ indices have been processed. (Ideally, we would choose $\rho[d] = D$ to allow all signals to be considered. The choice $\rho[d] < D$ leads to an upper bound on the complexity of the detection algorithm, which is desirable for practical systems.)

The selected indices $u$ are then retained in the selected enumeration set $U_e[d]$ only if they fulfill a second criterion, specified by the signal to strongest signal energy ratio (SSSER), defined as

$$\mathrm{SSSER}[d,u] = \frac{E\big[|h_{du}s_u|^2\big]}{E\big[\max_{1\le v\le D}|h_{dv}s_v|^2\big]} = \frac{|h_{du}|^2}{\max_{1\le v\le D}|h_{dv}|^2}, \tag{12}$$

where the numerator represents the energy of the $u$th user signal in $U_e[d]$ and the denominator is the energy of the strongest signal in $\mathbf{h}[d]$. We compute the SSSER for all selected column indices $u \in U_e[d]$. The second predefined threshold $T_2$ is used to remove indices $u$ with $\mathrm{SSSER}[d,u] < T_2$ from the set $U_e[d]$. We then construct the sparsity pattern $\mathbf{p}[d] \in \mathbf{P}$ by assigning unity entries to all users corresponding to indices $u \in U_e[d]$ and zero entries for those where $u \in \bar{U}_e[d]$. From $\mathbf{p}[d] \in \mathbf{P}$, we obtain

$$\tau[d] = \{s_u \mid u \in U_e[d]\}, \qquad \omega[d] = \{s_u \mid u \in \bar{U}_e[d]\}, \tag{13}$$

as the sets of high- and low-energy user symbol vectors, respectively. The low-energy sets, $\omega[d]$, are referred to as interfering symbol sets, since they correspond to residual CCI which degrades the detection of the high-energy symbol sets, $\tau[d]$.

For the examples in Figure 3, the SEAIR and SSSER thresholds were found empirically and set to $T_1$ = […] and $T_2 = 0.1$, respectively. These yield the sets $\tau[1] = \{s_6, s_1, s_2\}$ and $\omega[1] = \{s_3, s_4, s_5\}$ in Figure 3(a), and $\tau[1] = \{s_1, s_3\}$ and $\omega[1] = \{s_2, s_4, s_5, s_6\}$ in Figure 3(c). Similar results are obtained for all other values of $d$. Note that different numbers of users and receive antennas, $D$ and $M$, as well as different antenna array geometries and element spacing may change the empirically determined thresholds $T_1$ and $T_2$. However, once $T_1$ and $T_2$ have been set for a given $M$, the algorithm appears robust over a wide range of $D$.

4. SYMBOL DETECTION

We now describe the proposed list-based MUD algorithm, the so-called parallel detection with interference estimation (PD-IE) algorithm. As shown in Figure 1, it operates on the preprocessor output and takes the transformed receive vector $\mathbf{y}$, the channel matrix $\mathbf{H}$, and the estimated sparsity matrix $\mathbf{P}$ as inputs. A structural block diagram is shown in Figure 4. It uses $Q$ iterations to compute an ordered global list of symbol vectors $S = \{\hat{\mathbf{s}}^{(1)}, \hat{\mathbf{s}}^{(2)}, \ldots, \hat{\mathbf{s}}^{(L)}\}$, where $\hat{\mathbf{s}}^{(l)}$ is the $l$th $D \times 1$ symbol vector in the list. (The list $S$ is ordered from most to least likely.)

The rows of the inputs $\mathbf{y}_{(D\times 1)}$, $\mathbf{H}_{(D\times D)}$, and $\mathbf{P}_{(D\times D)}$ are first reordered to produce $\mathbf{y}'_{(D\times 1)}$, $\mathbf{H}'_{(D\times D)}$, and $\mathbf{P}'_{(D\times D)}$, as indicated by the row ordering block in Figure 4. (Reordering the input quantities improves performance in subsequent detection stages.)
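
The two-threshold construction of $\mathbf{P}$ in Section 3.4 can be sketched as follows. This is one possible reading of (10)–(13), not the authors' implementation: it assumes that, when the $\xi$th strongest element of a row is tested, the complement set $\bar{U}_e[d]$ in (10) consists of all weaker elements of that row, and that an element with no weaker elements left is accepted. The function names and the threshold value `t1 = 2.0` used in the example are hypothetical, since the paper's value of $T_1$ is not available here.

```python
def enumeration_set(h_row, t1, t2, rho=None):
    """Return the 0-based column indices kept in U_e[d] for one row of H."""
    energies = [abs(h) ** 2 for h in h_row]
    num_users = len(energies)
    limit = num_users if rho is None else min(rho, num_users)
    # Column indices sorted from strongest to weakest, as in (11).
    order = sorted(range(num_users), key=lambda u: energies[u], reverse=True)

    selected = []
    for xi in range(limit):
        u = order[xi]
        weaker = order[xi + 1:]  # assumed complement set for the SEAIR test
        if weaker:
            avg_interference = sum(energies[v] for v in weaker) / len(weaker)
            # SEAIR of (10); stop at the first index that falls below T1.
            if avg_interference > 0 and energies[u] / avg_interference < t1:
                break
        selected.append(u)

    # SSSER of (12): drop indices that are too weak relative to the strongest.
    strongest = energies[order[0]]
    return [u for u in selected
            if strongest > 0 and energies[u] / strongest >= t2]


def sparsity_matrix(H, t1, t2, rho=None):
    """Build P row by row: 1 for indices in U_e[d], 0 otherwise."""
    P = []
    for h_row in H:
        kept = set(enumeration_set(h_row, t1, t2, rho))
        P.append([1 if u in kept else 0 for u in range(len(h_row))])
    return P


# Hypothetical banded row: strong entries for users 1, 2, and 6.
row = [1.0, 0.9, 0.2, 0.2, 0.2, 0.8]
pattern = sparsity_matrix([row], t1=2.0, t2=0.1)[0]
```

With these illustrative amplitudes the resulting pattern has the same shape as $\mathbf{p}[1] = [1, 1, 0, 0, 0, 1]$ quoted for Figure 3(b). Passing `rho` smaller than $D$ caps the size of each enumeration set, which bounds the later search over $|\mathcal{A}|^{|\tau[d]|}$ combinations.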
The row ordering uses the SEIR criterion [18], which is defined as

$$\mathrm{SEIR}[d] = \frac{E\big[\max_{1\le u\le D}|h_{du}s_u|^2\big]}{E\Big[\sum_{v\in\bar{U}_e[d]}|h_{dv}s_v|^2\Big]} = \frac{\max_{1\le u\le D}|h_{du}|^2}{\sum_{v\in\bar{U}_e[d]}|h_{dv}|^2}. \tag{14}$$

The numerator denotes the signal power of the strongest user in the $d$th row $\mathbf{h}[d] \in \mathbf{H}$, and the denominator is the overall power of the signals outside the enumeration set $U_e[d]$. The reordering is in order of decreasing SEIR. In Figures 3(c) and 3(d), the rows $\{1, 2, 3, 4, 5, 6\}$ of $\mathbf{y}$, $\mathbf{H}$, and $\mathbf{P}$ become rows $\{3, 5, 1, 2, 6, 4\}$ of $\mathbf{y}'$, $\mathbf{H}'$, and $\mathbf{P}'$, respectively.

4.1. Symbol estimation

The key to successful detection in overloaded receivers is to estimate and cancel residual CCI. We use $D$ parallel detection branches as shown in Figure 4. Each branch corresponds to one user and performs CCI cancellation and symbol estimation. Figure 5 shows two implementations. In Figure 5(a), residual CCI is estimated explicitly using the trellis implementation as we proposed in [14]. In contrast, Figure 5(b) illustrates joint detection as we described in [18]. (We use the term “joint detection” because the user symbols and the residual CCI are jointly estimated using PIC techniques.)
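
The SEIR-based row ordering of (14) can be sketched as below. The helper names are illustrative, and ties are assumed to be broken in favor of the lower row index, which the text does not specify. The last lines reuse the rounded per-row SEIR values (in dB) read from the Figure 3(d) panel as scores.

```python
def seir(h_row, p_row):
    """SEIR of (14): strongest element energy over energy outside U_e[d]."""
    energies = [abs(h) ** 2 for h in h_row]
    interference = sum(e for e, p in zip(energies, p_row) if p == 0)
    if interference == 0:
        return float("inf")  # no residual interference in this row
    return max(energies) / interference


def descending_order(scores):
    """0-based row indices sorted by decreasing score (stable for ties)."""
    return sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)


def new_positions(order):
    """1-based position that each original row takes after reordering."""
    positions = [0] * len(order)
    for new_pos, old_row in enumerate(order, start=1):
        positions[old_row] = new_pos
    return positions


def reorder_rows(rows, order):
    """Apply the ordering to y, H, or P (each given as a list of rows)."""
    return [rows[d] for d in order]


# Small synthetic example: each row keeps only its first element in U_e[d].
H_demo = [[1.0, 0.5, 0.5], [1.0, 0.1, 0.1], [1.0, 0.2, 0.3]]
P_demo = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
order_demo = descending_order([seir(h, p) for h, p in zip(H_demo, P_demo)])

# Rounded SEIR values (dB) for rows 1..6 of the ULA example in Figure 3(d).
seir_db_ula = [11.5, 10.2, 13, 13, 10.2, 11.5]
positions_ula = new_positions(descending_order(seir_db_ula))
```

For the Figure 3(d) values, `positions_ula` reproduces the mapping $\{3, 5, 1, 2, 6, 4\}$ quoted in the text, under the stated tie-breaking assumption.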
Both implementations in Figure 5 include identical high-energy symbol estimators and take $\mathbf{y}'$, $\mathbf{H}'$, $\mathbf{P}'$, and the tentative global list $S$ as inputs. In addition, $\mathbf{y}$, $\mathbf{H}$, and $\mathbf{P}$ are needed for estimation of the residual CCI in Figure 5(a).

Each of the $D$ symbol estimators outputs a branch list $S_{\mathrm{br}}[d] = \{\hat{\mathbf{s}}_{\mathrm{br}}^{(1)}[d], \hat{\mathbf{s}}_{\mathrm{br}}^{(2)}[d], \ldots, \hat{\mathbf{s}}_{\mathrm{br}}^{(L)}[d]\}$ of $(D \times 1)$ symbol vectors $\hat{\mathbf{s}}_{\mathrm{br}}^{(k)}[d]$, where $k = 1, 2, \ldots, L$. Each vector $\hat{\mathbf{s}}_{\mathrm{br}}^{(k)}[d]$ contains estimates of the high- and low-energy symbol sets $\tau[d]$ and $\omega[d]$, respectively, and can be decomposed into

$$\hat{\mathbf{s}}_{\mathrm{br}}^{(k)}[d] = \big\{\hat{\tau}^{(k)}[d],\, \hat{\omega}^{(k)}[d]\big\}, \tag{15}$$

where $\hat{\omega}^{(k)}[d]$ and $\hat{\tau}^{(k)}[d]$ are the estimated low- and high-energy user symbol sets in the $d$th detection branch. (The symbol sets $\tau[d]$ and $\omega[d]$ for each branch list $S_{\mathrm{br}}[d]$ are derived from $\mathbf{p}'[d] \in \mathbf{P}'$.) We consider the low-energy sets $\hat{\omega}^{(k)}[d]$ as residual CCI and obtain them by an interference estimation process. The high-energy sets $\hat{\tau}^{(k)}[d]$ are found by an exhaustive search over all possible $|\mathcal{A}|^{|\tau[d]|}$ symbol combinations $\tau[d]$, where $|\tau[d]| = |U_e[d]|$ is the number of signals in the $d$th enumeration set $U_e[d]$. This is done by the high-energy symbol estimators shown in Figure 5.

Each such estimator takes the list $\tilde{W}[d] = \{\tilde{\omega}^{(1)}[d], \tilde{\omega}^{(2)}[d], \ldots, \tilde{\omega}^{(I_d)}[d]\}$ and the quantities $y'[d] \in \mathbf{y}'$, $\mathbf{h}'[d] \in \mathbf{H}'$, and $\mathbf{p}'[d] \in \mathbf{P}'$ as inputs. The list $\tilde{W}[d]$ contains estimates of the residual CCI, with the tilde notation $\tilde{(\cdot)}$ denoting nonredundant list elements. (Storing only the nonredundant elements $\tilde{\omega}^{(i)}[d] \in \tilde{W}$, $i = 1, 2, \ldots, I_d$, ensures that the complexity of high-energy symbol estimation is minimal.) The list size is $I_d$ with $1 \le I_d \le L$. We search over all high-energy symbol sets $\tau[d]$ and compute the Euclidean error metric

$$e^{(i,j)}[d] = \big|y'[d] - \hat{y}^{(i,j)}[d]\big|^2, \tag{16}$$

where $y'[d]$ is the $d$th component of $\mathbf{y}'$ and $\hat{y}^{(i,j)}[d]$ is the $(i,j)$th “candidate component” used as an approximation of $y'[d]$. Values for $\hat{y}^{(i,j)}[d]$ are computed as the sum of an “enumeration component” $y_e^{(j)}[d]$ and an “interference component” $y_{\mathrm{if}}^{(i)}[d]$ as

$$\hat{y}^{(i,j)}[d] = y_e^{(j)}[d] + y_{\mathrm{if}}^{(i)}[d], \qquad y_e^{(j)}[d] = \sum_{u\in U_e[d]} h'_{du}\,s_u^{(j)}, \qquad y_{\mathrm{if}}^{(i)}[d] = \sum_{u\in\bar{U}_e[d]} h'_{du}\,\tilde{s}_u^{(i)}, \tag{17}$$

where $h'_{du}$ is an element of $\mathbf{h}'[d] \in \mathbf{H}'$. The values $s_u^{(j)}$ for $y_e^{(j)}[d]$ are drawn from the $j$th high-energy symbol set $\tau^{(j)}[d]$ with $j = 1, 2, \ldots, |\mathcal{A}|^{|\tau[d]|}$. The values $\tilde{s}_u^{(i)}$ in the interference component $y_{\mathrm{if}}^{(i)}[d]$ are estimates of the residual CCI, drawn from the $i$th list element $\tilde{\omega}^{(i)}[d] \in \tilde{W}[d]$. We then find the vectors $\hat{\mathbf{s}}_{\mathrm{br}}^{(k)}[d] \in S_{\mathrm{br}}[d]$ by choosing symbol values from the $(i,j)$ symbol combination with the $k$th smallest error metric,

$$(i,j)^{(k)} = \arg\min_{\substack{1\le i\le I_d \\ 1\le j\le|\mathcal{A}|^{|\tau[d]|}}}{}^{(k)}\; e^{(i,j)}[d], \quad k = 1, 2, \ldots, L, \tag{18}$$

where $\min^{(k)}$ denotes the $k$th smallest value.

To illustrate estimation of the residual CCI, we consider two examples, one for explicit CCI estimation and the other using joint detection.

4.1.1. Symbol estimation with explicit CCI estimation

Consider a UCA with a banded sparsity matrix $\mathbf{P}$ as illustrated in Figure 3(b). The $d$th CCI estimator in Figure 5(a) has the inputs $\mathbf{y}$, $\mathbf{H}$, $\mathbf{P}$, $\mathbf{p}'[d] \in \mathbf{P}'$, and the global tentative symbol list $S$. It uses the iterative tail-biting delayed decision feedback sequence estimation (ITB-DDFSE) algorithm of [6] to compute estimates of the residual CCI. It constructs a spatial trellis from $\mathbf{P}$ and employs the Viterbi algorithm to find the minimum cost path through it.

In order to minimize computational complexity, we first create the list $S_{\mathrm{in}}[d]$ from $S$ in each receiver branch using the sparsity pattern $\mathbf{p}'[d] \in \mathbf{P}'$. It is defined as $S_{\mathrm{in}}[d] = \{\mathbf{s}_{\mathrm{in}}^{(1)}[d], \mathbf{s}_{\mathrm{in}}^{(2)}[d], \ldots, \mathbf{s}_{\mathrm{in}}^{(K_d)}[d]\}$, where $K_d$ is the list size with $1 \le K_d \le L$. Its elements contain the nonredundant high-energy symbol sets together with the best initial estimates of the residual CCI. Hence the $k$th symbol vector in the $d$th list, $\mathbf{s}_{\mathrm{in}}^{(k)}[d] \in S_{\mathrm{in}}[d]$, is decomposed into

$$\mathbf{s}_{\mathrm{in}}^{(k)}[d] = \big\{\tilde{\tau}_{\mathrm{in}}^{(k)}[d],\, \omega_{\mathrm{in}}^{(k)}[d]\big\}, \tag{19}$$

where $\tilde{\tau}_{\mathrm{in}}^{(k)}[d]$ is a high-energy symbol set that is nonredundant in $S_{\mathrm{in}}[d]$ and the low-energy symbol set $\omega_{\mathrm{in}}^{(k)}[d]$ is the best initial estimate of the residual CCI chosen from $S$. (The best initial estimate $\omega_{\mathrm{in}}^{(k)}[d]$ can easily be found from $S$ because the elements in $S$ are ordered from most to least likely.)

The list $S_{\mathrm{in}}[d]$ is input to the $d$th CCI estimator in Figure 5(a). It operates on a spatial trellis having $D$ stages indexed by $c = 1, 2, \ldots, D$. It starts and ends in a fixed state. Note that both fixed states contain the high-energy symbol set $\tilde{\tau}_{\mathrm{in}}^{(k)}[d]$ and are equivalent due to the tail-biting trellis structure. The trellis is applied to each of the $K_d$ symbol vectors $\mathbf{s}_{\mathrm{in}}^{(k)}[d] \in S_{\mathrm{in}}[d]$.

Figure 4: Block diagram of the parallel detector with interference estimation (PD-IE).

Figure 5: The $d$th symbol estimator in the PD-IE in Figure 4 using (a) explicit CCI estimation and (b) joint detection.

Figure 6: ITB-DDFSE trellis for explicit CCI estimation in symbol estimator #1 in Figure 5(a). The trellis is shown for the UCA example in Figures 3(a) and 3(b) using BPSK signals.

Figure 6 depicts an example trellis for the CCI estimator of Figure 5(a) for the $M = 5$ antenna, $D = 6$ user environment of Figures 3(a) and 3(b) using BPSK signaling. The extension to other signal types is straightforward. The states at the $c$th stage of the trellis are defined as [14]

$$\sigma[c] = \{s_u \mid u \in U_e[c-1] \cap U_e[c]\} = \tau[c-1] \cap \tau[c], \quad c = 1, 2, \ldots, D. \tag{20}$$

Note that for the chosen example $\tau[c = 1] = \{s_6\, s_1\, s_2\}$ are the high-energy symbols. They are represented by fixed states in the trellis and initialized with the $k$th value $\tilde{\tau}_{\mathrm{in}}^{(k)}[d]$. The corresponding low-energy symbol sets $\omega_{\mathrm{in}}^{(k)}[d]$ are used as initial estimates of the residual CCI and are stored in the partial state estimate $\nu[c]$. The trellis state sequence is $\sigma[1] = \{s_6\, s_1\}$, $\sigma[2] = \{s_1\, s_2\}$, $\sigma[3] = \{s_2\, s_3\}$, $\sigma[4] = \{s_3\, s_4\}$, $\sigma[5] = \{s_4\, s_5\}$, $\sigma[6] = \{s_5\, s_6\}$ and the number of symbols with variable state values is $\{\mu[c]\} = \{0, 0, 1, 2, 2, 1\}$, where $c = 1, 2, \ldots, 6$ is the trellis stage index. We denote the number of transitions from a previous state $i$ into a new state $j$ as $T_j[c]$. The $c$th trellis stage has $j = |\mathcal{A}|^{\mu[c]}$ states and there are

$$T[c] = \sum_{j=1}^{|\mathcal{A}|^{\mu[c]}} T_j[c] \tag{21}$$

overall transitions. In Figure 6, the sequence of overall $i \to j$ transitions is
{T[c]} = {1, 2, 4, 8, 4, 2}. The algorithm finds the minimum cost path according to a Euclidean distance error metric, using the symbols from the current i→j transition and the partial state estimate $\nu[c]$. After processing all transitions at the $c$th trellis stage, the surviving transitions are stored and the partial state estimate $\nu[c]$ is updated. After typically only a few iterations $Q_{\mathrm{itb}}$ around the tail-biting trellis, the estimate of the residual CCI, $\hat{\omega}^{(i)}[d]$, is found by tracing back the trellis path with the least cost. The nonredundant estimates $\hat{\omega}^{(i)}[d]$ are stored as the list $W[d]$, which is output by the $d$th CCI estimator, as shown in Figure 5(a).

4.1.2 Symbol estimation with joint detection

We next consider a ULA with a nonbanded sparsity matrix $P'$ as shown in Figure 3(d). In this case the symbol estimator of Figure 5(b) is needed. It uses an iterative PIC approach to jointly find estimates of the low- and high-energy symbol sets $\omega[d]$ and $\tau[d]$. The required inputs to the $d$th symbol estimator are the tentative global list $S$ and the $d$th row components of $y'$, $H'$, and $P'$. The symbol estimators compute $D$ tentative branch lists $S_{\mathrm{br}}[d]$ by searching over the high-energy symbols $\tau[d]$ using (16) and (17). Each list $S_{\mathrm{br}}[d]$ serves as input to the $(d+1)$th high-energy symbol estimator in the $(q_{\mathrm{pic}}+1)$th iteration. For $q_{\mathrm{pic}} = 1$, the tentative global list $S$ is chosen as the input. From the input list to the $d$th symbol estimator, the list of estimates of the residual CCI, $W[d]$, is obtained using the sparsity pattern $p'[d] \in P'$. After the $Q_{\mathrm{pic}}$th iteration, the branch lists $S_{\mathrm{br}}[d]$ are output by the symbol estimators. We have found that a small number of iterations $Q_{\mathrm{pic}}$ works well.

4.2 List combining

The $D$ branch lists $S_{\mathrm{br}}[d]$ are output by the symbol estimators and input to a list combiner (cf. Figure 4). The symbols in each branch vector $s_{\mathrm{br}}[d] \in S_{\mathrm{br}}[d]$ contain estimates of both the low- and high-energy symbol sets $\omega[d]$ and $\tau[d]$. Here, instead of an exhaustive search over all symbol combinations as in (8), only the
high-energy symbol sets $\tau[d]$ are searched using the error metric of (16). Because of the estimation process, the JML vector s satisfying (8) may not be included in the $D$ branch lists $S_{\mathrm{br}}[d]$. By searching and combining the branch lists, we can find improved estimates with high probability of including the desired symbol vector s. In [14], we proposed a list combining algorithm that finds the L-member tentative ordered global list $S$ of most likely symbol estimate vectors $s^{(l)} \in S$, $l = 1, 2, \ldots, L$. We briefly summarize the algorithm here.

The list combiner in Figure 4 takes as inputs y, H, $P'$, and the $D$ branch lists $S_{\mathrm{br}}[d]$. For the $q$th global iteration, the tentative global list $S$ and the corresponding list of error metrics $E = \{e^{(1)}, e^{(2)}, \ldots, e^{(L)}\}$ are stored and $S$ is fed back to the $D$ detector branches. If $q = Q$ (where $Q$ is a freely chosen design parameter), $S$ is output by the detector as an estimate of the ordered list of most likely symbol vectors. Typically, only a few global iterations $Q$ are necessary. A decision device then selects the first element $s^{(1)} \in S$ as the best estimate. Alternatively, $S$ can be used to provide soft information to subsequent receiver stages such as error control decoders.

List combining is done in two stages: an initial update and an iterative search over the estimates of the high-energy symbol sets $\tau[d]$. In the initial update, the stored lists $S$ and $E$ are updated with the symbol vectors and error metrics obtained in the current iteration. The iterative search combines the estimates of the high-energy symbol sets $\tau[d]$ with the symbols stored in $S$. This typically requires only a few iterations $Q_{\mathrm{lc}}$. The algorithm uses dynamic programming principles and is summarized in Algorithm 1.

5. PERFORMANCE EVALUATION

Analytical performance bounds for PD-IE are difficult to obtain due to the iterative and list reduction processes. Hence, we use Monte Carlo simulation to compare performance to other MUD algorithms under overload. We assume D single-antenna users transmitting equal-power, symbol-synchronous QPSK
(4-QAM) signals. The signals are incident on a receiver with an M-element UCA or ULA, where D > M. For simplicity, we assume the same phase reference is used for all signals. The SNR at each receive antenna is defined as the ratio of signal to noise variances, $\mathrm{SNR} = 10 \log_{10}(\sigma_s^2 / \sigma_z^2)$, where $\sigma_s^2$ is the average received power per signal. Simulations are stopped after one user experiences 50 errors.

Initial Update

Define a list of D × 1 branch symbol vectors, $S_{\mathrm{br}}$. Initialize the elements $s_{\mathrm{br}}^{(k)} \in S_{\mathrm{br}}$ with the nonredundant symbol vectors from the D branch lists $S_{\mathrm{br}}[d]$. Note that $k = 1, 2, \ldots, K$ and $1 \le K \le LD$.

Corresponding to $S_{\mathrm{br}}$, define the list of error metrics $E_{\mathrm{br}} = \{e_{\mathrm{br}}^{(1)}, e_{\mathrm{br}}^{(2)}, \ldots, e_{\mathrm{br}}^{(K)}\}$. Compute each $e_{\mathrm{br}}^{(k)} \in E_{\mathrm{br}}$ as
\[ e_{\mathrm{br}}^{(k)} = \big\| y - H s_{\mathrm{br}}^{(k)} \big\|^2, \]
where $s_{\mathrm{br}}^{(k)} \in S_{\mathrm{br}}$.

Define the list of L tentative minimum error metrics, $E_{\min}$, and the corresponding list of D × 1 symbol vectors, $S_{\min}$. Obtain the elements $e_{\min}^{(l)} \in E_{\min}$ by searching
\[ e_{\min}^{(l)} = \min^{(l)}_{\substack{1 \le i \le L \\ 1 \le k \le K}} \big\{ e_{\mathrm{br}}^{(k)},\, e^{(i)} \big\}, \qquad l = 1, 2, \ldots, L, \]
where $e^{(i)}$ is the $i$th element in $E$, obtained in the $(q-1)$th iteration. For $q = 1$, choose $E = \{\infty\}$. Find the elements $s_{\min}^{(l)} \in S_{\min}$ by choosing symbol values from the corresponding lists $S_{\mathrm{br}}$ and $S$.

Set $S = S_{\min}$ and $E = E_{\min}$.

Iterative Search

Define the $d = 1, 2, \ldots, D$ lists $T[d]$. Find the elements $\tau^{(j)}[d] \in T[d]$ by using $p'[d] \in P'$ to select the nonredundant high-energy symbol sets from $S_{\mathrm{br}}[d]$. Note that $j = 1, 2, \ldots, J_d$ and $J_d \le L$.

Define the lists $S_{\mathrm{cand}} = \{s_{\mathrm{cand}}^{(1)}, s_{\mathrm{cand}}^{(2)}, \ldots, s_{\mathrm{cand}}^{(L)}\}$ and $E_{\mathrm{cand}} = \{e_{\mathrm{cand}}^{(1)}, e_{\mathrm{cand}}^{(2)}, \ldots, e_{\mathrm{cand}}^{(L)}\}$. These store D × 1 candidate symbol vectors and corresponding error metrics.

For each iteration $q_{\mathrm{lc}} = 1, 2, \ldots, Q_{\mathrm{lc}}$ and all $j = 1, 2, \ldots, J_d$ elements $\tau^{(j)}[d] \in T[d]$ of the $d = 1, 2, \ldots, D$ lists $T[d]$:

(i) Use $p'[d] \in P'$ to find the estimates of the low-energy symbol sets $\omega[d]$ in the list $S$ and copy the nonredundant sets into $S_{\mathrm{cand}}$. The resulting list $S_{\mathrm{cand}}$ has size $L_d$ with $1 \le L_d \le L$.

(ii) For each element $s_{\mathrm{cand}}^{(k)} \in S_{\mathrm{cand}}$, $k = 1, 2, \ldots, L_d$,
   (a) Copy the high-energy symbol set estimate $\tau^{(j)}[d]$ into $s_{\mathrm{cand}}^{(k)}$.
   (b) Compute the error metric $e_{\mathrm{cand}}^{(k)} = \| y - H s_{\mathrm{cand}}^{(k)} \|^2$.

(iii) Update the tentative list $E_{\min}$ by finding the L smallest metrics,
\[ e_{\min}^{(l)} = \min^{(l)}_{\substack{1 \le i \le L \\ 1 \le k \le L_d}} \big\{ e_{\mathrm{cand}}^{(k)},\, e^{(i)} \big\}, \qquad l = 1, 2, \ldots, L, \]
where $e^{(i)} \in E$ is the $i$th element in $E$. Update the corresponding list $S_{\min}$ by choosing the $l = 1, 2, \ldots, L$ symbol vectors from $S_{\mathrm{cand}}$ and $S$ with minimum error metric $e_{\min}^{(l)}$.

(iv) Set $S = S_{\min}$ and $E = E_{\min}$.

Terminate the list combining algorithm. Set $q = q + 1$.

Algorithm 1: Iterative list combining algorithm.

5.1 UCA

Figure 7 shows the relative performance of the PD-IE, SRSJD, and JML algorithms at SNR = 10 dB. The receiver employs an M = 5-element UCA front end with radius R = 0.2λ. We use the linear beamformer of (7) as a spatial filter in the preprocessing stage of the detector. The SEAIR threshold $T_1$ and the SSSER threshold $T_2$ used to derive the sparsity matrix P are set empirically, with $T_2 = 0.1$ for up to 100% overload (D ≤ 10) and $T_2 = 0.5$ for higher overload factors (D > 10). As a result, for this example, each row of the channel matrix H contains a fixed number $|\tau[d]|$ of high-energy symbols $\tau[d]$. The matrix P is used for both the PD-IE and SRSJD algorithms. SRSJD performs two iterations around the tail-biting trellis as suggested in [6]. Simulations run with more iterations achieved only marginal performance improvements for the increase in SRSJD complexity.

The choices of the PD-IE parameters are shown in Table 1. In order to compare the two PD-IE symbol estimators using either explicit CCI estimation (Figure 5(a)) or joint detection (Figure 5(b)), we set $Q_{\mathrm{itb}} = 2$ and adjust the iteration parameter $Q_{\mathrm{pic}}$ so that both approaches have similar complexity. Complexity values are presented in Table 1 as the number of real squaring operations per output symbol vector.
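The initial update of Algorithm 1 is essentially a keep-the-L-best merge of the branch candidates with the stored global list. The following Python sketch illustrates that single step under simplifying assumptions and is not the authors' implementation: the names `error_metric` and `initial_update` are hypothetical, symbol vectors are plain tuples, H is a row-major list of rows, and duplicates are removed with a dictionary, which is only one possible reading of "nonredundant".

```python
def error_metric(y, H, s):
    """Squared Euclidean distance ||y - H s||^2 (H given as a list of rows)."""
    total = 0.0
    for y_m, row in zip(y, H):
        residual = y_m - sum(h * s_u for h, s_u in zip(row, s))
        total += abs(residual) ** 2
    return total


def initial_update(y, H, branch_vectors, stored, L):
    """Merge branch candidates with the stored list and keep the L best.

    stored: list of (metric, vector) pairs from the previous global
    iteration; an empty list plays the role of E = {inf} for q = 1.
    Returns a list of (metric, vector) pairs sorted by increasing metric.
    """
    # Start from the stored list so its metrics are not recomputed.
    pool = {tuple(vec): metric for metric, vec in stored}
    # Add each new, nonredundant branch vector with its error metric.
    for vec in branch_vectors:
        key = tuple(vec)
        if key not in pool:
            pool[key] = error_metric(y, H, key)
    # Order all candidates by metric and truncate to the list size L.
    ranked = sorted(pool.items(), key=lambda item: item[1])
    return [(metric, vec) for vec, metric in ranked[:L]]


if __name__ == "__main__":
    # Toy overloaded example: 2 receive components, 3 BPSK users, no noise.
    H = [[1.0, 0.5, 0.2], [0.3, 1.0, 0.6]]
    s_true = (1, -1, 1)
    y = [sum(h * s for h, s in zip(row, s_true)) for row in H]
    branches = [(1, 1, 1), (1, -1, 1), (-1, -1, 1), (1, 1, 1)]
    for metric, vec in initial_update(y, H, branches, [], L=2):
        print(vec, metric)
```

Sorting the whole pool is the simplest choice for a sketch; for long lists a bounded heap would avoid ordering candidates that are discarded anyway. The iterative search stage of Algorithm 1 can reuse the same merge after substituting the high-energy symbol sets into the candidate vectors.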
Figure 7: SER of the worst user versus number of cochannel signals at SNR = 10 dB for a 5-element UCA using JML, SRSJD, and PD-IE algorithms. Iteration parameters for PD-IE are shown in Table 1. [Plot: SER of the worst user versus number of cochannel signals D; curves for SRSJD, PD-IE with explicit CCI estimation (L = D and L = 2D), PD-IE with joint detection (L = D and L = 2D), and the JML detector.]

From Figure 7 it can be seen that the symbol error rate (SER) essentially increases with the number of users D. This is due to residual CCI in the filtered received signal, which increases with the overload factor of the receiver. The somewhat better performance for odd numbers of users, for example, D = 7, 9, is an artifact of the UCA geometry, as in these cases there are no user signals received from opposite AOAs. Note that the AOA dependence of the UCA is not observed if the SER performance is dominated by the residual CCI. This occurs under heavier overload (e.g., D = 11 users as shown in Figure 7).

JML is the optimum detector and achieves the lowest SER. SRSJD approximates JML only up to a certain number of users D and fails beyond it. PD-IE outperforms SRSJD at the cost of higher complexity and achieves near-JML performance when using a global list S of size L = 2D. For L = D, performance is impaired due to the increased probability of the transmitted symbols not being in the list S. At a similar complexity, symbol estimation with explicit CCI estimation slightly outperforms joint detection in PD-IE for L = 2D, but performance is worse for L = D. This arises because the trellis-based CCI estimation process can outperform the PIC technique if the correct high-energy symbols are already contained in the global list S. In contrast, joint detection is able to better estimate the CCI for smaller list sizes L because it jointly estimates both the CCI and the high-energy symbols.

Figure 8 illustrates SER versus SNR performance curves for PD-IE using the same
receiver setup as in Figure 7. Results are shown for the heavily overloaded cases of D = 10 and 12 users employing symbol estimators with either explicit CCI estimation or joint detection.

Figure 8: SER of the worst user versus SNR for PD-IE with list sizes L = D and 2D using a 5-element UCA with D = 10 and 12 users. The iteration parameters are set to give comparable complexity for PD-IE with explicit CCI estimation and PD-IE with joint detection as shown in Table 1. [Plot: SER of the worst user versus SNR (dB); curves for explicit CCI estimation and joint detection, each with L = D and L = 2D, for D = 10 and D = 12.]

The SER in Figure 8 decreases with increasing SNR until it reaches an error floor. Its minimum value is dominated by the probability of the correct symbol values not being included in the branch lists $S_{\mathrm{br}}[d]$, which explains the higher error floor for the smaller list size L = D in contrast to L = 2D. Increasing the list size L reduces the error floor because more symbol combinations are considered as candidates. This of course increases PD-IE complexity.

At low SNR (SNR < 10 dB), the performance results are similar for both PD-IE symbol estimator implementations, whereas at higher SNR (SNR ≥ 15 dB), joint detection clearly outperforms explicit CCI estimation in PD-IE. This can be explained by the different symbol estimation processes considered. Since PD-IE with explicit CCI estimation relies on correct estimates of the residual CCI, its SER performance is sensitive to CCI estimation errors. These are more likely to occur if the global list S contains only erroneous symbols and the list size L is small. The explicit CCI estimation process has too few degrees of freedom and cannot possibly accurately estimate all the CCI. There will then always be
significant residual CCI. In contrast, PD-IE with joint estimation reestimates both the residual CCI and the high-energy symbol values during the iterative PIC process. It has more degrees of freedom and thus a higher probability of finding the correct symbol estimates even if the list S is small or initially contains only erroneous estimates. Increasing the size of S from L = D to 2D reduces the superiority of joint detection due to better explicit CCI estimation in PD-IE. This is observed in Figure 8.

5.2 ULA

Figure 9 depicts SER versus SNR curves for a receiver with an M = 6-element ULA with element spacing B = 3λ. The users are randomly allocated to D equal-size sectors within the array’s view angle of θmax = ±60°. (For nonfading memoryless channels, the ULA is highly selective in AOA. We therefore use random user spacing into equal-size sectors to obtain comparable results for different numbers of users.) The transmitted signals are incident with random phase on the antenna array. We fix the SEAIR threshold $T_1$ and set the SSSER threshold to $T_2 = 0.1$. The detection algorithm is PD-IE with joint detection of user symbols and residual CCI. The iterative PIC process uses either $Q_{\mathrm{pic}} = 1$ or $Q_{\mathrm{pic}} = 5$ iterations. The global list S has size L = 2D. Results are shown for D = 9 and 12 users (50% and 100% overload). All other parameters remain unchanged.

Figure 9: SER of the worst user versus SNR for PD-IE using an M = 6-element ULA with element spacing B = 3λ. There are D = 9 and 12 cochannel users. The size of the global list S is L = 2D. [Plot: SER of the worst user versus SNR (dB); curves for PD-IE with joint detection using $Q_{\mathrm{pic}} = 1$ and $Q_{\mathrm{pic}} = 5$, for D = 9 and D = 12.]

It can be seen that increasing the number of iterations, $Q_{\mathrm{pic}}$, significantly improves detection performance for D = 12 users. In contrast, performance improvements are much smaller for D = 9 users as $Q_{\mathrm{pic}}$ increases. This is expected because increasing $Q_{\mathrm{pic}}$ yields more accurate estimation of the residual CCI, which is more critical at higher levels of overload. Furthermore, it is evident that more iterations (increased $Q_{\mathrm{pic}}$) yield a lower error floor as the SNR increases. Better CCI estimation comes at the cost of increased complexity.

6. COMPLEXITY

We now consider the computational complexity of PD-IE. As a measure of this we use the number of real squaring operations in the calculation of the Euclidean error metrics, as this is usually the most hardware-intensive operation [6, 10, 14, 17, 18]. The complexity of PD-IE depends on many parameters. Among these are the number of users D, the alphabet size |A|, the number of high-energy symbols $|\tau[d]|$, the numbers of iterations $Q_{\mathrm{itb}}$ or $Q_{\mathrm{pic}}$, $Q_{\mathrm{lc}}$, and Q, and the sizes of the lists $S_{\mathrm{br}}[d]$ and S. The overall complexity of PD-IE, C, can be expressed as the sum of the complexities of the symbol estimator and the list combiner, namely, $C_{\mathrm{se}}$ and $C_{\mathrm{lc}}$, respectively. From the block diagram in Figure 4, we find

\[ C = 2Q \big( C_{\mathrm{se}} + C_{\mathrm{lc}} \big), \tag{22} \]

where $C_{\mathrm{se}} = \sum_{d=1}^{D} C_{\mathrm{se}}[d]$ is the sum of the individual symbol estimator complexities, $C_{\mathrm{se}}[d]$. The scaling factor of two is introduced because computation of each Euclidean error metric requires two real squarings. Each symbol estimator contains a high-energy symbol estimator which has complexity $C_{\mathrm{hese}}[d] = I_d\, |A|^{|\tau[d]|}$, where $I_d$ denotes the size of the input list $W[d]$ (cf. Figures 5(a) and 5(b)). For explicit CCI estimation with ITB-DDFSE (Figure 5(a)),

\[ C_{\mathrm{se}}^{(\mathrm{itb})}[d] = C_{\mathrm{itb}}[d] + C_{\mathrm{hese}}[d], \tag{23} \]

where $C_{\mathrm{itb}}[d] = K_d\, Q_{\mathrm{itb}} \sum_{c=1}^{D} T[c]$ is the complexity of each CCI estimator, with $K_d$ being the size of the input list $S_{\mathrm{in}}[d]$ and $T[c]$ denoting the number of transitions at the $c$th trellis stage defined in (21). For joint detection as in Figure 5(b), $C_{\mathrm{se}}[d]$ is derived as

\[ C_{\mathrm{se}}^{(\mathrm{pic})}[d] = Q_{\mathrm{pic}}\, C_{\mathrm{hese}}[d]. \tag{24} \]

The complexity of the list combining algorithm (Algorithm 1) is given by

\[ C_{\mathrm{lc}} = D \Big( K + Q_{\mathrm{lc}} \sum_{d=1}^{D} J_d L_d \Big), \tag{25} \]

where $J_d$, $K$, and $L_d$ are the sizes of the lists $T[d]$, $S_{\mathrm{br}}$, and $S_{\mathrm{cand}}$, respectively. Note that $K$ and $J_d$ may vary in each of the Q global iterations, whereas $L_d$ may change in each of the $Q_{\mathrm{lc}}$ list combining iterations.

In Table 2, complexity of the JML [10], SRSJD [6], and PD-IE algorithms is compared for receivers with an M = 8-element UCA. The array radius is R = λ/4 and the linear beamformer of (7) is used as a preprocessor. JML requires $2M|A|^{D}$ while SRSJD needs only $2 Q_{\mathrm{itb}} D |A|^{(\mu[c]+1)}$ real squarings [6]. Complexity values for PD-IE are shown for a fixed number $|\tau[d]|$ of high-energy symbols, obtained through adjusting the SEAIR and SSSER thresholds. The global list S has size L = 2D. PD-IE with explicit CCI estimation uses equal iteration counts $Q_{\mathrm{itb}} = Q_{\mathrm{lc}} = Q$, while PD-IE with joint detection uses $Q_{\mathrm{pic}} = 3$ together with equal $Q_{\mathrm{lc}} = Q$. Both list size and iteration parameters were chosen empirically to achieve good detection performance at low complexity. In general, these parameters provide a complexity-performance tradeoff and their values may thus be chosen according to practical restrictions and requirements.

The results of Table 2 clearly show that JML has extremely high complexity, increasing exponentially with the number of users. SRSJD achieves the lowest complexity. It has a linear increase within the subsets of users where $\mu[d]$ is constant and grows exponentially as the number of subsets increases. PD-IE provides complexity savings of several orders of magnitude over JML but has higher complexity than SRSJD. This is the price to pay for the better performance of PD-IE (cf. Figure 7).

The comparison of symbol estimation with explicit CCI estimation and joint detection in PD-IE indicates that joint detection of user symbols and residual CCI has complexity advantages over explicit CCI estimation. This is expected because explicit CCI estimation requires an additional trellis stage for each additional user, whereas for joint detection, the complexity of each symbol estimator remains constant. This can be seen in Table 2 by the increasing complexity ratio $C_{\mathrm{se}}/C_{\mathrm{lc}}$ for explicit CCI estimation and decreasing values for joint detection. Similar results are found when a ULA is used.

Table 1: Parameters and complexity for PD-IE simulations in Figures 7 and 8 using an M = 5-element UCA.

  Users D    Complexity (L = D)    Complexity (L = 2D)
  6          ∼2.5E4                ∼4.2E4
  7          ∼4.0E4                ∼8.1E4
  8          ∼6.6E4                ∼1.4E5
  9          ∼1.0E5                ∼2.2E5
  10         ∼1.5E5                ∼3.4E5
  11         ∼2.0E5                ∼4.8E5
  12         ∼2.8E5                ∼6.8E5

  Qitb = 2 in all cases; Qlc = Q = Qpic entries: 3, 3, 4, 4, 5, 5 (L = D) and 5, 5, 6, 6 (L = 2D).

Table 2: Comparison of computational complexity for a receiver with an M = 8-element UCA.

             JML        SRSJD             PD-IE, expl. CCI estimation    PD-IE, joint detection
  Users D    C          μ[d]    C         C         Cse/Clc              C         Cse/Clc
  9          4.2E06     2       2.3E03    3.0E5     1.2                  1.4E5     3.1
  10         1.7E07     2       2.6E03    5.2E5     1.3                  1.8E5     2.6
  11         6.7E07     2       2.8E03    1.2E6     2.7                  2.5E5     1.9
  12         2.7E08     4       4.9E04    1.8E6     2.8                  3.3E5     1.5
  13         1.1E09     4       5.3E04    2.5E6     2.7                  4.2E5     1.3
  14         4.3E09     4       5.7E04    7.2E6     6.4                  5.2E5     1.1
  15         1.7E10     4       6.1E04    2.2E7     16.7                 7.1E5     0.8

7. CONCLUSION

In this paper, a unified algorithmic structure for the separation and detection of multiple cochannel signals in an overloaded SIMO environment is proposed. The detection algorithm is applied to receivers with either a UCA or a ULA. A linear preprocessor employing either spatial beamforming or diversity combining is used to reduce the amount of CCI in the received signals. Due to the overloaded environment and the linear preprocessing, residual CCI is still present. The detection of the user symbols is done using the proposed PD-IE algorithm. It estimates the residual CCI and performs nonlinear iterative list detection of the user symbols.

Performance is evaluated using Monte Carlo simulation. PD-IE is shown to approximate the optimum JML detector with significantly lower complexity and outperforms existing low-complexity algorithms. Comparison to the SRSJD algorithm shows that PD-IE yields better performance at the cost of some increase in
complexity. Unlike JML, whose complexity is exponential in the number of users, PD-IE has a much lower rate of complexity increase. Complexity savings become more significant when the number of receive antennas is large. PD-IE simulation results suggest that joint detection and CCI estimation has advantages over explicit CCI estimation. It achieves a better performance-complexity tradeoff, yields simpler implementation, and, most importantly, can be used with arbitrary receive array geometries. The parallel processing structure makes PD-IE well suited for practical implementation [15].

REFERENCES

[1] J. H. Winters, “On the capacity of radio communication systems with diversity in a Rayleigh fading environment,” IEEE Journal on Selected Areas in Communications, vol. 5, no. 5, pp. 871–878, 1987.
[2] A. Paulraj and C. B. Papadias, “Space-time processing for wireless communications,” IEEE Signal Processing Magazine, vol. 14, no. 6, pp. 49–83, 1997.
[3] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Personal Communications, vol. 6, no. 3, pp. 311–335, 1998.
[4] S. J. Grant and J. K. Cavers, “Performance enhancement through joint detection of cochannel signals using diversity arrays,” IEEE Transactions on Communications, vol. 46, no. 8, pp. 1038–1049, 1998.
[5] S. Bayram, J. Hicks, R. J. Boyle, and J. H. Reed, “Overloaded array processing in wireless airborne communication systems,” in Proceedings of 21st Century Military Communications Conference (MILCOM ’00), vol. 1, pp. 24–29, Los Angeles, Calif, USA, October 2000.
[6] J. Hicks, S. Bayram, W. H. Tranter, R. J. Boyle, and J. H. Reed, “Overloaded array processing with spatially reduced search joint detection,” IEEE Journal on Selected Areas in Communications, vol. 19, no. 8, pp. 1584–1593, 2001.
[7] S. Verdú, Multiuser Detection, Cambridge University Press, Cambridge, UK, 1998.
[8] S. Talwar, M. Viberg, and A. Paulraj, “Blind separation of synchronous co-channel digital signals using an antenna array, part I: algorithms,” IEEE Transactions on Signal Processing, vol. 44, no. 5, pp. 1184–1197, 1996.
[9] S. Talwar and A. Paulraj, “Blind separation of synchronous co-channel digital signals using an antenna array, part II: performance analysis,” IEEE Transactions on Signal Processing, vol. 45, no. 3, pp. 706–718, 1997.
[10] S. Bayram, J. Hicks, R. J. Boyle, and J. H. Reed, “Joint maximum likelihood approach in overloaded array processing,” in Proceedings of the 52nd IEEE Vehicular Technology Conference (VTC ’00), vol. 1, pp. 394–400, Boston, Mass, USA, September 2000.
[11] J.-A. Tsai, J. Hicks, and B. D. Woerner, “Joint MMSE beamforming with SIC for an overloaded array system,” in Proceedings of the IEEE Military Communications Conference on Communications for Network-Centric (MILCOM ’01), vol. 2, pp. 1261–1265, McLean, Va, USA, October 2001.
[12] J. Hicks, J.-A. Tsai, J. H. Reed, W. H. Tranter, and B. D. Woerner, “Overloaded array processing with MMSE-SIC,” in Proceedings of the 55th IEEE Vehicular Technology Conference (VTC ’02), vol. 2, pp. 542–546, Birmingham, Ala, USA, May 2002.
[13] J.-A. Tsai and B. D. Woerner, “Performance of combined MMSE beamforming with parallel interference cancellation for overloaded OFDM-CDMA systems,” in Proceedings of IEEE Military Communications Conference (MILCOM ’02), vol. 1, pp. 748–752, Anaheim, Calif, USA, October 2002.
[14] M. Krause, D. P. Taylor, and P. A. Martin, “On list detection for overloaded receivers,” in Proceedings of the 18th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC ’07), Athens, Greece, September 2007.
[15] M. J. Colella, J. N. Martin, and F. Akyildiz, “The HALO Network™,” IEEE Communications Magazine, vol. 38, no. 6, pp. 142–148, 2000.
[16] A. Duel-Hallen and C. Heegard, “Delayed decision-feedback sequence estimation,” IEEE Transactions on Communications, vol. 37, no. 5, pp. 428–436, 1989.
[17] G. D. Forney Jr., “Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference,” IEEE Transactions on Information Theory, vol. 18, no. 3, pp. 363–378, 1972.
[18] M. Krause, D. P. Taylor, and P. A. Martin, “List detection for overloaded receivers with a linear array,” in Proceedings of IEEE Military Communications Conference (MILCOM ’07), Orlando, Fla, USA, October 2007.
[19] S. Haykin, Communication Systems, John Wiley & Sons, New York, NY, USA, 4th edition, 2001.
[20] D. G. Brennan, “Linear diversity combining techniques,” Proceedings of the IRE, vol. 47, no. 6, pp. 1075–1102, 1959.
[21] J.-A. Tsai, R. M. Buehrer, and B. D. Woerner, “BER performance of a uniform circular array versus a uniform linear array in a mobile radio environment,” IEEE Transactions on Wireless Communications, vol. 3, no. 3, pp. 695–700, 2004.
[22] R. Monzingo and T. Miller, Introduction to Adaptive Arrays, John Wiley & Sons, New York, NY, USA, 1980.
[23] J. Litva and T. K. Lo, Digital Beamforming in Wireless Communications, Artech House, Boston, Mass, USA, 1996.
[24] J. G. Proakis, Digital Communications, McGraw-Hill, New York, NY, USA, 3rd edition, 1995.
[25] F. Alam, “Space time processing for third generation CDMA systems,” Ph.D. dissertation, Virginia Tech, Blacksburg, Va, USA, November 2002.
[26] W. L. Stutzman and G. A. Thiele, Antenna Theory and Design, John Wiley & Sons, New York, NY, USA, 1981.
