64 A Unified Instrumental Variable Approach to Direction Finding in Colored Noise Fields

Stoica, P.; Viberg, M.; Wong, M. & Wu, Q. "A Unified Instrumental Variable Approach to Direction Finding in Colored Noise Fields," Digital Signal Processing Handbook, Ed. Vijay K. Madisetti and Douglas B. Williams, Boca Raton: CRC Press LLC, 1999

© 1999 by CRC Press LLC

P. Stoica, Uppsala University
M. Viberg, Chalmers University of Technology
M. Wong, McMaster University
Q. Wu, CELWAVE

64.1 Introduction
64.2 Problem Formulation
64.3 The IV-SSF Approach
64.4 The Optimal IV-SSF Method
64.5 Algorithm Summary
64.6 Numerical Examples
64.7 Concluding Remarks
References
Appendix A: Introduction to IV Methods

The main goal herein is to describe and analyze, in a unifying manner, the spatial and temporal IV-SSF approaches recently proposed for array signal processing in colored noise fields. (The acronym IV-SSF stands for "Instrumental Variable - Signal Subspace Fitting".) Despite the generality of the approach taken herein, our analysis technique is simpler than those used in previous, more specialized publications. We derive a general, optimally weighted (optimal, for short) IV-SSF direction estimator and show that this estimator encompasses the UNCLE estimator of Wong and Wu, which is a spatial IV-SSF method, and the temporal IV-SSF estimator of Viberg, Stoica and Ottersten. The latter two estimators have seemingly different forms (among other differences, the first of them makes use of four weights, whereas the second one uses "only" three weights), and hence their asymptotic equivalence shown in this paper comes as a surprising unifying result. We hope that the present paper, along with the original works mentioned above, will stimulate interest in the IV-SSF approach to array signal processing, which is sufficiently flexible to handle colored noise fields, coherent signals and indeed also situations where only some of the sensors in the array are calibrated.

This work was supported in part by the Swedish
Research Council for Engineering Sciences (TFR).

64.1 Introduction

Most parametric methods for Direction-Of-Arrival (DOA) estimation require knowledge of the spatial (sensor-to-sensor) color of the background noise. If this information is unavailable, a serious degradation of the quality of the estimates can result, particularly at low Signal-to-Noise Ratio (SNR) [1, 2, 3]. A number of methods have been proposed in recent years to alleviate the sensitivity to the noise color. If a parametric model of the covariance matrix of the noise is available, the parameters of the noise model can be estimated along with those of the signals of interest [4, 5, 6, 7]. Such an approach is expected to perform well in situations where the noise can be accurately modeled with relatively few parameters.

An alternative approach, which does not require a precise model of the noise, is based on the principle of Instrumental Variables (IV). See [8, 9] for thorough treatments of IV methods (IVM) in the context of identification of linear time-invariant dynamical systems. A brief introduction is given in the appendix of this chapter. Computationally simple IVMs for array signal processing appeared in [10, 11]. These methods perform poorly in difficult scenarios involving closely spaced DOAs and correlated signals.

More recently, the combined Instrumental Variable Signal Subspace Fitting (IV-SSF) technique has been proposed as a promising alternative for array signal processing in spatially colored noise fields [12, 13, 14, 15]. The IV-SSF approach has a number of appealing advantages over other DOA estimation methods. These advantages include:

• IV-SSF can handle noises with arbitrary spatial correlation, under minor restrictions on the signals or the array. In addition, estimation of a noise model is avoided, which leads to statistical robustness and computational simplicity.
• The IV-SSF approach is applicable to both non-coherent and coherent signal scenarios.
• The
spatial IV-SSF technique can make use of the information contained in the output of a completely uncalibrated subarray under certain weak conditions, which other methods cannot.

Depending on the type of "instrumental variables" used, two classes of IV methods have appeared in the literature:

• Spatial IVM, for which the instrumental variables are derived from the output of a (possibly uncalibrated) subarray, the noise of which is uncorrelated with the noise in the main calibrated subarray under consideration (see [12, 13]).
• Temporal IVM, which obtains instrumental variables from the delayed versions of the array output, under the assumption that the temporal-correlation length of the noise field is shorter than that of the signals (see [11, 14]).

The previous literature on IV-SSF has treated and analyzed the above two classes of spatial and temporal methods separately, ignoring their common basis. In this contribution, we reveal the common roots of these two classes of DOA estimation methods and study them under the same umbrella. Additionally, we establish the statistical properties of a general (either spatial or temporal) weighted IV-SSF method and present the optimal weights that minimize the variance of the DOA estimation errors. In particular, we point out that the optimal four-weight spatial IV-SSF of [12, 13] (called UNCLE there, and arrived at by using canonical correlation decomposition ideas) and the optimal three-weight temporal IV-SSF of [14] are asymptotically equivalent when used under the same conditions. This asymptotic equivalence property, which is a main result of the present section, is believed to be important as it shows the close ties that exist between two seemingly different DOA estimators.

This section is organized as follows. In Section 64.2 the data model and technical assumptions are introduced. Next, in Section 64.3 the IV-SSF method is presented in a fairly general setting. In Section 64.4, the statistical performance of
the method is presented along with the optimal choices of certain user-specified quantities. The data requirements and the optimal IV-SSF (UNCLE) algorithm are summarized in Section 64.5. The anxious reader may wish to jump directly to this point to investigate the usefulness of the algorithm in a specific application. In Section 64.6, some numerical examples and computer simulations are presented to illustrate the performance. The conclusions are given in Section 64.7. In the appendix we give a brief introduction to IV methods. The reader who is not familiar with IV might be helped by reading the appendix before the rest of the paper. Background material on the subspace-based approach to DOA estimation can be found in Chapter 62 of this Handbook.

64.2 Problem Formulation

Consider a scenario in which $n$ narrowband plane waves, generated by point sources, impinge on an array comprising $m$ calibrated sensors. Assume, for simplicity, that the $n$ sources and the array are situated in the same plane. Let $a(\theta)$ denote the complex array response to a unit-amplitude signal with DOA parameter equal to $\theta$. Under these assumptions, the output of the array, $y(t) \in \mathbb{C}^{m \times 1}$, can be described by the following well-known equation [16, 17]:

\[ y(t) = A x(t) + e(t) \tag{64.1} \]

where $x(t) \in \mathbb{C}^{n \times 1}$ denotes the signal vector, $e(t) \in \mathbb{C}^{m \times 1}$ is a noise term, and

\[ A = [a(\theta_1) \; \cdots \; a(\theta_n)] \tag{64.2} \]

Hereafter, $\theta_k$ denotes the $k$th DOA parameter.

The following assumptions on the quantities in the array equation, (64.1), are considered to hold throughout this section:

A1 The signal vector $x(t)$ is a normally distributed random variable with zero mean and a possibly singular covariance. The signals may be temporally correlated; in fact the temporal IV-SSF approach relies on the assumption that the signals exhibit some form of temporal correlation (see below for details).

A2 The noise $e(t)$ is a random vector that is temporally white, uncorrelated with the signals and circularly symmetric normally distributed with zero mean and
unknown covariance matrix (footnote 2) $Q > O$,

\[ E[e(t)e^*(s)] = Q\,\delta_{t,s}; \qquad E[e(t)e^T(s)] = O \tag{64.3} \]

Footnote 2: Henceforth, the superscript "$*$" denotes the conjugate transpose, whereas the transpose is designated by a superscript "$T$". The notation $A \ge B$, for two Hermitian matrices $A$ and $B$, is used to mean that $(A - B)$ is a nonnegative definite matrix. Also, $O$ denotes a zero matrix of suitable dimension.

A3 The manifold vectors $\{a(\theta)\}$, corresponding to any set of $m$ different values of $\theta$, are linearly independent.

Note that assumption A1 above allows for coherent signals, and that in A2 the noise field is allowed to be arbitrarily spatially correlated with an unknown covariance matrix. Assumption A3 is a well-known condition that, under a weak restriction on $m$, guarantees DOA parameter identifiability in the case $Q$ is known (to within a multiplicative constant) [18]. When $Q$ is completely unknown, DOA identifiability can only be achieved if further assumptions are made on the scenario under consideration. The following assumption is typical of the IV-SSF approach:

A4 There exists a vector $z(t) \in \mathbb{C}^{\bar{m} \times 1}$, which is normally distributed and satisfies

\[ E[z(t)e^*(s)] = O \quad \text{for } t \le s \tag{64.4} \]
\[ E[z(t)e^T(s)] = O \quad \text{for all } t, s \tag{64.5} \]

Furthermore, denote

\[ \Sigma = E[z(t)x^*(t)] \quad (\bar{m} \times n) \tag{64.6} \]
\[ \bar{n} = \mathrm{rank}(\Sigma) \le \bar{m} \tag{64.7} \]

It is assumed that no row of $\Sigma$ is identically zero and that the inequality

\[ \bar{n} > 2n - m \tag{64.8} \]

holds (note that a rank-one $\Sigma$ matrix can satisfy the condition (64.8) if $m$ is large enough, and hence the condition in question is rather weak).

Owing to its (partial) uncorrelatedness with $\{e(t)\}$, the vector $\{z(t)\}$ can be used to eliminate the noise from the array output equation (64.1), and for this reason $\{z(t)\}$ is called an IV vector. Below, we briefly describe three possible ways to derive an IV vector from the available data measured with an array of sensors (for more details on this aspect, the reader should consult [12, 13, 14]).

EXAMPLE 64.1: Spatial IV

Assume that the $n$ signals,
which impinge on the main (sub)array under consideration, are also received by another (sub)array that is sufficiently distanced from the main one so that the noise vectors in the two subarrays are uncorrelated with one another. Then $z(t)$ can be made from the outputs of the sensors in the second subarray (note that those sensors need not be calibrated) [12, 13, 15].

EXAMPLE 64.2: Temporal IV

When a second subarray, as described above, is not available but the signals are temporally correlated, one can obtain an IV vector by delaying the output vector:

\[ z(t) = [\,y^T(t-1) \;\; y^T(t-2) \;\; \cdots\,]^T \]

Clearly, such a vector $z(t)$ satisfies (64.4) and (64.5), and it also satisfies (64.8) under weak conditions on the signal temporal correlation. This construction of an IV vector can be readily extended to cases where $e(t)$ is temporally correlated, provided that the signal temporal correlation length is longer than that corresponding to the noise [11, 14].

In a sense, the above examples are both special cases of the following more general situation:

EXAMPLE 64.3: Reference Signal

In many systems a reference or pilot signal [19, 20] $z(t)$ (scalar or vector) is available. If the reference signal is sufficiently correlated with all signals of interest (in the sense of (64.8)) and uncorrelated with the noise, it can be used as an IV. Note that all signals that are not correlated with the reference will be treated as noise. Reference signals are commonly available in communication applications, for example a PN-code in spread spectrum communication [20] or a training signal used for synchronization and/or equalizer training [21]. A closely related possibility is utilization of cyclo-stationarity (or self-coherence), a property that is exhibited by many man-made signals. The reference signal(s) can then consist, for example, of sinusoids of different frequencies [22, 23]. In these techniques, the data is usually pre-processed by computing the auto-covariance function (or a higher-order
statistic) before correlating with the reference signal.

The problem considered in this section concerns the estimation of the DOA vector

\[ \theta = [\theta_1, \cdots, \theta_n]^T \tag{64.9} \]

given $N$ snapshots of the array output and of the IV vector, $\{y(t), z(t)\}_{t=1}^{N}$. The number of signals, $n$, and the rank of the covariance matrix $\Sigma$, $\bar{n}$, are assumed to be given (for the estimation of these integer-valued parameters by means of IV/SSF-based methods, we refer to [24, 25]).

64.3 The IV-SSF Approach

Let

\[ \hat{R} = \hat{W}_L \left[ \frac{1}{N} \sum_{t=1}^{N} z(t) y^*(t) \right] \hat{W}_R \quad (\bar{m} \times m) \tag{64.10} \]

where $\hat{W}_L$ and $\hat{W}_R$ are two nonsingular Hermitian weighting matrices which are possibly data-dependent (as indicated by the fact that they are roofed). Under the assumptions made, as $N \to \infty$, $\hat{R}$ converges to the matrix:

\[ R = W_L E[z(t)y^*(t)] W_R = W_L \Sigma A^* W_R \tag{64.11} \]

where $W_L$ and $W_R$ are the limiting weighting matrices (assumed to be bounded and nonsingular). Owing to assumptions A2 and A3,

\[ \mathrm{rank}(R) = \bar{n} \tag{64.12} \]

Hence, the Singular Value Decomposition (SVD) [26] of $R$ can be written as

\[ R = [\,U \;\; ?\,] \begin{bmatrix} \Lambda & O \\ O & O \end{bmatrix} \begin{bmatrix} S^* \\ ? \end{bmatrix} = U \Lambda S^* \tag{64.13} \]

where $U^*U = S^*S = I$, $\Lambda \in \mathbb{R}^{\bar{n} \times \bar{n}}$ is diagonal and nonsingular, and where the question marks stand for blocks that are of no importance for the present discussion.

The following key equality is obtained by comparing the two expressions for $R$ in Eqs. (64.11) and (64.13) above:

\[ S = W_R A C \tag{64.14} \]

where $C = \Sigma^* W_L U \Lambda^{-1} \in \mathbb{C}^{n \times \bar{n}}$ has full column rank. For a given $S$, the true DOA vector can be obtained as the unique solution to Eq. (64.14) under the parameter identifiability condition (64.8) (see, e.g., [18]). In the more realistic case when $S$ is unknown, one can make use of Eq. (64.14) to estimate the DOA vector in the following steps.

The IV step: Compute the pre- and post-weighted sample covariance matrix $\hat{R}$ in Eq. (64.10), along with its SVD:

\[ \hat{R} = [\,\hat{U} \;\; ?\,] \begin{bmatrix} \hat{\Lambda} & O \\ O & ? \end{bmatrix} \begin{bmatrix} \hat{S}^* \\ ? \end{bmatrix} \tag{64.15} \]

where $\hat{\Lambda}$ contains the $\bar{n}$ largest singular values. Note that $\hat{U}$, $\hat{\Lambda}$, and $\hat{S}$ are consistent estimates of $U$, $\Lambda$, and $S$ in the SVD of $R$.

The SSF step: Compute the DOA estimate as the minimizing argument of the following signal subspace fitting criterion:

\[ \min_{\theta} \Big\{ \min_{C} \, [\mathrm{vec}(\hat{S} - \hat{W}_R A C)]^* \, \hat{V} \, [\mathrm{vec}(\hat{S} - \hat{W}_R A C)] \Big\} \tag{64.16} \]

where $\hat{V}$ is a positive definite weighting matrix, and "vec" is the vectorization operator (footnote 3).

Alternatively, one can estimate the DOA instead by minimizing the following criterion:

\[ \min_{\theta} \Big\{ [\mathrm{vec}(B^* \hat{W}_R^{-1} \hat{S})]^* \, \hat{W} \, [\mathrm{vec}(B^* \hat{W}_R^{-1} \hat{S})] \Big\} \tag{64.17} \]

where $\hat{W}$ is a positive definite weight, and $B \in \mathbb{C}^{m \times (m-n)}$ is a matrix whose columns form a basis of the null-space of $A^*$ (hence, $B^*A = O$ and $\mathrm{rank}(B) = m - n$). The alternative fitting criterion above is obtained from the simple observation that Eq. (64.14) along with the definition of $B$ imply that

\[ B^* W_R^{-1} S = O \tag{64.18} \]

It can be shown [27] that the classes of DOA estimates derived from Eqs. (64.16) and (64.17), respectively, are asymptotically equivalent. More exactly, for any $\hat{V}$ in Eq. (64.16) one can choose $\hat{W}$ in Eq. (64.17) so that the DOA estimates obtained by minimizing Eq. (64.16) and, respectively, Eq. (64.17) have the same asymptotic distribution, and vice versa.

In view of the previous result, in an asymptotic analysis it suffices to consider only one of the two criteria above. In the following, we focus on Eq. (64.17). Compared with Eq. (64.16), the criterion (64.17) has the advantage that it depends on the DOA only. On the other hand, for a general array there is no known closed-form parameterization of $B$ in terms of $\theta$. However, as shown in the following, this is no drawback because the optimally weighted criterion (which is the one to be used in applications) is an explicit function of $\theta$.

64.4 The Optimal IV-SSF Method

In what follows, we deal with the essential problem of choosing the weights $\hat{W}$, $\hat{W}_R$, and $\hat{W}_L$ in the IV-SSF criterion (64.17) so as to maximize the
DOA estimation accuracy. First, we optimize the accuracy with respect to $\hat{W}$, and then with respect to $\hat{W}_R$ and $\hat{W}_L$.

Optimal Selection of $\hat{W}$

Define

\[ g(\theta) = \mathrm{vec}(B^* \hat{W}_R^{-1} \hat{S}) \tag{64.19} \]

and observe that the criterion function in Eq. (64.17) can be written as

\[ g^*(\theta) \hat{W} g(\theta) \tag{64.20} \]

In [27] it is shown that $g(\theta)$ (evaluated at the true DOA vector) has, asymptotically in $N$, a circularly symmetric normal distribution with zero mean and the following covariance:

\[ G(\theta) = \frac{1}{N} \left[ (W_L U \Lambda^{-1})^* R_z (W_L U \Lambda^{-1}) \right]^T \otimes \left[ B^* R_y B \right] \tag{64.21} \]

where $\otimes$ denotes the Kronecker matrix product [28]; and where, for a stationary signal $s(t)$, we use the notation

\[ R_s = E[s(t)s^*(t)] \tag{64.22} \]

Footnote 3: If $x_k$ is the $k$th column of a matrix $X$, then $\mathrm{vec}(X) = [x_1^T \; x_2^T \; \cdots]^T$.

Then, it follows from the ABC (Asymptotically Best Consistent) theory of parameter estimation (footnote 4) that the minimum variance estimate, in the class of estimates under discussion, is given by the minimizing argument of the criterion in Eq. (64.20) with $\hat{W} = \hat{G}^{-1}(\theta)$, that is

\[ f(\theta) = g^*(\theta) \hat{G}^{-1}(\theta) g(\theta) \tag{64.23} \]

where

\[ \hat{G}(\theta) = \frac{1}{N} \left[ (\hat{W}_L \hat{U} \hat{\Lambda}^{-1})^* \hat{R}_z (\hat{W}_L \hat{U} \hat{\Lambda}^{-1}) \right]^T \otimes \left[ B^* \hat{R}_y B \right] \tag{64.24} \]

and where $\hat{R}_z$ and $\hat{R}_y$ are the usual sample estimates of $R_z$ and $R_y$. Furthermore, it is easily shown that the minimum variance estimate, obtained by minimizing Eq. (64.23), is asymptotically normally distributed with mean equal to the true parameter vector and the following covariance matrix:

\[ H = \{ \mathrm{Re}[J^* G^{-1}(\theta) J] \}^{-1} \tag{64.25} \]

where

\[ J = \lim_{N \to \infty} \frac{\partial g(\theta)}{\partial \theta} \tag{64.26} \]

The following more explicit formula for $H$ is derived in [27]:

\[ H = \frac{1}{2N} \left\{ \mathrm{Re} \left[ \left( D^* R_y^{-1/2} \Pi^{\perp}_{R_y^{-1/2} A} R_y^{-1/2} D \right) \odot \Gamma^T \right] \right\}^{-1} \tag{64.27} \]

where $\odot$ denotes the Hadamard-Schur matrix product (elementwise multiplication) and

\[ \Gamma = \Sigma^* W_L U (U^* W_L R_z W_L U)^{-1} U^* W_L \Sigma \tag{64.28} \]

Furthermore, the notation $Y^{-1/2}$ is used for a Hermitian (for notational convenience) square root of the inverse of a positive definite matrix $Y$, the matrix $D$ is made from the direction vector
derivatives,

\[ D = [d_1 \; \cdots \; d_n]; \qquad d_k = \frac{\partial a(\theta_k)}{\partial \theta_k} \]

and, for a full column-rank matrix $X$, $\Pi^{\perp}_X$ defines the orthogonal projection onto the nullspace of $X^*$ as

\[ \Pi^{\perp}_X = I - \Pi_X; \qquad \Pi_X = X(X^*X)^{-1}X^* \tag{64.29} \]

To summarize, for fixed $\hat{W}_R$ and $\hat{W}_L$, the statistically optimal selection of $\hat{W}$ leads to DOA estimates with an asymptotic normal distribution with mean equal to the true DOA vector and covariance matrix given by Eq. (64.27).

Footnote 4: For details on the ABC theory, which is an extension of the classical BLUE (Best Linear Unbiased Estimation) / Markov theory of linear regression to a class of nonlinear regressions with asymptotically vanishing residuals, the reader is referred to [9, 29].

Optimal Selection of $\hat{W}_R$ and $\hat{W}_L$

The optimal weights $\hat{W}_R$ and $\hat{W}_L$ are, by definition, those that minimize the limiting covariance matrix $H$ of the DOA estimation errors. In the expression (64.27) of $H$, only $\Gamma$ depends on $W_R$ and $W_L$ (the dependence on $W_R$ is implicit, via $U$). Since the matrix $\Sigma$ has rank $\bar{n}$, it can be factorized as follows:

\[ \Sigma = \Sigma_1 \Sigma_2^* \tag{64.30} \]

where both $\Sigma_1 \in \mathbb{C}^{\bar{m} \times \bar{n}}$ and $\Sigma_2 \in \mathbb{C}^{n \times \bar{n}}$ have full column rank. Insertion of Eq. (64.30) into the equality $W_L \Sigma A^* W_R = U \Lambda S^*$ yields the following equation, after a simple manipulation,

\[ W_L \Sigma_1 T = U \tag{64.31} \]

where $T = \Sigma_2^* A^* W_R S \Lambda^{-1} \in \mathbb{C}^{\bar{n} \times \bar{n}}$ is a nonsingular transformation matrix. By using Eq. (64.31) in Eq. (64.28), we obtain:

\[ \Gamma = \Sigma_2 (\Sigma_1^* W_L^2 \Sigma_1)(\Sigma_1^* W_L^2 R_z W_L^2 \Sigma_1)^{-1} (\Sigma_1^* W_L^2 \Sigma_1) \Sigma_2^* \tag{64.32} \]

Observe that $\Gamma$ does not actually depend on $W_R$. Hence, $\hat{W}_R$ can be arbitrarily selected, as any nonsingular Hermitian matrix, without affecting the asymptotics of the DOA parameter estimates!
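The invariance of the matrix in Eq. (64.28) to the post-weight can be checked numerically. The following is only a minimal sketch under stated assumptions: the array is taken to be a half-wavelength uniform linear array, the cross-covariance of Eq. (64.6), the IV covariance and the left weight are drawn at random rather than derived from any signal model, and the helper names `hermitian_pd` and `gamma_matrix` are hypothetical and do not come from the chapter.

```python
import numpy as np

def hermitian_pd(rng, k):
    """Random Hermitian positive definite k x k matrix (illustrative only)."""
    X = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return X @ X.conj().T + k * np.eye(k)

def gamma_matrix(Sigma, A, R_z, W_L, W_R, n_bar):
    """Evaluate Eq. (64.28) using the SVD of R = W_L Sigma A^* W_R."""
    R = W_L @ Sigma @ A.conj().T @ W_R        # limiting matrix, Eq. (64.11)
    U = np.linalg.svd(R)[0][:, :n_bar]        # n_bar principal left singular vectors
    K = Sigma.conj().T @ W_L @ U              # Sigma^* W_L U
    M = U.conj().T @ W_L @ R_z @ W_L @ U      # U^* W_L R_z W_L U
    return K @ np.linalg.solve(M, K.conj().T)

rng = np.random.default_rng(0)
m, m_bar, n = 6, 5, 2                         # arbitrary sizes chosen for illustration
theta = np.deg2rad([0.0, 5.0])
# Assumed half-wavelength ULA steering matrix (m x n)
A = np.exp(1j * np.pi * np.outer(np.arange(m), np.sin(theta)))
# Random full-rank cross-covariance (m_bar x n), so n_bar = n here
Sigma = rng.standard_normal((m_bar, n)) + 1j * rng.standard_normal((m_bar, n))
R_z = hermitian_pd(rng, m_bar)                # stand-in for the IV covariance
W_L = hermitian_pd(rng, m_bar)                # arbitrary Hermitian left weight

gamma_a = gamma_matrix(Sigma, A, R_z, W_L, np.eye(m), n)             # W_R = I
gamma_b = gamma_matrix(Sigma, A, R_z, W_L, hermitian_pd(rng, m), n)  # random W_R
gap = np.max(np.abs(gamma_a - gamma_b))
print("largest entrywise difference:", gap)
```

The two evaluations should agree to within rounding error, because changing the post-weight only changes the basis of the column space spanned by the principal left singular vectors, and Eq. (64.28) depends on those vectors only through that column space.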
Concerning the choice of $\hat{W}_L$, it is easily verified that

\[ \Gamma \le \Gamma \big|_{W_L = R_z^{-1/2}} = \Sigma_2 (\Sigma_1^* R_z^{-1} \Sigma_1) \Sigma_2^* = \Sigma^* R_z^{-1} \Sigma \tag{64.33} \]

Indeed,

\[ \Sigma_1^* R_z^{-1} \Sigma_1 - (\Sigma_1^* W_L^2 \Sigma_1)(\Sigma_1^* W_L^2 R_z W_L^2 \Sigma_1)^{-1} (\Sigma_1^* W_L^2 \Sigma_1) = \Sigma_1^* R_z^{-1/2} \, \Pi^{\perp}_{R_z^{1/2} W_L^2 \Sigma_1} \, R_z^{-1/2} \Sigma_1 \tag{64.34} \]

which is obviously a nonnegative definite matrix. Hence, $W_L = R_z^{-1/2}$ maximizes $\Gamma$. Then, it follows from the expression of the matrix $H$ and the properties of the Hadamard-Schur product that this same choice of $W_L$ minimizes $H$. The conclusion is that the optimal weight $\hat{W}_L$, which yields the best limiting accuracy, is

\[ \hat{W}_L = \hat{R}_z^{-1/2} \tag{64.35} \]

The (minimum) covariance matrix $H$, corresponding to the above choice, is given by

\[ H_o = \frac{1}{2N} \left\{ \mathrm{Re} \left[ \left( D^* R_y^{-1/2} \Pi^{\perp}_{R_y^{-1/2} A} R_y^{-1/2} D \right) \odot \left( \Sigma^* R_z^{-1} \Sigma \right)^T \right] \right\}^{-1} \tag{64.36} \]

Remark: It is worth noting that $H_o$ monotonically decreases as $\bar{m}$ (the dimension of $z(t)$) increases. The proof of this claim is similar to the proof of the corresponding result in [9], Complement C8.5. Hence, as could be intuitively expected, one should use all available instruments (spatial and/or temporal) to obtain maximal theoretical accuracy. However, practice has shown that too large a dimension of the IV vector may in fact decrease the empirically observed accuracy. This phenomenon can be explained by the fact that increasing $\bar{m}$ means that a longer data set is necessary for the asymptotic results to be valid.

Optimal IV-SSF Criteria

Fortunately, the criterion (64.23)-(64.24) can be expressed in a functional form that depends on the indeterminate $\theta$ in an explicit way (recall that, for most cases, the dependence of $B$ in Eq. (64.23) on $\theta$ is not available in explicit form). By using the following readily verified equality [28],

\[ \mathrm{tr}(A X^* B Y) = [\mathrm{vec}(X)]^* [A^T \otimes B] [\mathrm{vec}(Y)] \tag{64.37} \]

which holds for any conformable matrices $A$, $X$, $B$, and $Y$, one can write Eq. (64.23) as (footnote 5):

\[ f(\theta) = \mathrm{tr} \left\{ \left[ (\hat{W}_L \hat{U} \hat{\Lambda}^{-1})^* \hat{R}_z (\hat{W}_L \hat{U} \hat{\Lambda}^{-1}) \right]^{-1} \hat{S}^* \hat{W}_R^{-1} B (B^* \hat{R}_y B)^{-1} B^* \hat{W}_R^{-1} \hat{S} \right\} \tag{64.38} \]

However, observe that

\[ B (B^* \hat{R}_y B)^{-1} B^* = \hat{R}_y^{-1/2} \, \Pi_{\hat{R}_y^{1/2} B} \, \hat{R}_y^{-1/2} = \hat{R}_y^{-1/2} \, \Pi^{\perp}_{\hat{R}_y^{-1/2} A} \, \hat{R}_y^{-1/2} \tag{64.39} \]

Inserting Eq. (64.39) into Eq. (64.38) yields:

\[ f(\theta) = \mathrm{tr} \left[ \hat{\Lambda} (\hat{U}^* \hat{W}_L \hat{R}_z \hat{W}_L \hat{U})^{-1} \hat{\Lambda} \hat{S}^* \hat{W}_R^{-1} \hat{R}_y^{-1/2} \, \Pi^{\perp}_{\hat{R}_y^{-1/2} A} \, \hat{R}_y^{-1/2} \hat{W}_R^{-1} \hat{S} \right] \tag{64.40} \]

which is an explicit function of $\theta$. Insertion of the optimal choice of $W_L$ into Eq. (64.40) leads to a further simplification of the criterion, as seen below.

Owing to the arbitrariness in the choice of $\hat{W}_R$, there exists an infinite class of optimal IV-SSF criteria. In what follows, we consider two members of this class. Let

\[ \hat{W}_R = \hat{R}_y^{-1/2} \tag{64.41} \]

Insertion of Eq. (64.41), along with Eq. (64.35), into Eq. (64.40) yields the following criterion function:

\[ f_{WW}(\theta) = \mathrm{tr} \left( \Pi^{\perp}_{\hat{R}_y^{-1/2} A} \, \tilde{S} \tilde{\Lambda}^2 \tilde{S}^* \right) \tag{64.42} \]

where $\tilde{S}$ and $\tilde{\Lambda}$ are made from the principal right singular vectors and singular values of the matrix

\[ \tilde{R} = \hat{R}_z^{-1/2} \hat{R}_{zy} \hat{R}_y^{-1/2} \tag{64.43} \]

(with $\hat{R}_{zy}$ defined in an obvious way). The function (64.42) is the UNCLE (spatial IV-SSF) criterion of Wong and Wu [12, 13].

Next, choose $\hat{W}_R$ as

\[ \hat{W}_R = I \tag{64.44} \]

The corresponding criterion function is

\[ f_{VSO}(\theta) = \mathrm{tr} \left( \Pi^{\perp}_{\hat{R}_y^{-1/2} A} \, \hat{R}_y^{-1/2} \bar{S} \bar{\Lambda}^2 \bar{S}^* \hat{R}_y^{-1/2} \right) \tag{64.45} \]

where $\bar{S}$ and $\bar{\Lambda}$ are made from the principal singular pairs of

\[ \bar{R} = \hat{R}_z^{-1/2} \hat{R}_{zy} \tag{64.46} \]

The function (64.45) above is recognized as the optimal (temporal) IV-SSF criterion of Viberg et al. [14].

An important consequence of the previous discussion is that the DOA estimation methods of [12, 13] and [14], respectively, which were derived in seemingly unrelated contexts and by means of somewhat different approaches, are in fact asymptotically equivalent when used under the same conditions. These two methods have very similar computational burdens, which can be seen by comparing Eqs. (64.42) and (64.43) with Eqs. (64.45) and (64.46). Also, their finite-sample properties appear to be rather similar, as demonstrated in the simulation examples.

Numerical algorithms for the minimization of the type of criterion function
associated with the optimal IV-SSF methods are discussed in [17]. Some suggestions are also given in the summary below.

Footnote 5: To within a multiplicative constant.

64.5 Algorithm Summary

The estimation method presented in this section is useful for direction finding in the presence of noise of unknown spatial color. The underlying assumptions and the algorithm can be summarized as follows:

Assumptions: A batch of $N$ samples of the array output $y(t)$, that can accurately be described by the model (64.1) and (64.2), is available. The array is calibrated in the sense that $a(\theta)$ is a known function of its argument $\theta$. In addition, $N$ samples of the IV vector $z(t)$, fulfilling Eqs. (64.4) through (64.8), are given. In words, the IV vector is uncorrelated with the noise but well correlated with the signal. In practice, $z(t)$ may be taken from a second subarray, a delayed version of $y(t)$, or a reference (pilot) signal. In the first case, the second subarray need not be calibrated.

Algorithm: In the following we summarize the UNCLE version (64.42) of the algorithm. First, compute $\tilde{R}$ from the sample statistics of $y(t)$ and $z(t)$, according to

\[ \tilde{R} = \hat{R}_z^{-1/2} \hat{R}_{zy} \hat{R}_y^{-1/2} \]

From a numerical point of view, this is best done using QR factorization. Next, partition the singular value decomposition of $\tilde{R}$ according to

\[ \tilde{R} = [\,\tilde{U} \;\; ?\,] \begin{bmatrix} \tilde{\Lambda} & O \\ O & ? \end{bmatrix} \begin{bmatrix} \tilde{S}^* \\ ? \end{bmatrix} \]

where $\tilde{S}$ contains the $\bar{n}$ principal right singular vectors and the diagonal matrix $\tilde{\Lambda}$ the corresponding singular values. If $\bar{n}$ is unknown, it can be estimated as the number of significant singular values. Finally, compute the DOA estimates as the minimizing arguments of the criterion function

\[ f_{WW}(\theta) = \mathrm{tr} \left( \Pi^{\perp}_{\hat{R}_y^{-1/2} A} \, \tilde{S} \tilde{\Lambda}^2 \tilde{S}^* \right) \]

using $n = \bar{n}$. If the minimum value of the criterion is "large", it is an indication that more than $\bar{n}$ sources are present.

In the general case, a numerical search must be performed to find the minimum. The leastsq implementation in Matlab, which uses the Levenberg-Marquardt or Gauss-Newton techniques [30], is a possible choice. To initialize the search, one can use the alternating projection procedure [31]. In short, a grid search over $f_{WW}(\theta)$ is first performed assuming $n = 1$, i.e., using $f_{WW}(\theta_1)$. The resulting DOA estimate $\hat{\theta}_1$ is then "projected out" from the data, and a grid search for the second DOA is performed using the modified criterion $f_2(\theta_2)$. The procedure is repeated until initial estimates are available for all DOAs. The $k$th modified criterion can be expressed as

\[ f_k(\theta_k) = - \frac{ a^*(\theta_k) \, \Pi^{\perp}_{\hat{R}_y^{-1/2} \hat{A}_{k-1}} \, \tilde{S} \tilde{\Lambda}^2 \tilde{S}^* \, \Pi^{\perp}_{\hat{R}_y^{-1/2} \hat{A}_{k-1}} \, a(\theta_k) }{ a^*(\theta_k) \, \Pi^{\perp}_{\hat{R}_y^{-1/2} \hat{A}_{k-1}} \, a(\theta_k) } \]

where

\[ \hat{A}_k = A(\hat{\boldsymbol{\theta}}_k), \qquad \hat{\boldsymbol{\theta}}_k = [\hat{\theta}_1, \ldots, \hat{\theta}_k]^T \]

The initial estimate of $\theta_k$ is taken as the minimizing argument of $f_k(\theta_k)$. Once all DOAs have been initialized one can, in principle, continue the alternating projection minimization in the same way. However, the procedure usually converges rather slowly, and therefore it is recommended instead to switch to a Newton-type search as indicated above. Empirical investigations in [17, 32], using similar subspace fitting criteria, have indicated that this indeed leads to the global minimum with high probability.

64.6 Numerical Examples

This section reports the results of a comparative performance study based on Monte-Carlo simulations. The scenarios are identical to those presented in
[33] (spatial IV-SSF) and [14] (temporal IV-SSF). The plots presented below contain theoretical standard deviations of the DOA estimates along with empirically observed RMS (root mean square) errors. The former are obtained from Eq. (64.36), whereas the latter are based on 512 independent noise and signal realizations. The minimizers of Eq. (64.42) (UNCLE) and Eq. (64.45) (IV-SSF) are computed using a modified Gauss-Newton search initialized at the true DOAs (since here we are interested only in the quality of the global optimum). DOA estimates that are more than 5° off the true value are declared failures, and not included in the empirical RMS calculation. If the number of failures exceeds 30%, no RMS value is calculated.

In all scenarios, two planar wavefronts arrive from DOAs 0° and 5° relative to the array broadside. Unless otherwise stated, the emitter signals are zero-mean Gaussian with signal covariance matrix $P = I$. Only the estimation statistics for $\theta_1 = 0°$ are shown in the plots below, the ones for $\theta_2$ being similar. The array output (both subarrays in the spatial IV scenario) is corrupted by additive zero-mean temporally white Gaussian noise. The noise covariance matrix has $kl$th element

\[ Q_{kl} = \sigma^2 \, 0.9^{|k-l|} \, e^{j \frac{\pi}{2} (k-l)} \tag{64.47} \]

The noise level $\sigma^2$ is adjusted to give a desired SNR, defined as $P_{11}/\sigma^2 = P_{22}/\sigma^2$. This noise is reminiscent of a strong signal cluster at the location $\theta = 30°$.

EXAMPLE 64.4: Spatial IVM

In the first example, a ULA of 16 elements and half-wavelength separation is employed. The first $m = 8$ contiguous sensors form a calibrated subarray, whereas the outputs of the last $\bar{m} = 8$ sensors are used as instrumental variables, and these sensors could therefore be uncalibrated. Letting $\tilde{y}(t)$ denote the 16-element array output, we thus take

\[ y(t) = \tilde{y}_{1:8}(t), \qquad z(t) = \tilde{y}_{9:16}(t) \]

Both subarray outputs are perturbed by independent additive noise vectors, both having $8 \times 8$ covariance matrices given by Eq. (64.47). In this example, the emitter signals are assumed to be temporally
white. In Fig. 64.1, the theoretical and empirical RMS errors are displayed vs. the number of samples. The SNR is fixed at dB. Figure 64.2 shows the theoretical and empirical RMS errors vs. the SNR. The number of snapshots is here fixed to $N = 100$.

To demonstrate the applicability to situations involving highly correlated signals, Fig. 64.2 is repeated but using the signal covariance

\[ P = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \]

The resulting RMS errors are plotted with their theoretical values in Fig. 64.3. By comparing Figs. 64.2 and 64.3, we see that the methods are not insensitive to the signal correlation. However, the observed RMS errors agree well with the theoretically predicted values, and in spatial scenarios this is the best possible RMS performance (the empirical RMS error appears to be lower than the CRB for low SNR; however, this is at the price of a notable bias).

In conclusion, no significant performance difference is observed between the two IV-SSF versions. The observed RMS errors of both methods follow the theoretical curves quite closely, even in fairly difficult scenarios involving closely spaced DOAs and highly correlated signals.

FIGURE 64.1: RMS error of DOA estimate vs. number of snapshots. Spatial IVM. The solid line is the theoretical standard deviation.

FIGURE 64.2: RMS error of DOA estimate vs. SNR. Spatial IVM. The solid line is the theoretical standard deviation.

FIGURE 64.3: RMS error of DOA estimate vs. SNR. Spatial IVM. Coherent signals. The solid line is the theoretical standard deviation.

FIGURE 64.4: RMS error of DOA estimate vs. number of snapshots. Temporal IVM. The solid line is the theoretical standard deviation.

EXAMPLE 64.5: Temporal IVM

In this example, the temporal IV approach is investigated. The array is a 6-element ULA of half-wavelength interelement spacing. The real and imaginary parts of both signals are generated as uncorrelated first-order complex AR processes with identical spectra. The poles of the driving
AR processes are 0.6. In this case, $y(t)$ is the array output, whereas the instrumental variable vector is chosen as $z(t) = y(t-1)$. In Fig. 64.4, we show the theoretical and empirical RMS errors vs. the number of snapshots. The SNR is fixed at 10 dB. Figure 64.5 displays the theoretical and empirical RMS errors vs. the SNR. The number of snapshots is here fixed at $N = 100$.

FIGURE 64.5: RMS error of DOA estimate vs. SNR. Temporal IVM. The solid line is the theoretical standard deviation.

The figures indicate a slight performance difference among the methods in temporal scenarios, namely when the number of samples is small but the SNR is relatively high. However, no definite conclusions can be drawn regarding this somewhat unexpected phenomenon from our limited simulation study.

64.7 Concluding Remarks

The main points made by the present contribution can be summarized as follows:

• The spatial and temporal IV-SSF approaches can be treated in a unified manner under general conditions. In fact, a general IV-SSF approach using both spatial and temporal instruments is also possible.
• The optimization of the DOA parameter estimation accuracy, for fixed weights $\hat{W}_L$ and $\hat{W}_R$, can be most conveniently carried out using the ABC theory. The resulting derivations are more concise than those based on other analysis techniques.
• The column (or post-)weight $\hat{W}_R$ has no effect on the asymptotics.
• An important corollary of the above-mentioned result is that the optimal IV-SSF methods of [12, 13] and, respectively, [14] are asymptotically equivalent when used on the same data.

In closing this section, we reiterate the fact that the IV-SSF approaches can deal with coherent signals, handle noise fields with general (unknown) spatial correlations, and, in their spatial versions, can make use of outputs from completely uncalibrated sensors. They are also comparatively simple from a computational standpoint, since no noise modelling is required. Additionally, the optimal IV-SSF methods
provide highly accurate DOA estimates. More exactly, in spatial IV scenarios these DOA estimation methods can be shown to be asymptotically statistically efficient under weak conditions [33]. In temporal scenarios, they are no longer exactly statistically efficient, yet their accuracy is quite close to the best possible one [14]. All these features and properties should make the optimal IV-SSF approach appealing for practical array signal processing applications. The IV-SSF approach can also be applied, with some modifications, to system identification problems [34] and is hence expected to play a role in that type of application as well.

References

[1] Li, F. and Vaccaro, R.J., Performance degradation of DOA estimators due to unknown noise fields, IEEE Trans. SP, SP-40(3), 686–689, March 1992.
[2] Viberg, M., Sensitivity of parametric direction finding to colored noise fields and undermodeling, Signal Processing, 34(2), 207–222, Nov. 1993.
[3] Swindlehurst, A. and Kailath, T., A performance analysis of subspace-based methods in the presence of model errors: Part II, Multidimensional algorithms, IEEE Trans. on SP, SP-41, 2882–2890, Sept. 1993.
[4] Böhme, J.F. and Kraus, D., On least squares methods for direction of arrival estimation in the presence of unknown noise fields, Proc. ICASSP 88, 2833–2836, New York, 1988.
[5] Le Cadre, J.P., Parametric methods for spatial signal processing in the presence of unknown colored noise fields, IEEE Trans. on ASSP, ASSP-37(7), 965–983, July 1989.
[6] Nagesha, V. and Kay, S., Maximum likelihood estimation for array processing in colored noise, Proc. ICASSP 93, 4, 240–243, Minneapolis, MN, 1993.
[7] Ye, H. and DeGroat, R., Maximum likelihood DOA and unknown colored noise estimation with asymptotic Cramér-Rao bounds, Proc. 27th Asilomar Conf. Sig., Syst., Comput., 1391–1395, Pacific Grove, CA, Nov. 1993.
[8] Söderström, T. and Stoica, P., Instrumental Variable Methods for System Identification, Springer-Verlag, Berlin, 1983.
[9] Söderström, T. and
Stoica, P., System Identification, Prentice-Hall, London, U.K., 1989.
[10] Moses, R.L. and Beex, A.A., Instrumental variable adaptive array processing, IEEE Trans. on AES, AES-24, 192–202, March 1988.
[11] Stoica, P., Viberg, M. and Ottersten, B., Instrumental variable approach to array processing in spatially correlated noise fields, IEEE Trans. SP, SP-42, 121–133, Jan. 1994.
[12] Wu, Q. and Wong, K.M., UN-MUSIC and UN-CLE: An application of generalized canonical correlation analysis to the estimation of the directions of arrival of signals in unknown correlated noise, IEEE Trans. SP, 42, 2331–2341, Sept. 1994.
[13] Wu, Q. and Wong, K.M., Estimation of DOA in unknown noise: Performance analysis of UN-MUSIC and UN-CLE, and the optimality of CCD, IEEE Trans. SP, 43, 454–468, Feb. 1995.
[14] Viberg, M., Stoica, P. and Ottersten, B., Array processing in correlated noise fields based on instrumental variables and subspace fitting, IEEE Trans. SP, 43, 1187–1199, May 1995.
[15] Stoica, P., Viberg, M., Wong, M. and Wu, Q., Maximum-likelihood bearing estimation with partly calibrated arrays in spatially correlated noise fields, IEEE Trans. on SP, 44, 888–899, Apr. 1996.
[16] Schmidt, R.O., Multiple emitter location and signal parameter estimation, IEEE Trans. on AP, 34, 276–280, Mar. 1986.
[17] Ottersten, B., Viberg, M., Stoica, P. and Nehorai, A., Exact and large sample ML techniques for parameter estimation and detection in array processing, in Radar Array Processing, Haykin, Litva, and Shepherd, Eds., Springer-Verlag, Berlin, 1993, 99–151.
[18] Wax, M. and Ziskind, I., On unique localization of multiple sources by passive sensor arrays, IEEE Trans. on ASSP, ASSP-37(7), 996–1000, July 1989.
[19] Hudson, J.E., Adaptive Array Principles, Peter Peregrinus, 1981.
[20] Compton, R.T., Jr., Adaptive Antennas, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[21] Lee, W.C.Y., Mobile Communications Design Fundamentals, 2nd ed., John Wiley & Sons, New York, 1993.
[22] Agee, B.G., Schell, S.V.
and Gardner, W.A., Spectral self-coherence restoral: A new approach to blind adaptive signal extraction using antenna arrays, Proc. IEEE, 78, 753–767, Apr. 1990.
[23] Shamsunder, S. and Giannakis, G., Signal selective localization of non-Gaussian cyclostationary sources, IEEE Trans. SP, 42, 2860–2864, Oct. 1994.
[24] Zhang, Q.T. and Wong, K.M., Information theoretic criteria for the determination of the number of signals in spatially correlated noise, IEEE Trans. SP, SP-41(4), 1652–1663, Apr. 1993.
[25] Wu, Q. and Wong, K.M., Determination of the number of signals in unknown noise environments, IEEE Trans. SP, 43, 362–365, Jan. 1995.
[26] Golub, G.H. and Van Loan, C.F., Matrix Computations, 2nd ed., Johns Hopkins University Press, Baltimore, MD, 1989.
[27] Stoica, P., Viberg, M., Wong, M. and Wu, Q., A unified instrumental variable approach to direction finding in colored noise fields: Report version, Technical Report CTH-TE-32, Chalmers University of Technology, Gothenburg, Sweden, July 1995.
[28] Brewer, J.W., Kronecker products and matrix calculus in system theory, IEEE Trans. on CAS, 25(9), 772–781, Sept. 1978.
[29] Porat, B., Digital Processing of Random Signals, Prentice-Hall, Englewood Cliffs, NJ, 1993.
[30] Gill, P.E., Murray, W. and Wright, M.H., Practical Optimization, Academic Press, London, 1981.
[31] Ziskind, I. and Wax, M., Maximum likelihood localization of multiple sources by alternating projection, IEEE Trans. on ASSP, ASSP-36, 1553–1560, Oct. 1988.
[32] Viberg, M., Ottersten, B. and Kailath, T., Detection and estimation in sensor arrays using weighted subspace fitting, IEEE Trans. SP, SP-39(11), 2436–2449, Nov. 1991.
[33] Stoica, P., Viberg, M., Wong, M. and Wu, Q., Optimal direction finding with partly calibrated arrays in spatially correlated noise fields, Proc. 28th Asilomar Conf. Sig., Syst., Comput., Pacific Grove, CA, Oct. 1994.
[34] Cedervall, M. and Stoica, P., System identification from noisy measurements by using instrumental variables and subspace fitting, Proc. ICASSP 95,
1713–1716, Detroit, MI, May 1995.
[35] Ljung, L., System Identification: Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, 1987.

Appendix A: Introduction to IV Methods

In this appendix we give a brief introduction to instrumental variable methods in their original context, which is time series analysis. Let $y(t)$ be a real-valued scalar time series, modeled by the auto-regressive moving average (ARMA) equation

$$ y(t) + a_1 y(t-1) + \cdots + a_p y(t-p) = e(t) + b_1 e(t-1) + \cdots + b_q e(t-q) \tag{64.48} $$

Here, $e(t)$ is assumed to be a stationary white noise. Suppose we are given measurements of $y(t)$ for $t = 1, \ldots, N$ and wish to estimate the AR parameters $a_1, \ldots, a_p$. The roots of the AR polynomial $z^p + a_1 z^{p-1} + \cdots + a_p$ are the system poles, and their estimation is of importance, for instance, for stability monitoring. Also, the first step of any "linear" method for ARMA modeling is to find the AR parameters. The optimal way to approach the problem requires a non-linear search over the entire parameter set $\{a_k\}_{k=1}^{p}$, $\{b_k\}_{k=1}^{q}$, using a maximum likelihood or a prediction error criterion [9, 35]. However, in many cases this is computationally prohibitive, and in addition the "noise model" (the MA parameters) is sometimes of less interest per se. In contrast, the IV approach produces estimates of the AR part from a solution of a (possibly overdetermined) linear system of equations as follows. Rewrite Eq. (64.48) as

$$ y(t) = \varphi^T(t)\,\theta + v(t), \tag{64.49} $$

where

$$ \varphi(t) = [-y(t-1), \ldots, -y(t-p)]^T \tag{64.50} $$

$$ \theta = [a_1, \ldots, a_p]^T \tag{64.51} $$

$$ v(t) = e(t) + b_1 e(t-1) + \cdots + b_q e(t-q) \tag{64.52} $$

Note that Eq. (64.49) is a linear regression in the unknown parameter $\theta$. A standard least-squares (LS) estimate is obtained by minimizing the LS criterion

$$ V_{LS}(\theta) = E\left[\left(y(t) - \varphi^T(t)\,\theta\right)^2\right] \tag{64.53} $$

Equating the derivative of Eq. (64.53) (w.r.t. $\theta$) to zero gives the so-called normal equations

$$ E\left[\varphi(t)\varphi^T(t)\right]\hat{\theta} = E\left[\varphi(t)\,y(t)\right] \tag{64.54} $$

resulting in

$$ \hat{\theta} = R_{\varphi\varphi}^{-1} R_{\varphi y} = \left(E\left[\varphi(t)\varphi^T(t)\right]\right)^{-1} E\left[\varphi(t)\,y(t)\right] \tag{64.55} $$

Inserting Eq. (64.49) into Eq. (64.55) shows that

$$ \hat{\theta} = \theta + R_{\varphi\varphi}^{-1} R_{\varphi v} \tag{64.56} $$

In case $q = 0$ (i.e., $y(t)$ is an AR process), we have $v(t) = e(t)$. Because $\varphi(t)$ and $e(t)$ are uncorrelated, Eq. (64.56) shows that the LS method produces a consistent estimate of $\theta$. However, when $q > 0$, $\varphi(t)$ and $v(t)$ are in general correlated, implying that the LS method gives biased estimates.

From the above we conclude that the problem with the LS estimate in the ARMA case is that the regression vector $\varphi(t)$ is correlated with the "equation error noise" $v(t)$. An instrumental variable vector $\zeta(t)$ is one that is uncorrelated with $v(t)$, while still "sufficiently correlated" with $\varphi(t)$. The most natural choice in the ARMA case (provided the model orders are known) is

$$ \zeta(t) = \varphi(t-q) \tag{64.57} $$

which clearly fulfills both requirements. Now, multiply both sides of the linear regression model (64.49) by $\zeta(t)$ and take expectation, resulting in the "IV normal equations"

$$ E\left[\zeta(t)\,y(t)\right] = E\left[\zeta(t)\varphi^T(t)\right]\theta \tag{64.58} $$

The IV estimate is obtained simply by solving the linear system of equations (64.58), but with the unknown cross-covariance matrices $R_{\zeta\varphi}$ and $R_{\zeta y}$ replaced by their corresponding estimates using time averaging. Since the latter are consistent, so are the IV estimates of $\theta$. The method is also referred to as the extended Yule-Walker approach in the literature. Its finite sample properties may often be improved upon by increasing the dimension of the IV vector, which means that Eq. (64.58) must be solved in an LS sense, and also by appropriately pre-filtering the IV vector. This is quite similar to the optimal weighting proposed herein.

In order to make the connection to the IV-SSF method more clear, a slightly modified version of Eq. (64.58) is presented. Let us rewrite Eq. (64.58) as follows:

$$ R_{\zeta\phi} \begin{bmatrix} 1 \\ \theta \end{bmatrix} = 0, \tag{64.59} $$

where

$$ R_{\zeta\phi} = E\left\{\zeta(t)\left[\,y(t),\; -\varphi^T(t)\,\right]\right\} \tag{64.60} $$

The relation (64.59) shows that $R_{\zeta\phi}$ is singular, and that $\theta$ can be computed from a suitably
normalized vector in its one-dimensional nullspace. However, when $R_{\zeta\phi}$ is estimated using a finite number of data, it will with probability one have full rank. The best (in a least squares sense) low-rank approximation of $R_{\zeta\phi}$ is obtained by truncating its singular value decomposition. A natural estimate of $\theta$ can therefore be obtained from the right singular vector of $\hat{R}_{\zeta\phi}$ that corresponds to the minimum singular value. The proposed modification is essentially an IV-SSF version of the extended Yule-Walker method, although the SSF step is trivial because the parameter vector of interest can be computed directly from the estimated subspace.

Turning to the array processing problem, the counterpart of Eq. (64.49) is the (Hermitian transposed) data model (64.1):

$$ y^*(t) = x^*(t)A^* + e^*(t) $$

Note that this is a non-linear regression model, owing to the non-linear dependence of $A$ on $\theta$. Also observe that $y(t)$ is here a complex vector, as opposed to the real scalar $y(t)$ in Eq. (64.49). Similar to Eq. (64.58), the IV normal equations are given by

$$ E\left[z(t)\,y^*(t)\right] = E\left[z(t)\,x^*(t)\right] A^* \tag{64.61} $$

under the assumption that the IV vector $z(t)$ is uncorrelated with the noise $e(t)$. Unlike the standard IV problem, the "regressor" $x(t)$ [corresponding to $\varphi(t)$ in Eq. (64.49)] cannot be measured. Thus, it is not possible to get a direct estimate of the "regression variable" $A$. However, its range space, or at least a subset thereof, can be computed from the principal right singular vectors. In the finite sample case, the performance can be improved by using row and column weighting, which leads to the weighted IV normal equations (64.11). The exact relation involving the principal right singular vectors is Eq. (64.14), and two SSF formulations for revealing $\theta$ from the computed signal subspace are given in Eqs. (64.16) and (64.17).
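The scalar ARMA example of this appendix can be checked numerically. The following is a minimal sketch, not the authors' implementation: it simulates an ARMA(2,1) process and compares the LS estimate of Eq. (64.55), the IV (extended Yule-Walker) estimate obtained by solving Eq. (64.58) in an LS sense, and the estimate taken from the minimum right singular vector of the sample version of the matrix in Eq. (64.60). The function names `simulate_arma`, `lagged` and `estimate_ar`, the parameter values, the sample size and the choice of m = 4 delayed outputs as instruments are all assumptions made for illustration only.

```python
import numpy as np

def simulate_arma(a, b, n, rng, burn_in=500):
    """Simulate y(t) + a1*y(t-1) + ... + ap*y(t-p) = e(t) + b1*e(t-1) + ... + bq*e(t-q)."""
    p, q = len(a), len(b)
    total = n + burn_in
    e = rng.standard_normal(total)              # unit-variance white noise e(t)
    y = np.zeros(total)
    for t in range(max(p, q), total):
        ar = sum(a[k] * y[t - 1 - k] for k in range(p))
        ma = sum(b[k] * e[t - 1 - k] for k in range(q))
        y[t] = e[t] + ma - ar
    return y[burn_in:]                          # drop the start-up transient

def lagged(y, first_lag, count, t_idx):
    """Rows are [-y(t-first_lag), ..., -y(t-first_lag-count+1)] for each t in t_idx."""
    return -np.column_stack([y[t_idx - first_lag - k] for k in range(count)])

def estimate_ar(y, p, q, m):
    """Return the LS, IV and SVD-based estimates of theta = [a1, ..., ap]."""
    t_idx = np.arange(q + m, len(y))            # all required lags exist (needs m >= p)
    Phi = lagged(y, 1, p, t_idx)                # phi(t), Eq. (64.50)
    Z = lagged(y, 1 + q, m, t_idx)              # zeta(t); m = p reproduces Eq. (64.57)
    yt = y[t_idx]
    n = len(t_idx)

    # LS estimate, sample version of Eq. (64.55); biased when q > 0
    theta_ls = np.linalg.lstsq(Phi, yt, rcond=None)[0]

    # IV estimate: sample version of Eq. (64.58), solved in the LS sense when m > p
    R_zphi = Z.T @ Phi / n
    R_zy = Z.T @ yt / n
    theta_iv = np.linalg.lstsq(R_zphi, R_zy, rcond=None)[0]

    # SVD variant: sample version of Eq. (64.60); the right singular vector of the
    # smallest singular value is proportional to [1, theta], cf. Eq. (64.59)
    R_aug = Z.T @ np.column_stack([yt, -Phi]) / n
    v = np.linalg.svd(R_aug)[2][-1]
    theta_svd = v[1:] / v[0]
    return theta_ls, theta_iv, theta_svd

rng = np.random.default_rng(0)
a_true, b_true = [-1.5, 0.7], [0.8]             # illustrative values, stable AR part
y = simulate_arma(a_true, b_true, n=20000, rng=rng)
theta_ls, theta_iv, theta_svd = estimate_ar(y, p=2, q=1, m=4)
print("true:", a_true)
print("LS  :", theta_ls)
print("IV  :", theta_iv)
print("SVD :", theta_svd)
```

With a nonzero MA part, the LS estimate should show a visible bias that does not shrink as the sample size grows, whereas the two IV-based estimates should approach the true AR parameters. The instruments here are unweighted; the optimal weighting discussed in the chapter is not part of this sketch.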
