# **Digital Multicarrier Demodulator for Regenerative Communication Satellites** (\*)

Enrico Del Re, Romano Fantacci

Dipartimento di Ingegneria Elettronica, Università di Firenze, Via S. Marta, 3, 50139 Firenze, Italy

**Abstract.** Regenerative, on-board processing FDMA/TDM payloads have recently been proposed as valid candidates for user-oriented satellite systems. Both business traffic for fixed service and mobile satellite systems can potentially take advantage of such payloads, which require multicarrier demodulation (MCD) of the uplink FDMA carriers to recover the individual modulating streams; these are in turn TDM-formatted to modulate a single downlink carrier. Two main functions are therefore implemented by an MCD: demultiplexing (DEMUX) and demodulation (DEMOD). We focus here only on a digital implementation of the MCD, in view of its advantages: flexibility, better performance and possible VLSI integration. This paper expands on the most suitable digital techniques to implement an on-board MCD. In particular, the impact of network clock synchronization on the overall MCD complexity is investigated in detail. The digital architecture of the proposed MCD can be adapted to different digital modulation techniques. However, we focus here only on the application to QPSK signals, considering the interest of this modulation scheme for digital satellite communications. Theoretical results and computer simulations are reported to evaluate the performance degradation of the proposed MCD, including the effects of a finite-arithmetic implementation.

#### 1. INTRODUCTION

One of the most attractive architectures recently proposed for business-service satellite systems envisages different access methods in the two links [1], i.e. Frequency Division Multiple Access (FDMA) in the uplink and Time Division Multiplexing (TDM) in the downlink.
In this way, the user uplink RF power requirements are proportional to the individual bandwidths (unlike TDMA, which requires power levels proportional to the transponder bandwidth) and, to some extent, network synchronization procedures are not required. In the downlink, TDM permits the on-board High Power Amplifier (HPA) to be saturated or only slightly backed off, owing to the absence of multicarrier intermodulation. Such an architecture, which attempts to optimize the system RF power resources, requires a non-trivial on-board conversion of the access format from FDMA to TDM. The feasibility of this approach therefore depends on efficient means of translating between the two multiple access formats on board the satellite. The implementation complexity of the on-board system (including the VLSI design) and its power consumption are, of course, of primary concern.

The on-board processing system receives an input FDMA signal and supplies an output to interface the TDM link. It must therefore separate each individual radio channel, demodulate it and switch it correctly to the appropriate downlink channel. An appropriate name for the on-board processing system performing the first two operations is the "multicarrier demodulator" (MCD). Two main functions are implemented by an MCD: demultiplexing (DEMUX) and demodulation (DEMOD). The focus here is only on a digital implementation of the MCD because it promises several advantages, such as flexibility, VLSI integrability and better efficiency [22].

The DEMUX separates the individual input FDMA channels and supplies each of them, down-converted to baseband, to a demodulator input for the appropriate symbol decision. In principle, therefore, its operation corresponds to a bank of bandpass filters, each followed by a down-converter. In a digital implementation the down-conversion can be conveniently performed by a decimation operation.
Moreover, the direct implementation of a bank of digital filters is not the most convenient solution. This paper deals with efficient approaches to the digital implementation of the DEMUX, namely the per-channel method, the block method and the multistage technique [2]-[8].

Coherent demodulation is usually employed in satellite communications in order to achieve the required bit-error-rate with an acceptable signal-to-noise ratio. It is implicitly assumed that the channel does not introduce distortions on the transmitted FDM signal. The performance of a coherent demodulator depends rather critically on the design of the synchronization circuit employed to estimate the carrier phase and the bit synchronization reference from the received signal. Carrier recovery can be achieved in different ways: the M-th power method, the Costas loop and the decision-directed feedback circuit. With M = 2, the M-th power method is known as a squaring loop [21]. Clock recovery is usually achieved by performing a nonlinear operation on the received signal, because the signal does not contain discrete spectral lines at the clock frequency. Clock recovery can occur subsequent to or coincident with carrier recovery. In the former case, the recovery circuits operate on the demodulated (not necessarily detected) baseband waveform, whereas in the latter case they operate directly on the modulated carrier signal.

The complexity of the clock recovery circuit could be drastically reduced if some form of network clock synchronization were implemented. This requirement is even more stringent in the presence of a high number of low data rate carriers, whose clock misalignment at the satellite input may drastically complicate the processing payload. The carrier recovery circuit could also be simplified by using differential detection schemes, when applicable.

<sup>(\*)</sup> Work partly supported by ESTEC contract 6096/84/NL/GM(SC) and Ministero Pubblica Istruzione, Italy.
In this study we shall consider first the most general structure, where both carrier and clock recovery circuits are required for each demultiplexed channel. Then the benefits introduced by the possibility of network synchronization will be investigated. The digital architecture of the studied receiver can be adapted to different digital modulation techniques; nevertheless, we focus here only on the application to QPSK signals, considering the interest of this modulation scheme in satellite digital communications.

Suitable approaches to receiver synchronization are studied. For the implementation of the carrier recovery circuit, a nonlinear estimation technique [10] is considered. This approach has been selected because it achieves good estimation accuracy, is less sensitive to a finite-arithmetic implementation and requires a short and well-defined acquisition time. Moreover, it can be used with continuous as well as burst-mode carriers, and a certain degree of integration of the demultiplexer and demodulator functions is conceivable. The performance of the squared nonlinear estimation method has also been studied in the presence of a frequency uncertainty. In particular, it is shown that the number of QPSK symbols used to compute the carrier phase estimate can be suitably selected to achieve better performance. The clock recovery circuit is assumed to be implemented according to the clock estimation method proposed by Gardner [11]. This approach has been selected because its estimation operations are independent of the carrier phase and some degree of integration of the demultiplexer and clock recovery functions is possible.

In conclusion, the multicarrier demodulator described in this paper represents a complete solution for a processing system interfacing FDMA and TDM links on board advanced satellite communication systems, suitable for implementation by VLSI digital circuits.

#### 2. DEMULTIPLEXER

From a general point of view, the demultiplexing of uplink FDMA channels (to be subsequently demodulated) can be implemented using three different approaches: (i) per-channel, (ii) block and (iii) multistage.

The per-channel method performs the demultiplexing operation essentially by means of a bank of bandpass filters. Selection of each input signal and its translation to a low-frequency band are achieved by a combined operation of digital filtering and decimation (i.e. reduction of the signal sampling rate).

The block method implements the demultiplexing by using a set of digital filters (a "polyphase" network) followed by a "block" processor, usually of the FFT type, that processes the output signals of all the digital filters together. This procedure of processing the input FDMA signal to obtain an output TDM one derives directly from studies on "transmultiplexers" [4]-[8], which were originally conceived for the TDM-FDM transformation, to convert PCM (time division) signals into analog (FDM) channels and vice versa.

The multistage method can be considered as a binary tree of two-channel demultiplexers. Each demultiplexing stage performs a lowpass and a highpass filtering with subsequent decimation by a factor of two [8].

**Fig. 1** - Demux by the Analytic Signal approach: implementation block diagram. $H_i(fT_u)$: conjugate symmetric part of $\mathbf{H}_i(fT_u)$; $H_i'(fT_u)$: conjugate antisymmetric part of $\mathbf{H}_i(fT_u)$; $G_i(fT_d)$: conjugate symmetric part of $\mathbf{G}_i(fT_d)$; $G_i'(fT_d)$: conjugate antisymmetric part of $\mathbf{G}_i(fT_d)$; $V_c$: decimation factor.

In the following we briefly summarize the main characteristics and performance of the three approaches.

### a) The Analytic Signal approach

Within the class of per-channel methods, an effective solution from the point of view of implementation complexity is the Analytic Signal (AS) approach [4].
A specific feature of the AS method is that it greatly relaxes the filter specifications in terms of transition bandwidths, thus achieving a lower implementation complexity than the other per-channel approaches. Further, the AS approach leads directly to a highly modular structure matched to the per-channel implementation of the demodulator. Therefore, as shown later, integration of the DEMUX and DEMOD functions is conceivable. Another advantage of the AS approach is its high flexibility: unlike the other methods, if a specific application should benefit from unequal channel bandwidths, the AS structure can vary on demand the bandwidth assigned to each channel, simply by switching to a suitable new set of DEMUX parameters.

The principle of operation of the AS method is illustrated in [5] and is briefly recalled in the following. The structure of the DEMUX according to the AS method is shown in Fig. 1 [5]. The FDMA input signal, after appropriate analog down-conversion of the received signal to a low frequency range, is sampled according to the sampling theorem at the high-rate frequency $f_u = 1/T_u$, giving the sequence $s(nT_u)$, and processed in order to obtain $N_c$ TDM digital signals $x_i(mT_d)$, each sampled at the low-rate frequency $f_d = 1/T_d$, $N_c$ being the number of multiplexed channels. In Fig. 1, $H_i(fT_u)$ and $H'_{i}(fT_{u})$ represent the conjugate symmetric and antisymmetric parts, respectively, of the high-rate complex bandpass filter $\mathbf{H}_i(fT_u)$, which can be regarded as a frequency-translated version of a low-pass prototype $\mathbf{H}(fT_u)$ such that [5]:

$$\mathbf{H}_{i}(fT_{u}) = H_{i}(fT_{u}) + jH'_{i}(fT_{u}) = \mathbf{H}[2\pi (f - iW - W/2)T_{u}] \tag{1}$$

where W is the channel spacing.
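The frequency translation of eq. (1) can be illustrated with a minimal sketch. It assumes a windowed-sinc low-pass prototype, a real input sampled at $f_u = 2N_cW$, and that the two real branch filters of Fig. 1 correspond to the real and imaginary parts of the complex taps; the function name `shifted_bandpass` and all numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

def shifted_bandpass(h_proto, i, W, f_u):
    """Complex taps of channel i: prototype translated to centre frequency iW + W/2."""
    n = np.arange(len(h_proto))
    f_c = i * W + W / 2.0
    return h_proto * np.exp(2j * np.pi * f_c * n / f_u)

# Hypothetical example values (assumptions, not from the paper)
N_c, W = 8, 1.0
f_u = 2 * N_c * W                          # assumed high-rate sampling frequency
L_H = 33                                   # assumed number of prototype taps
k = np.arange(L_H) - (L_H - 1) / 2
h = np.sinc(W * k / f_u) * np.hamming(L_H) # windowed-sinc low-pass, cutoff about W/2
h /= h.sum()

h_3 = shifted_bandpass(h, 3, W, f_u)       # channel i = 3, centred on 3.5 W
h_re, h_im = h_3.real, h_3.imag            # two real filters acting on the real input

s = np.random.default_rng(0).standard_normal(256)
# Filtering the real input with two real filters gives the complex branch signal
y = np.convolve(s, h_re) + 1j * np.convolve(s, h_im)
```

Because the input and both sets of taps are real, the two convolutions involve real multiplications only, which matches the remark that the AS structure processes real quantities.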
In the same figure, $G_i(fT_d)$ and $G_i'(fT_d)$ represent the conjugate symmetric and antisymmetric parts, respectively, of the complex low-rate filter $\mathbf{G}_i(fT_d)$, which can be defined as [5]:

$$\mathbf{G}_{i}(fT_{d}) = G_{i}(fT_{d}) + jG'_{i}(fT_{d}) = \mathbf{G}\{[f - (-1)^{i}W/2]T_{d}\} \tag{2}$$

Thus, each filter $\mathbf{G}_i(fT_d)$ is related, according to eq. (2), to a low-pass prototype. It can be noted from eq. (2) that the number of different filters $\mathbf{G}_i(fT_d)$ is actually two: one for the odd channels and the other for the even channels. Taking into account eqs. (1) and (2), we have in the frequency domain [5]

$$X_i(fT_d) = S[(fT_d + i/2)/N_c] \tag{3}$$

according to the implementation structure shown in Fig. 1. Note that a decimation by a factor equal to the number $N_c$ of multiplexed channels must be used. As shown in Fig. 1, the terms $S(fT_u)$ and $X_i(fT_d)$ represent the spectrum of the input signal and the spectrum of the i-th output of the DEMUX, respectively.

The pulse-shaping filter, which is generally used to reduce the effects of noise at the receiver and to avoid intersymbol interference (ISI) at the detection instant, can be implemented by cascading the two digital filters $\mathbf{H}_i(fT_u)$ and $\mathbf{G}_i(fT_d)$. The high-rate filter $\mathbf{H}_i(fT_u)$ is essentially a band-pass filter, so the desired pulse-shaping function can be implemented by the low-rate filter $\mathbf{G}_i(fT_d)$. For example, it will be shown later that a cosine roll-off pulse-shaping filter with a 40% roll-off factor [9], equally shared between transmitter and receiver, can easily be integrated in the DEMUX, lowering the overall implementation complexity. From the implementation structure of the analytic signal method (Fig. 1) it can be noted that only processing of real quantities is required. The frequency dechannelization performed by the AS method is illustrated in Fig. 2.
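The remark that eq. (2) yields only two distinct low-rate filters can be checked with a short sketch. It assumes the DEMUX output rate $f_d = 2W$, so that the shift of $\pm W/2$ becomes a multiplication of the prototype taps by $(\pm j)^n$; the helper `low_rate_filter` and the placeholder prototype are illustrative assumptions.

```python
import numpy as np

def low_rate_filter(g_proto, i):
    """Taps of G_i: prototype shifted by (-1)^i * W/2 at the assumed rate f_d = 2W."""
    n = np.arange(len(g_proto))
    sign = 1 if i % 2 == 0 else -1          # (-1)^i
    # 2*pi * (sign * W/2) * n / (2W) = sign * pi * n / 2, so W cancels out
    return g_proto * np.exp(1j * sign * np.pi * n / 2)

g = np.hamming(15)                          # placeholder real prototype (illustration only)
g_even = low_rate_filter(g, 0)              # shared by all even channels
g_odd = low_rate_filter(g, 1)               # shared by all odd channels
```

Under these assumptions the odd-channel filter is simply the complex conjugate of the even-channel one, so a single set of real coefficients serves both.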
It must be pointed out that the analytic signal method has been outlined on the basis of ideal filtering masks; in real applications, however, the filters $G_i(fT_d)$ have nonzero transition bands and the filters $H_i(fT_u)$ have transition bands wider than the channel spacing W [5]. This situation gives rise to more relaxed filter specifications and thus reduces the overall system complexity (i.e. the overall number of operations). The overall number of multiplications required per input channel and per second is given by:

$$M_{AS} = (L_G + L_H/2)\,2W \tag{4a}$$

where $L_G$ and $L_H$ are the numbers of coefficients of the low-rate filters $\mathbf{G}_i$ and of the high-rate filters $\mathbf{H}_i$, respectively. Eq. (4a) takes into account the symmetry properties of the $\mathbf{H}_i$ filters of the 'mirror' channels in order to halve the corresponding number of operations. The overall number of multiplications can be estimated as a function of the channel spacing W, the number of channels $N_c$ and the filtering bandwidth B as [13]:

$$M_{AS} = K W^{2} \frac{[W(N_{c} + 4) - 2B(N_{c} + 2)]}{[(W - B)(W - 2B)]} \tag{4b}$$

where K is given by:

$$K = -2 \operatorname{Log} \left[ 5 \, \delta_1 \, \delta_2 \right] / 3 \tag{5}$$

The terms $\delta_1$ and $\delta_2$ denote the overall acceptable in-band and out-of-band ripples, respectively, derived from given system specifications; a filter design procedure is reported, for example, in [13]. It follows from eqs. (4a) and (4b) that, for specified values of B and $N_c$, an optimum value $W_0$ of the channel spacing can be found that achieves the lowest $M_{AS}$. However, since an integer number of samples per symbol is convenient for the subsequent demodulation operation, a suboptimum value of W closest to $W_0$ is generally used. To this end, a suitable choice of the DEMUX output sampling frequency 2W turned out to be 3 samples/symbol, i.e. W = 3R/4 with R the bit transmission rate.

**Fig. 2** - Illustration of the frequency demultiplexing performed by the Analytic Signal method.
- a) FDM input signal spectrum $S(fT_u)$;
- b) and c) Frequency response of the high-rate channel filter $H_i(fT_u)$;
- d) and e) Spectra of the filtered FDM signal;
- f) and g) Spectra of the complex signal obtained by decimation over $N_c$;
- h) and i) Frequency response of the low-pass prototype $G_i(fT_d)$;
- m) and n) Spectra of the complex demultiplexed signal;
- p) and q) Recovered baseband spectra $X_i(fT_d)$.

### b) The Fast Fourier Transform with Polyphase Network Approach (FFT)

The main principle of this approach is to share the same lowpass filter among all the FDMA channels. In particular, this method permits the implementation of the demultiplexing using a polyphase network and a Fast Fourier Transform (FFT) processor [6]. The block diagram of the demultiplexer, reported in Fig. 3, is implemented by a cascade of a set of digital filters and a FFT processor.

**Fig. 3** - Demultiplexer implemented by a digital polyphase network and a FFT processor.

The set of $N_c$ filters is obtained by shifting a basic lowpass complex filter function along the frequency axis. The frequency response of this low-pass prototype is shown in Fig. 4d). The transfer function H(z) of this filter establishes a relation between the z-transforms of the filter input and output sequences, assumed to have the same sampling rate $f_u$. By assuming a FIR filter with $KN_c$ (K integer) coefficients, we can write:

$$H(z) = \sum_{i=0}^{KN_c - 1} a_i z^{-i} = \sum_{n=0}^{N_c - 1} z^{-n} H_n(z^{N_c}) \tag{6}$$

with:

$$H_n(z^{N_c}) = \sum_{m=0}^{K-1} a_{mN_c+n} z^{-mN_c} \tag{7}$$

The filter H(z) can be implemented by a network with $N_c$ paths, as shown in Fig.
3, which is called a polyphase network because each path has a frequency response that approximates that of a pure phase shift [3], [6]. The phase shifts are constant in frequency and are integer multiples of $2\pi/N_c$. A change in sampling frequency by a factor of $N_c$ can be introduced, thus allowing the circuits in the different paths of the network to operate at the low frequency $f_d$.

The set of filters employed to implement the demultiplexing is formed by $N_c$ filters which cover the band 0 to $f_u/2$; the frequency response of these channel filters is sketched in Fig. 4. It is clear that this response can be obtained by shifting the basic complex lowpass filter function shown in Fig. 4d) along the frequency axis by integer multiples of $f_d/2$.

**Fig. 4** - Frequency dechannelization performed by the polyphase network with a FFT processor approach. a) Input FDM signals; b) and c) Frequency response of the channel filters; d) Basic filter response.

Starting from H(z), the z-transfer function of the basic filter, a translation in frequency by $(mf_u/2N_c)$, m integer, appears as a change in the variable from z to $z \exp[j2\pi m/N_c]$. Thus the filter with index m has a transfer function $B_m(z)$ given by:

$$B_m(z) = H[z \exp(j2\pi m/N_c)] \tag{8}$$

By applying the previously introduced decomposition of H(z) this becomes:

$$B_m(z) = \sum_{n=0}^{N_c - 1} z^{-n} \exp(-j 2\pi m n / N_c) H_n(z^{N_c}) \tag{9}$$

Since the functions $H_n(z^{N_c})$ are the same for all the filters $B_m(z)$, $m = 0, 1, \ldots, N_c - 1$, a factorization can be introduced which results in the following matrix equation:

$$\begin{bmatrix} B_{0}(z) \\ B_{1}(z) \\ \vdots \\ B_{N_{c}-1}(z) \end{bmatrix} = \begin{bmatrix} 1 & 1 & \dots & 1 \\ 1 & V & \dots & V^{N_{c}-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & V^{N_{c}-1} & \dots & V^{(N_{c}-1)^{2}} \end{bmatrix} \begin{bmatrix} H_{0}(z^{N_{c}}) \\ z^{-1} H_{1}(z^{N_{c}}) \\ \vdots \\ z^{-(N_{c}-1)} H_{N_{c}-1}(z^{N_{c}}) \end{bmatrix} \tag{10}$$

where $V = \exp(-j 2\pi/N_c)$. The square matrix is a DFT matrix. Thus the set of filters is realized by cascading the polyphase network and a discrete Fourier transform processor.

The overall number of multiplications required per second and per channel, $M_{\rm FFT}$, when the DEMUX is implemented according to the Fast Fourier Transform with polyphase network approach, is now evaluated. The polyphase network, as previously outlined, is composed of $N_c$ digital filters generated from a basic low-pass prototype. The number of coefficients of this prototype filter is given by:

$$L_{\rm FT} = \frac{2}{3} \operatorname{Log}\left[\frac{1}{10\,\delta_1 \delta_2}\right] \frac{2N_c W}{W - 2B} \tag{11}$$

where W is the channel spacing, B is the (one-sided) filtering bandwidth, $N_c$ is the number of input channels, $\delta_1$ is the acceptable in-band filtering ripple and $\delta_2$ is the required out-of-band filtering ripple.
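The factorization of eqs. (8)-(10) can be checked numerically with the following sketch, which compares a direct bank of frequency-shifted filters followed by decimation with the polyphase network followed by a DFT. It follows eq. (8) as written; the function names, the random prototype coefficients and the test signal are assumptions made only for illustration.

```python
import numpy as np

def direct_bank(s, a, N_c):
    """Filter s with each B_m(z) = H(z exp(j 2 pi m / N_c)), then decimate by N_c."""
    k = np.arange(len(a))
    out = []
    for m in range(N_c):
        b_m = a * np.exp(-2j * np.pi * m * k / N_c)   # taps of B_m(z)
        y = np.convolve(s, b_m)[:len(s)]
        out.append(y[::N_c])
    return np.array(out)

def polyphase_fft_bank(s, a, N_c):
    """Same outputs from N_c low-rate polyphase branches followed by a DFT, eq. (10)."""
    n_out = (len(s) + N_c - 1) // N_c
    u = np.zeros((N_c, n_out), dtype=complex)
    for p in range(N_c):
        delayed = np.concatenate((np.zeros(p), s))[:len(s)]  # z^{-p}
        x_p = delayed[::N_c]                                 # decimation by N_c
        h_p = a[p::N_c]                                      # coefficients a_{q N_c + p}
        u[p] = np.convolve(x_p, h_p)[:n_out]                 # branch filter H_p
    return np.fft.fft(u, axis=0)                             # DFT matrix with V = exp(-j 2 pi / N_c)

rng = np.random.default_rng(0)
N_c, K = 8, 4
a = rng.standard_normal(K * N_c)     # hypothetical prototype with K * N_c coefficients
s = rng.standard_normal(64)          # hypothetical input block
same = np.allclose(direct_bank(s, a, N_c), polyphase_fft_bank(s, a, N_c))
```

In the polyphase form every branch filter runs at the low rate and has only K coefficients, which is the source of the saving quantified in the complexity expressions of this section.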
Each filter of the polyphase network operates at the rate $f_d$ and has a number of coefficients given by:

$$L_f = L_{\rm FT}/N_c \tag{12}$$

The FFT processor, connected in cascade with the polyphase network, operates at the rate $f_d$ and requires a number of real multiplications per second given by:

$$M_{\rm FT} = 8WN_c \log_2 N_c \tag{13}$$

Hence, the overall number of real multiplications required per second and per channel can be derived as:

$$M_{\rm FFT} = [L_f + 4\log_2 N_c]\, 2W \tag{14}$$

From (11), (13) and (14), the optimum channel spacing $W_0$ can be determined, in order to minimize the required number of multiplications $M_{\rm FFT}$, from which a practical near-optimum value for W can be chosen, again leading to W = 3R/4.

### c) The Multistage approach

As with the polyphase approach, the multistage method is rather attractive whenever the number $N_c$ of demultiplexer channels is a power of two [8]. The signal is split into two bands by half-band filters and decimated by two. Each filter output is again split by two filters and decimated, leading to a division into 4 bands. After L stages of filtering and decimation, $2^L$ channels are obtained. The filters are assumed to be complex in order to allow a very large transition bandwidth, which leads to a small number of taps for each filter. Hence, a reduced amount of processing, even if complex, is required, taking also into account that about half of the coefficients of a half-band filter with an odd number of coefficients are zero. The spectrum of the signal is represented in Fig. 5, while a block diagram is shown in Fig. 6.

**Fig. 5** - Multistage spectra and filters.

**Fig. 6** - Multistage implementation block structure.

It is important to notice that the structure (quite similar to a binary tree) is very modular because the filters are replicated at each stage.
The structure also automatically guarantees a certain degree of flexibility, because channels with different bandwidths can be obtained from the intermediate stages. It is worth noting that the modularity brings some advantages in terms of redundancy, in the sense that the same filter could be used in every stage, provided that the processing rate is sufficiently fast. It should be observed that, following the last stage of the tree structure, a further filter (not shown in Fig. 6) is necessary to limit the bandwidth of each demultiplexer channel and to implement the pulse-shaping function. The overall implementation complexity in terms of multiplications required per channel and per second, $M_{\rm MS}$, is given by:

$$M_{\rm MS} = \{ [(N_F + 1)/2 + 1][\log_2 N_c - 1/2] + N_G \}\, 2W \tag{15}$$

where $N_F$ and $N_G$ denote the numbers of coefficients of the complex half-band filters and of the last filter of the tree (including the required pulse-shaping function), respectively. As for the other methods, the numbers of coefficients $N_F$ and $N_G$ can be estimated from the overall acceptable in-band and out-of-band ripples, the filter bandwidth and the channel spacing.

From an implementation point of view, some considerations and conclusions can be drawn about the three approaches, based upon both theoretical features and results from studies for ESA [13]:

- i) per-channel methods generally have higher computational complexity, lower sensitivity to finite-precision arithmetic, greater flexibility and lower control circuit complexity;
- ii) block methods have lower computational complexity, higher sensitivity to finite-precision arithmetic, less flexibility and greater control circuit complexity;
- iii) multistage methods have computational complexity comparable with that of block methods, finite-precision arithmetic sensitivity and control circuit complexity comparable to those of per-channel methods, and an intermediate degree of flexibility.
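The per-channel multiplication rates of eqs. (4a), (14) and (15), together with the filter length of eqs. (11) and (12), can be evaluated side by side with a small sketch. It assumes that "Log" denotes the base-10 logarithm; the function names and the numerical values used below are hypothetical and are not results from the paper.

```python
import math

def m_as(L_G, L_H, W):
    """Eq. (4a): Analytic Signal method, multiplications per channel per second."""
    return (L_G + L_H / 2) * 2 * W

def l_ft(delta1, delta2, N_c, W, B):
    """Eq. (11): prototype filter length, assuming Log is log10."""
    return (2 / 3) * math.log10(1 / (10 * delta1 * delta2)) * 2 * N_c * W / (W - 2 * B)

def m_fft(L_f, N_c, W):
    """Eq. (14): polyphase network with FFT, with L_f = L_FT / N_c from eq. (12)."""
    return (L_f + 4 * math.log2(N_c)) * 2 * W

def m_ms(N_F, N_G, N_c, W):
    """Eq. (15): multistage method."""
    return (((N_F + 1) / 2 + 1) * (math.log2(N_c) - 0.5) + N_G) * 2 * W

# Hypothetical parameters, chosen only to exercise the formulas
N_c, W, B = 16, 1.0, 0.25
L_f = l_ft(0.01, 0.001, N_c, W, B) / N_c
rates = {
    "AS": m_as(L_G=10, L_H=20, W=W),
    "FFT": m_fft(L_f, N_c, W),
    "MS": m_ms(N_F=7, N_G=10, N_c=N_c, W=W),
}
```

Since the filter lengths entered here are arbitrary, the resulting numbers only show how the expressions scale and should not be read as a ranking of the three methods.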
On the basis of the results obtained, the finite-precision arithmetic implementation of the block methods generally requires 2 to 3 bits more than the other methods, owing to the greater sensitivity of the FFT block processor to finite-precision arithmetic. The control circuit complexity for the block method may be comparable with (or in some limit cases even greater than) the complexity of the computational part, although a custom VLSI implementation of the control part might remove this drawback.

Flexibility may be a critical issue for the demultiplexer. In some advanced applications different transmission rates are foreseen, and the Multicarrier Demodulator will likely have to operate on many of them during the satellite lifetime, to allow for reconfigurability of the traffic pattern. The transmission rates are in general multiples of the smallest one by factors such as 2, 3, 5 and combinations of them. If we require the demultiplexer to operate at different rates for a fixed processed bandwidth, the number of channels $N_c$ is inversely proportional to the transmission rate. Thus we can observe that:

- i) block methods are able to operate only at a fixed value of $N_c$ (i.e. the number of points of the block processor); therefore a specialised demultiplexer is required for each individual transmission rate.
- ii) per-channel structures are well suited to variations of the transmission rate and of the number of processed channels $N_c$: it is only required to vary the filter characteristics (i.e. coefficients) and the decimation factor, using only the necessary $N_c$ branches of a structure designed for the highest possible value of $N_c$ (lowest data rate). Moreover, the per-channel methods make it possible to process channels with different transmission rates within the same demultiplexer, as the $N_c$ paths are substantially independent.
- iii) multistage structures have an intermediate degree of flexibility, as they allow variations of the transmission data rate, although limited to powers of two.
#### 3. DIGITAL DEMODULATOR

This section considers the digital implementation of a demodulator suitable for QPSK signals. In particular, the carrier and clock recovery approaches necessary to perform a coherent demodulation of the demultiplexed QPSK signals are described. The benefits introduced by the possibility of network clock synchronization, in terms of a reduction of the overall MCD implementation complexity, will be investigated in sect. 4.

### a) Nonlinear estimation method of QPSK-modulated carrier phase

The block diagram of the phase estimator considered here is shown in Fig. 7. Its principle of operation is briefly described in the following. Let the estimation period be $T_E$ and let it encompass $N_E$ m-ary symbols (each T seconds long), where $T_E = N_E T$. Suppose we wish to estimate the phase at the midpoint of the estimation interval and we let $N_E = 2N + 1$, where N is the number of signal intervals before and after the interval where the phase is to be estimated [10]. It is evident that a delay equal to NT is necessary to make the estimation system physically realizable. In this context, and in the presence of additive white Gaussian noise (AWGN) and zero frequency uncertainty, Fig. 7 with the dotted box eliminated (so that $x'_n = x_n$, $y'_n = y_n$) represents the optimal (maximum likelihood) estimator for m = 1, which corresponds to an unmodulated carrier. Obviously, if the carrier is phase modulated to one of m discrete phases (m equals 4 for QPSK signals), the above linear estimator is useless since during each successive symbol the phase takes on a different value. Suppose, however, that in the dotted box we insert the two-dimensional (complex) nonlinear function:

$$x'_{n} + j y'_{n} = F(p_{n})e^{jm\Phi_{n}} \tag{16}$$

where $F(p_n) = (x_n^2 + y_n^2)$ and $\Phi_n = \tan^{-1}(y_n/x_n)$. In words, for each symbol we perform a rectangular-to-polar transformation, multiply the phase $\Phi_n$ by m and perform an arbitrary nonlinear transformation on $p_n$.
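A minimal sketch of the estimator of Fig. 7 with the nonlinearity of eq. (16) is given here under stated assumptions: the QPSK symbol phases are taken as multiples of $\pi/2$, the channel is noise-free, the phase is constant over the block, and a four-quadrant arctangent replaces $\tan^{-1}(y_n/x_n)$. The conversion of $F(p_n)e^{jm\Phi_n}$ back to rectangular form is included in the complex exponential. The function name and the test values are hypothetical.

```python
import numpy as np

def nonlinear_phase_estimate(x, y, m=4):
    """Carrier phase estimate over one block of N_E = 2N + 1 symbols."""
    f_p = x**2 + y**2                  # F(p_n) = x_n^2 + y_n^2
    phi = np.arctan2(y, x)             # four-quadrant version of tan^-1(y_n / x_n)
    z = f_p * np.exp(1j * m * phi)     # x'_n + j y'_n, eq. (16)
    acc = z.sum()                      # accumulation over the estimation interval
    return np.arctan2(acc.imag, acc.real) / m

# Hypothetical noise-free test block (assumed values)
rng = np.random.default_rng(1)
N = 16
theta = 0.2                                        # carrier phase to be recovered, in radians
symbols = rng.integers(0, 4, size=2 * N + 1)       # random QPSK data
r = np.exp(1j * (symbols * np.pi / 2 + theta))     # received samples, one per symbol
theta_hat = nonlinear_phase_estimate(r.real, r.imag)
```

Multiplying the phase by m removes the data modulation, so the estimate is only defined modulo $2\pi/m$; resolving that ambiguity is outside the scope of this sketch.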
Finally we perform a polar-to-rectangular transformation on the result. We avoid describing the nonlinearity in this manner in Fig. 7, because in a practical implementation it becomes a read-only memory, transforming a quantized 2-dimensional vector into another such vector.

**Fig. 7** - General structure of the carrier phase nonlinear estimation system. $x_n$: in-phase component of the received QPSK signal; $y_n$: quadrature component of the received QPSK signal.

Multiplying the phase by m, along with the final operation of dividing the $\tan^{-1}$ function by m, gives