

(An ISO 3297: 2007 Certified Organization) Website: <u>www.ijircce.com</u> Vol. 5, Special Issue 3, April 2017

# A VLSI-Efficient Signed Magnitude Comparator for { 2<sup>n</sup>-1, 2<sup>n</sup>, 2<sup>n+1</sup>-1} RNS using Kogge-Stone Adder

N.Mohana Priya<sup>1</sup>, M.Uma Maheswari<sup>2</sup>, G. Akila<sup>3</sup>

ME [Applied Electronics], Meenakshi College of Engineering, Chennai, India<sup>1</sup>

Assistant Professor, Meenakshi College of Engineering, Chennai, India<sup>2,3</sup>

**ABSTRACT:** Comparison of residue denotes in signed Residue Number System (RNS) involves sign detection and magnitude comparison. Both are difficult operations in RNS. This paper proposes a new signed magnitude comparator for the three- moduli set RNS  $\{2^{n}-1, 2^{n}, 2^{n+1}-1\}$ . Two subrange identifiers are computed to simplify sign detection and accelerate the magnitude comparison without requiring full reverse conversion, large modulo adders or lookup tables. Kogge stone adder(KSA) is a parallel prefix form carry look ahead adder. It generates carry in O (logn) time is widely considered as the fastest adder and is widely used in the industry for high performance arithmetic circuits. In KSA, carries are computed fast by computing them in parallel at the cost of increased area. In our proposed results to decrease area, delay and power consumption.

**KEY WORDS:** Moduli set, RNS, KSA, Carry look ahead adder.

### **I.INTRODUCTION**

Residue Number System (RNS) has found to be beneficial for implementing fixed-point application-Specific digital signal processing functions such as digital filter, cryptography, discrete mathematical transforms, etc. owing to its inherent parallelism, modularity and localized carry propagation in addition, subtraction and multiplication.

The straightforward method to perform magnitude comparison in RNS is to convert the operands back into weighted binary representation by Chinese remainder theorem(CRT) for evaluation. This direct method is highly inefficient as the residue-to-binary conversion requires large modulo arithmetic and reduction operations. Over the years, several techniques have been proposed to compare the magnitudes of two residue numbers. The main techniques used are core function, parity check and partial reverse conversion. The core function method requires an iterative descent andlift process to compute the critical core values.

Comparator sketch featuring wide-range applications and high-speed operation using only conventional digital CMOS cells. Our comparator exploits a novel scalable parallel prefix structure that leverages the comparison outcome of the most significant bit, further bitwise toward the least significant bit only when the compared bits are equal. This method reduces dynamic power dissipation by eliminating unwanted transitions in a parallel prefix structure that generates the N-bit comparison result after  $(log_4 N)+(log_{16}N)+4$  CMOS gate delays.



(An ISO 3297: 2007 Certified Organization)

Website: <u>www.ijircce.com</u>





Fig 1 Algorithm for 8-bit comparison

Fig.1 shows the parallel prefix structure of 8 bit comparison. The parallel prefix structure is designed for 16,32 and 64 bit architectures and the reports from the Xilinx tool concludes that for every bit range twice the delay, memory, LUT and power has not twice up to the mark. But, In the proposed design of my project, each and every part in the parallel prefix structure will be recovered by universal logic (multiplexer) and the obtained results will be compared with existed design for the same device specifications. By performing this changes in the architecture will leads to reduction in power consumption and in delay parameters.

In this paper, a new signed magnitude comparison algorithm for the three-moduli set  $\{2n-1, 2n, 2n+1-1\}$  is proposed. This moduli set has efficient modulo operations, sign detector and residue-to-binary converter by using kogge stone adder. The proposed algorithm avoids full magnitude evaluation and independent sign detector and has completely eliminated large modulo adders, which leads to a significant reduction in area, delay and power consumption. They get accurate equally comparable dynamic range and the same number of bits for each modulus.

### **II. EXISTING SYSTEM**

A.ZERO CROSSING DETECTOR Zero Crossing Detector in detail with two variable circuits. In the starting paragraphs of the tutorial, you will learn zero crossing detector. Its working principle and theory behind the scene in easy to knowing their words. Towards the center of this tutorial, you will learn about 2 usages of zero crossing detector which are time marker generator and phase meter. Towards the last of article, we have drawn one more circuit diagram of zero crossing detector. A zero crossing detector is a comparator with the reference level set at zero. It is used for removing the zero crossings of AC signals and it can be made from an <u>operational amplifier</u> with an input voltage at its positive input. The input voltage is positive, the output voltage is also a positive value, when the input voltage is negative, the output voltage is also a negative value. The magnitude of the output voltage is a property of the operational amplifier and its power supply.

### B. A 32-BIT PARALLEL PREFIX TREE USING ZERO CROSSING DETECTOR AS DECISION MODULE

A 32-bit-comparator using zero detector as decision module was considered. In the comparators architectural overview, in this 1<sup>st</sup> case the A and B are 32-bit inputs 31 down to 0 [N-1 down to 0]. For this comparison resolution module A and B are the inputs which are compared from MSB to LSB. Here A31 = 0 and B31=0 i.e., A and B are equal, therefore left bus = right bus=0. Further towards LSB i.e. comparing (MSB-1) bits A30=1 and B30 =1 since A and B are equal left bus =right bus=0. Coming proceeding towards right i.e. comparing (MSB-2) bits A29=0 and B29=1 since A and B are unequal for the first time then left bus=0 and right bus=1. Further comparisons were deleted because A and B are unequal. The remaining bits in left and right buses were filled with zeros. And further to perform the OR operation for buses and finds comparator result.

### C.ZERO CROSSING DETECTOR USING MAGNITUDE COMPARATOR

This comparator make use of novel scalable parallel prefix constructs strategic advantage by comparing Most Significant Bit (MSB) outcomes which is scheduled bit wise towards the Least Significant Bit (LSB).Detector as



(An ISO 3297: 2007 Certified Organization)

Website: <u>www.ijircce.com</u>

#### Vol. 5, Special Issue 3, April 2017

decision module. This comparator reduces the power and area needed. The comparison proceeds from MSB to LSB by comparing A and B, if they are equal 0 is copied on to left and right buses without consideration bits are compared. The process continues till comparison resolution module finds the first un-equality. In case of first not sufficient present bit are compared from A is placed on to the left bus and bit from B is placed in the right bus. The rest of the comparisons were omitted and o s' were addition for both left and right bus contents. In decision module the contents of left bus and right bus are logically or-individually to choose whether A>B, AB otherwise A<B.Proposed N-Bit-Comparator with Zero Detector as decision moduleAs shown in the figure



Fig 2 Proposed N-Bit-Comparator with Zero Detector as decision module

#### D.ZERO CROSSING DETECTOR AND COMPARATOR

Comparators are basic operational amplifier circuits that compare two voltages simultaneously and switches the output according to the comparison. We can say zero crossing detection circuit is a comparator example an integrated adder/subtracter/zero-detector in which the zero detection completes well before the sum or variation is known. Previous zero detectors either required the addition to be there before they could complete, or were not well integrated with the ALU. Sums in half-adder form can be calculated very fastly (with the delay of a half adder), till now they have enough structure so that many of the properties of the final sum can be easily detected. Our zero detector is faster than any previously explained, needs only a small amount of extra circuitry in the ALU, and adds little or nothing to the overall delay of the ALU.

#### **III.PROPOSED SYSTEM**

In our proposed the RNS magnitude comparator can be used as a dsp applications by replacement of binary adder as a kogge stone adder. First and the second conditionscan be detected by the group generate signal gn:0 and the grouppropagate signal pn:0, respectively of an (n+1)-bit binaryparallel-prefix adder of A and B, where pn:0 is the bitwise ANDof the bit propagate signals  $iii \ p = a \oplus b$  for i=0 to n. Thus, theinput signal *cin*can be generated by a logical OR of gn:0 and pn:0, and merged into the (n+1)-bit parallel-prefix adder of A + B for QX computation, as shown in the architecture for n = 4. PX is generated by an n-bit carry save adder (CSA) with  $x_1, x_2$  and the least significant n bits of  $B, 2n \ n \ 1 \ 0 \ B \ bb- = L$ , as inputs of the outputs of the CSA are summed with *c* in as the input carry to an n-bit binary adder, without generating the carryoutput. Fig shows the architecture for the generation of PX and QX.







(An ISO 3297: 2007 Certified Organization) Website: <u>www.ijircce.com</u> Vol. 5, Special Issue 3, April 2017



Fig. 4 Architecture of the merged *cin*generator and parallel prefix kogge stone adderfor QX for n = 4.

Fig. 4 shows the architecture of the merged *cin*generator and parallel prefix kogge stone adder for QX for n = 4 signed magnitudecomparator after the computation of PX, PY, QX and QY. Theparity bit *s* is generated by XNORing the MSB of PX and theMSB of PY. This parity bit is used as a select input to a 2-to-1multiplexer. When sign  $(X\%) \neq$  sign (Y%), s = 0 and the output'X% > Y%' of the multiplexer is given by the MSB of PY.Otherwise the states of this and other flags 'X% < Y%'' and 'X% = Y%'' are determined by the outputs of the binarycomparators.

Two *n*-bit binary comparators are used to compare *PX* with *PY* and *x*1 with *y*1, and one (*n*+1)-bit binary comparatoris used to compare *QX* with *QY*. When sign (*X*%) = sign (*Y*%), *s* =1 and the output of the multiplexer '*X*%>*Y*%' is generated by acombinational logic circuit consisting of two AND and one ORgates. Meantime, the output flag '*X*%= *Y*%' is generated byANDing the outputs, '*PX* = *PY*', '*QX* = *QY*' and '*x*1 = *y*1', of the comparators. Since one and only one of the three output flagscan be asserted, '*X*%<*Y*%' can be generated by a two-input NORgate with '*X*%>*Y*%' and '*X*% = *Y*%' as its inputs. Fig. 5 shows the Binary comparators and combinational logic circuits of proposed signed RNS magnitude comparator.



Fig. 5Binary comparators and combinational logic circuits of proposed signedRNS magnitude comparator.

#### **KOGGE-STONE ADDER**

The Kogge–Stone adder is a parallel prefix form carry look-ahead adder. Other parallel prefix adders include the Brent-Kung adder, the Han Carlson adder, and the fastest known variation, the Lynch-Swartzlander Spanning Tree adder. ... However, wiring congestion is often a problem for Kogge–Stone adders. The complete functioning of KSA can be easily comprehended by analyzing it in terms of three distinct parts:



(An ISO 3297: 2007 Certified Organization) Website: <u>www.iiircce.com</u>

Vol. 5, Special Issue 3, April 2017

#### A.PRE PROCESSING

This step involves computation of generate and propagate signals corresponding too each pair of bits in A and B. These signals are given by the logic equations below:pi = Ai xor Bi gi = Ai and Bi

### **B.CARRY LOOK AHEAD NETWORK**

This block differentiates KSA from other adders and is the main force behind its high performance. This step involves computation of carries corresponding to each bitIt uses group propagate and generate as intermediate signals which are given by the logic equations below:

Pi:j = Pi:k+1 and Pk:jGi:j = Gi:k+1 or (Pi:k+1 and Gk:j)

#### **C.POST PROCESSING**

This is the final step and is common to all adders of this family (carry look ahead). It involves computation of sum bits. Sum bits are computed by the logic given below: Si = pi xor Ci-1

#### IV. EXPERIMENTAL RESULTS AND COMPARISON

The overall hardware cost and critical path delay of our proposed sign RNS magnitude comparator are A = 2AP + 2AQ + ABC + Agates and t = tP + tBC + tmux + tgates, respectively, where the subscripts *P*, *Q*, *BC*, *mux* and *gates* denote *PX* (or *PY*), *QX* (or *QY*), binary comparator, 2:1 multiplexer and logic gates, respectively. Specifically, *tBC* refers to the delay of the (*n*+1)-bit binary comparator and *t*comb = tAND3 + tOR3 + tAND4 + tNOR2.

The proposed comparator is compared against the unsigned RNS magnitude comparator [9] for the three-moduli set  $\{2n-1, 2n, 2n+1\}$  which has the same bit length for each residue and accurate smaller dynamic range. Sign detection module and glue logic are added to convert it into a signed RNS magnitude comparator.

The three designs are described in Verilog and synthesized and optimized for minimum delay by Synopsys Design Compiler version E-2010.12-SP4 using STM CMOS 65nm standard cell library. The areas in  $\mu m^2$  and critical path delays in *ns* are compared in Table I. On average, our proposed comparator saves the areas of the unsigned and signed RNS comparators of by 20.9 % and 40.8%, and reduces their critical path delays by 25.7% and 27.0%, respectively.

| Ν  | Unsigned[9] |             | Signed[9] |             | This  |
|----|-------------|-------------|-----------|-------------|-------|
|    | Delay       | Area        | Delay     | Area        | Delay |
|    | (ns)        | $(\mu m^2)$ | (ns)      | $(\mu m^2)$ | (ns)  |
| 4  | 0.89        | 1550.1      | 0.90      | 2078.4      | 0.72  |
| 8  | 1.13        | 3548.4      | 1.14      | 4804.8      | 0.88  |
| 12 | 1.26        | 5704.4      | 1.31      | 7662.7      | 0.90  |
| 16 | 1.36        | 8014.2      | 1.38      | 10656.8     | 0.96  |
| 20 | 1.44        | 9019.9      | 1.47      | 11796.9     | 1.02  |
| 32 | 1.56        | 10111.1     | 1.55      | 12875.1     | 1.11  |
| 64 | 1.66        | 11221.1     | 1.68      | 13345.2     | 1.24  |
|    |             |             |           |             |       |
|    |             |             |           |             |       |

Table I: Comparison of areas ( $\mu m2$ ) and critical path delays (ns).

The power consumptions of all designs are simulated by Synopsys Prime Time PX. Monte Carlo power simulation is performed by applying randomly generated vectors at a rate determined by the slowest designs for each n in Table I until the error of average power of each design is bounded below 2% with 99% confidence interval. The leakage power



(An ISO 3297: 2007 Certified Organization)

Website: <u>www.ijircce.com</u>

#### Vol. 5, Special Issue 3, April 2017

in  $\mu$ Wand total power in *m*Wof each RNS comparator are compared in Table II. On average, the proposed comparator consumes 20.70% and 41.23% lesser power than and its extension for signed RNS, respectively.

| Ν  | Unsigned[9] |       | Signed[9] |       | This    |
|----|-------------|-------|-----------|-------|---------|
|    | Leakage     | total | Leakage   | Total | Leakage |
| 4  | 82.66       | 1.12  | 113.36    | 1.57  | 60.09   |
| 8  | 189.83      | 2.86  | 257.48    | 3.88  | 147.77  |
| 12 | 291.19      | 4.52  | 388.32    | 6.11  | 248.43  |
| 16 | 415.71      | 6.34  | 545.95    | 8.31  | 294.31  |
| 20 | 443.03      | 7.00  | 557.18    | 9.24  | 339.41  |
| 32 | 473.16      | 8.00  | 577.18    | 10.0  | 420.5   |
| 64 | 511.11      | 9.11  | 600.22    | 12.1  | 510.1   |
|    |             |       |           |       |         |
|    |             |       |           |       |         |

Table II: Comparison of leakage power  $(\mu W)$  and total power (mW).

#### **V. CONCLUSION**

A novel low complexity, high-speed and low-power adderbased sign integer magnitude comparator is proposed for  $\{2n-1, 2n, 2n+1-1\}$  RNS. It consists of two *n*-bit CSAs, two *n*-bit and two (n+1)-bit kogge stone adders, two *n*-bit and one (n+1)-bit binary comparators, a 2-1 multiplexer and a few basic gates.

Our synthesis results show that the proposed signed magnitude comparator is even notably faster, smaller and less power hungry than the latest unsigned RNS magnitude comparator designed for an equally balanced but with accurate lower dynamic range three-moduli set.

#### REFERENCES

[1] N. S. Szabo and R. I. Tanka, *Residue Arithmetic and Its Applications to Computer Technology*. New York:McGraw-Hill, 1967.

[2] J. Chen and J. Hu, "Energy-efficient digital signal processing via voltage overscaling-based residue number system," IEEE Trans. VLSI Syst., vol.

21, no. 7, pp. 1322-1332, Jul. 2013.

[3] D. Schinianakis and T. Stouraitis, "Multifunction residue architectures for cryptography," *IEEE Trans. Cir. and Syst.-I*, vol. 61, no. 4, pp. 1156-1169, Apr. 2014.

[4] D. D. Miller, R. E. Altschul, J. R. King and J. N. Polky, "Analysis of the residue class core function of Akushskii, Burcev and Pak," in *Residue Number System Arithmetic, Modren Applications in Digital Signal Processing*, M. A. Soderstrand, W. C. Jenkins, G. A. Jullien, and F. J.Taylor, Eds. IEEE Press: New York, 1985, paper 7-2, pp. 390-401.

[5] A. Bellaour and M. I. Elmasry, "Low-Power Digital VLSI Design Circuits and Systems," Norwood, MA: Kluwer, 1995.

[6] B. Parhami, "Efficient hamming weight comparators for binary vectors based on accumulative and up/down parallel counters," *IEEE Trans.Circuits Syst.*, vol.56, no. 2, pp. 167–171, Feb. 2009.

[7] D. V. Ponomarev, G. Kucuk, O. Ergin, and K. Ghose, "Energy efficient comparators for superscalar datapaths," IEEE Trans. Comput., vol. 53, no. 7, pp.892–904, Jul 2004

[8] D. norris, "Comparator circuit," U.S. Patent 5 534 844, Apr. 3, 1995.

[9] F.Frustaci, S.Perri, M.Lanuzza and P.Carsonello, "Energy-efficient single clock cycle binary comparator," Int .J.Circuit Theory Appl., vol 40, no.3, pp.237-246, March.2012.

[10] H. J. R. Liu and H. Yao, "High-Performance VLSI Signal Processing Innovative Architectures and Algorithms", vol. 2. Piscataway, NJ: IEEE Press, 1998 [11] M. Lu and J. Chiang, "A novel division algorithm for the residue number system," *IEEE Trans. Comput.*, vol. 41, no. 8, pp. 1026-1032, Aug. 1992.

[12] B. Zarei, M. Askarzadeh, N. Derakhshanfard, and M. Hosseinzadeh, "A high-speed residue number comparator for the 3-moduli set  $\{2n-1, 2n+1, 2n+3\}$ ," in *Proc. Int. Symp. Signals Syst. and Electro.*, Nanjing, China, Sept.2010, pp. 1-4.

[13] G. Dimauro, S. Impedovo, and G. Pirlo, "A new technique for fast number comparison in residue number system," *IEEE Trans. Compt.*, vol. 42, No.5, pp. 608-612, May 1993.

[14] K. W. Glass, "Digital comparator circuit", U.S. Patent 5 260 680, Feb.13,1992

[15] S. Abdel-Hafeez, "Single rail domino logic for four-phase clocking scheme," U.S. Patent 6 265 899, Oct. 20, 2001.

[16] V. G. Oklobdzija, "An algorithmic and novel design of a leading zero detector circuit: Comparison with logic synthesis," IEEE Trans. Very Large Scale Integr (VLSI) Syst., vol. 2, no. 1, pp. 124–128, Mar. 1994.