

# International Journal of Innovative Research in Computer and Communication Engineering

(An ISO 3297: 2007 Certified Organization)

Website: <u>www.ijircce.com</u>
Vol. 5, Issue 7, July 2017

### Review Paper on FIR Filter based on Vedic Multiplier and Kogge Stone Adder

Seema Patidar and Prof. Alok Kumar

M. Tech. Scholar, Dept. of Electronics and Communication, Sagar Institute of Research Technology & Science (SIRTS), Bhopal, India

Assistant Professor, Dept. of Electronics and Communication, Sagar Institute of Research Technology & Science (SIRTS), Bhopal, India

**ABSTRACT**: Multiplication is an important function in arithmetic operations. A CPU (central processing unit) devotes a considerable amount of processing time to performing arithmetic operations. Multiplication requires substantially more hardware resources and processing time than addition and subtraction. Digital signal processors (DSPs) are omnipresent across engineering disciplines. Fast multiplication is very important in DSPs for digital filtering, convolution, Fourier transforms, etc. This research work attempts to design a novel FIR filter using a Vedic multiplier and a modified Kogge Stone adder. An FIR filter based on the Vedic multiplier is expected to offer not only a fast response but also a lower component count, area and path delay.

**KEYWORDS**: Vedic Multiplier, FIR Filter, Modified Kogge Stone Adder

#### I. INTRODUCTION

Several attempts have been made to develop dedicated and reconfigurable architectures for the realization of FIR filters on application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) platforms. Systolic designs represent an attractive architectural paradigm for efficient hardware implementation of computation-intensive DSP applications, being supported by features such as simplicity, regularity and modularity of structure [1-2]. In addition, they possess significant potential to yield a high throughput rate by exploiting a high level of concurrency using pipelining, parallel processing or both [3]. To utilize the advantages of systolic processing, several algorithms and architectures have been suggested for the systolization of FIR filters [4-5]. However, the multipliers in these structures require a large portion of the chip area, and consequently limit the maximum number of processing elements (PEs) that can be accommodated and the highest order of the filter that can be realized. The multiplier-less distributed arithmetic (DA)-based technique has gained substantial popularity in recent years for its high-throughput processing capability and increased regularity, which result in cost-effective and area-time efficient computing structures [6]. The main operations required for DA-based computation of an inner product are a sequence of lookup table (LUT) accesses followed by shift-accumulation operations on the LUT output. DA-based computation is well suited to FPGA realization, because the LUT as well as the shift-add operations can be efficiently mapped to the LUT-based FPGA logic structures [7].

In FIR filtering, one of the convolving sequences is derived from the input samples while the other sequence is derived from the fixed impulse response coefficients of the filter. This behavior of the FIR filter makes it possible to use the DA-based technique for memory-based realization. It yields faster output compared with multiplier-accumulator-based designs because it stores the precomputed partial results in memory elements [8], which can be read out and accumulated to obtain the desired result. The memory requirement of a DA-based implementation of FIR filters, however, increases exponentially with the filter order. DA was first introduced in [9] and further developed in [10] for efficient implementation of digital filters. Attempts have been made to use offset-binary coding [11] to reduce the ROM size by a factor of 2. An LUT-less adder-based DA approach has been suggested by Yoo and Anderson, where memory space is reduced at the cost of additional adders [12]. Memory partitioning and the multiple memory-bank approach, along with flexible multibit data-access mechanisms, have been suggested for FIR filtering and inner-product computation.
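The LUT-access and shift-accumulate steps of DA described above can be sketched in Python as follows. This is a minimal illustration, not a design taken from the cited works: it assumes unsigned input samples (a signed two's-complement version would treat the sign bit separately), and the names `build_da_lut`, `da_inner_product`, `coeffs` and `samples` are hypothetical.

```python
def build_da_lut(coeffs):
    """Precompute every possible sum of coefficients, one entry per bit pattern."""
    n = len(coeffs)
    lut = []
    for pattern in range(1 << n):
        lut.append(sum(c for k, c in enumerate(coeffs) if (pattern >> k) & 1))
    return lut


def da_inner_product(samples, lut, width):
    """Inner product of unsigned `width`-bit samples with the LUT's coefficients."""
    acc = 0
    for bit in range(width):
        # Gather bit number `bit` of every sample into one LUT address.
        address = 0
        for k, x in enumerate(samples):
            address |= ((x >> bit) & 1) << k
        # Shift-accumulate: weight the LUT output by 2**bit.
        acc += lut[address] << bit
    return acc


coeffs = [3, -1, 4, 2]      # hypothetical fixed filter taps
samples = [5, 9, 0, 14]     # hypothetical unsigned 4-bit input samples
lut = build_da_lut(coeffs)
result = da_inner_product(samples, lut, width=4)
direct = sum(c * x for c, x in zip(coeffs, samples))
```

The table holds 2^N entries for N taps, which is the exponential growth in memory with filter order noted above. No multiplier is used in `da_inner_product`; only table reads, shifts and additions.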

Multiplication is an important fundamental function in arithmetic operations. Multiplication-based operations such as multiply-and-accumulate (MAC) and inner product are among the frequently used computation-intensive arithmetic functions currently implemented in many digital signal processing (DSP) applications such as convolution, the fast Fourier transform (FFT) and filtering, and in the arithmetic and logic unit (ALU) of microprocessors. Since multiplication dominates the execution time of most DSP algorithms, there is a need for high-speed multipliers. Currently, multiplication time is still the dominant factor in determining the instruction cycle time of a DSP chip.

In this work, a high-speed Vedic multiplier using a barrel shifter is implemented. The multiplier follows a modified design of the "Nikhilam Sutra", chosen for its ability to reduce the number of partial products. The barrel shifter is used at different levels of the design to reduce the delay compared with conventional multipliers. Implementing the Vedic multiplier in hardware with a barrel shifter contributes to an adequate improvement in speed.

In many DSP algorithms, the multiplier lies in the critical delay path and ultimately determines the performance of the algorithm. The speed of the multiplication operation is of great importance in DSPs as well as in general-purpose processors. In the past, multiplication was implemented with a sequence of addition, subtraction and shift operations. Many algorithms have been proposed to perform multiplication, each offering different advantages and trade-offs in terms of speed, circuit complexity, area and power consumption.

The multiplier is a fairly large block of a computing system. For multiplication algorithms performed in DSP applications, latency and throughput are the two major concerns from a delay perspective. Latency is the real delay of computing a function: a measure of how long after the inputs to a device are stable the final result is available on the outputs. Throughput is the measure of how many multiplications can be performed in a given period of time. The multiplier is not only a high-delay block but also a major source of power dissipation. Therefore, if one also aims to minimize power consumption, it is of great interest to reduce the delay by using various delay optimizations.

Digital multipliers are the core components of all digital signal processors (DSPs), and the speed of a DSP is largely determined by the speed of its multipliers. The two most common multiplication algorithms followed in digital hardware are the array multiplication algorithm and the Booth multiplication algorithm. The computation time taken by the array multiplier is comparatively small because the partial products are calculated independently in parallel. The delay associated with the array multiplier is the time taken by the signals to propagate through the gates that form the multiplication array. Booth multiplication is another important multiplication algorithm. Large Booth arrays are required for high-speed multiplication and exponential operations, which in turn require large partial-sum and partial-carry registers. Multiplication of two n-bit operands using a radix-4 Booth recoding multiplier requires approximately n/(2m) clock cycles to generate the least significant half of the final product, where m is the number of Booth recoder adder stages. Hence, a large propagation delay is associated with this case.

#### II. LITERATURE REVIEW

Deepak Kumar Patel et al. [1]: Speed and area are nowadays among the fundamental design issues in the digital era. Increasing the speed of multiplication and addition operations has always been a basic requirement in the design of advanced systems and applications. The Carry Select Adder (CSA) is one of the fastest adders and is used in many processors to perform fast arithmetic functions. Many different adder architectures have been developed to increase the efficiency of the adder. Processors perform millions of operations per second, so operating speed is one of the main criteria to keep in mind when designing multipliers. The paper proposes a technique for designing an FIR filter using a multiplier based on compressors and a carry select adder. All adder designs are implemented for 16-, 32- and 64-bit circuits, and the structures are synthesized on a Xilinx device family.

B. Madhu Latha et al. [2]: An 8-bit Vedic multiplier is enhanced in terms of propagation delay when compared with conventional multipliers. The proposed design uses an 8-bit barrel shifter, which requires only one clock cycle for "n" number of shifts. The design is implemented and verified using FPGA and ISE Simulator. The core was implemented on a Xilinx Spartan-6 family xc6slx75T-3-fgg676 FPGA. The propagation delay comparison was extracted from the synthesis report and the static timing report. The design could achieve a propagation delay of 6.781ns using the barrel shifter in the base selection module and multiplier.




A Murali et al. [3]: The performance of the Vedic multiplier is enhanced in terms of propagation delay when compared with other conventional multipliers such as the array multiplier, Braun multiplier, modified Booth multiplier and Wallace tree multiplier. Various Vedic multiplication techniques are used for arithmetic multiplication. It has been found that the Urdhva Tiryakbhyam Sutra is the most efficient Sutra, giving the minimum delay for multiplication of all types of numbers, whether small or large. The design uses an 8-bit barrel shifter, which requires only one clock cycle for "n" number of shifts. The design is implemented and verified using FPGA and Mentor Graphics simulators. The core was implemented on the Xilinx Spartan-3E family. The propagation delay comparison was extracted from the synthesis report and the static timing report. The design could achieve a propagation delay of 6.771ns using the barrel shifter in the base selection module and multiplier.

Mrs. Toni J. Billore et al. [4]: This paper describes the implementation of an 8-bit Vedic multiplier using fast adders, enhanced in terms of propagation delay when compared with a conventional multiplier. The design of the 8-bit Vedic multiplier using fast adders employs an 8-bit barrel shifter, which requires only one clock cycle for "n" number of shifts. The 8-bit Vedic multiplier using the barrel shifter is implemented and verified using FPGA and ISE Simulator. The core was implemented on an Altera Cyclone® II 2C20 FPGA device. The propagation delay comparison between the 8-bit Vedic multiplier using a barrel shifter and the one using fast adders was extracted from the synthesis report and the static timing report. The implemented design could achieve a propagation delay of 6.781ns using the barrel shifter block in the base selection module and multiplier of the architecture. The work compares the performance of the 8-bit Vedic multiplier using a barrel shifter with that of the multiplier using fast adders.

Pavan Kumar et al. [5]: This paper describes the implementation of an 8-bit Vedic multiplier enhanced in terms of propagation delay when compared with conventional multipliers such as the array multiplier, Braun multiplier, modified Booth multiplier and Wallace tree multiplier. The design uses an 8-bit barrel shifter, which requires only one clock cycle for "n" number of shifts. The design is implemented and verified using FPGA and ISE Simulator. The core was implemented on a Xilinx Spartan-6 family xc6slx75T-3-fgg676 FPGA. The propagation delay comparison was extracted from the synthesis report and the static timing report. The design could achieve a propagation delay of 6.781ns using the barrel shifter in the base selection module and multiplier.

#### III. VEDIC MULTIPLIER

Vedic Mathematics can be divided into 16 different sutras for performing mathematical calculations. Among these, the Urdhwa Tiryakbhyam Sutra is one of the most preferred algorithms for performing multiplication. The algorithm is competent enough to be employed for the multiplication of integers as well as binary numbers. The term "Urdhwa Tiryakbhyam" originates from two Sanskrit words, Urdhwa and Tiryakbhyam, which mean "vertically" and "crosswise" respectively. It is based on a novel concept in which the generation of all partial products is carried out together with the concurrent addition of these partial products. The algorithm can be generalized to n x n bit numbers. Since the partial products and their sums are calculated in parallel, the multiplier requires the same amount of time to calculate the product regardless of the clock frequency of the processor.

The net advantage is that it reduces the need for microprocessors to operate at increasingly high clock frequencies. While a higher clock frequency generally results in increased processing power, its disadvantage is that it also increases power dissipation, which results in higher device operating temperatures. The processing power of the multiplier can easily be increased by increasing the input and output data bus widths, since it has quite a regular structure. Due to this regular structure, it can easily be laid out on a silicon chip. The multiplier has the advantage that, as the number of bits increases, gate delay and area increase very slowly compared with other multipliers. It is therefore time, space and power efficient.

To illustrate this multiplication scheme, let us consider the multiplication of two decimal numbers (14×12).

#### Example- 14×12

The rightmost digit of the multiplicand, the first number (14), i.e., 4, is multiplied by the rightmost digit of the multiplier, the second number (12), i.e., 2. The product 4 × 2 = 8 forms the rightmost part of the answer.




$$\begin{array}{c|c}
1 & 4 \\
1 & 2 \\
\hline
& 8
\end{array}$$

Now, multiply diagonally. Counting digits from the right, multiply the first digit of the multiplicand (14), i.e., 4, by the second digit of the multiplier (12), i.e., 1 (4 × 1 = 4); then multiply the second digit of the multiplicand, i.e., 1, by the first digit of the multiplier, i.e., 2 (1 × 2 = 2); add these two, i.e., 4 + 2 = 6. This gives the second digit of the answer, 6.

$$\begin{array}{ccc}
1 & & 4 \\
 & \times & \\
1 & & 2
\end{array}$$

Now, multiply the second digit of the multiplicand, i.e., 1, and the second digit of the multiplier, i.e., 1, vertically: $1 \times 1 = 1$. This gives the leftmost part of the answer.

$$\begin{array}{ccc}
 & 1 & 4 \\
\times & 1 & 2 \\
\hline
1 & 6 & 8
\end{array}$$

Thus the answer is 168.
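The vertical and crosswise steps of this example can be written as a short Python sketch that generalizes to any number of digits. This is an illustrative software model only; the function name `urdhva_multiply` and the carry handling between columns are assumptions added here, since the 14×12 example never produces a column sum above 9.

```python
def urdhva_multiply(a, b, base=10):
    """Multiply non-negative integers digit by digit, vertically and crosswise."""
    # Split both numbers into digits, least significant first, padded to equal length.
    da, db = [], []
    while a or b:
        da.append(a % base)
        db.append(b % base)
        a //= base
        b //= base
    n = len(da)
    result, carry, weight = 0, 0, 1
    for col in range(2 * n - 1):
        # Column `col` collects every digit product whose positions sum to `col`.
        total = carry + sum(da[i] * db[col - i]
                            for i in range(n) if 0 <= col - i < n)
        result += (total % base) * weight   # digit kept in this column
        carry = total // base               # excess passed to the next column
        weight *= base
    return result + carry * weight


print(urdhva_multiply(14, 12))
```

For 14×12 the three columns evaluate to 8, 6 and 1, matching the steps above. In hardware, all column products can be formed at the same time, which is the parallelism the sutra is valued for.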

#### IV. VEDIC MULTIPLIER USING KS ADDER

The multiplication of two numbers is done using Urdhwa Tiryakbhyam. First, the least significant bits of the two numbers are multiplied. Then the intermediate bits are cross-multiplied and added together. After this, the most significant bits are multiplied. For 16X16-bit multiplication, small blocks of 2X2, 4X4 or 8X8 multipliers are used in parallel to make the process easy and efficient.
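The block-wise construction described here can be modeled in Python as a recursive multiplier that bottoms out in a 2X2 block. This is a behavioral sketch under stated assumptions: the names `vedic_2x2` and `vedic_multiply` are hypothetical, operands are unsigned, and ordinary integer addition stands in for the hardware adders that combine the block outputs.

```python
def vedic_2x2(a, b):
    """2X2-bit Urdhwa Tiryakbhyam block built from three columns of AND terms."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    col0 = a0 & b0                    # vertical: least significant bits
    col1 = (a1 & b0) + (a0 & b1)      # crosswise: intermediate terms
    col2 = a1 & b1                    # vertical: most significant bits
    return col0 + (col1 << 1) + (col2 << 2)


def vedic_multiply(a, b, width):
    """Multiply two unsigned `width`-bit numbers; width is a power of two >= 2."""
    if width == 2:
        return vedic_2x2(a, b)
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # Four half-width products; in hardware these blocks operate in parallel.
    p_ll = vedic_multiply(a_lo, b_lo, half)
    p_hl = vedic_multiply(a_hi, b_lo, half)
    p_lh = vedic_multiply(a_lo, b_hi, half)
    p_hh = vedic_multiply(a_hi, b_hi, half)
    # Align each partial product to its bit position and sum.
    return p_ll + ((p_hl + p_lh) << half) + (p_hh << width)
```

Calling `vedic_multiply(x, y, 16)` expands into four 8X8 blocks, each of which expands into four 4X4 blocks, and so on down to 2X2 blocks.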

In the proposed method, the high-speed carry select adder is replaced by a carry select adder combined with a Kogge Stone (KS) adder, which is claimed to provide better speed and lower propagation delay. Four 8-bit multipliers are used to perform the 16-bit multiplication. The method adds all the partial products formed by the cross multiplication of one bit with another. The lower bits of the first multiplier output, P1 (7-0), give the bits Q (7-0) of the final output. The remaining bits of the first multiplier, P1 (15-8), are combined with the lower 8 bits of the second multiplier output to form 16 bits, which are in turn added to the 16 bits of the third multiplier using the KS adder. The lower bits of the KS adder output form the Q (15-8) bits of the final output. The remaining 8 bits, P2 (15-8), are then added to the upper 8 bits of the KS output to form 16 bits, which are then added to the 16 bits of the fourth multiplier using the second adder (KS 2). The output of the KS 2 adder forms the Q (31-16) bits. In this way the 32-bit output is obtained in the least possible time.
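A behavioral Python sketch of this arrangement is given below. It is an assumption-laden illustration rather than the circuit of Figure 1: `kogge_stone_add` models the parallel-prefix carry computation of a Kogge Stone adder on integers, `multiply_16x16` uses a standard arithmetic decomposition into four 8X8 products (with three additions, which may differ from the exact wiring described above), and Python's `*` stands in for the 8-bit Vedic blocks. All function names are hypothetical.

```python
def kogge_stone_add(a, b, width=16):
    """Add two unsigned `width`-bit numbers; returns (sum, carry_out)."""
    mask = (1 << width) - 1
    g = a & b & mask            # generate: the bit pair creates a carry
    p = (a ^ b) & mask          # propagate: the bit pair passes a carry on
    half_sum = p                # bitwise XOR needed for the final sum
    dist = 1
    while dist < width:
        # Merge each position with the group ending `dist` bits below it.
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist *= 2
    carries = (g << 1) & mask   # carry into bit i comes from bits 0..i-1
    total = half_sum ^ carries
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out


def multiply_16x16(a, b):
    """16X16 product from four 8X8 products combined with Kogge Stone additions."""
    a_lo, a_hi = a & 0xFF, a >> 8
    b_lo, b_hi = b & 0xFF, b >> 8
    p1 = a_lo * b_lo            # stand-ins for the four 8-bit multiplier blocks
    p2 = a_hi * b_lo
    p3 = a_lo * b_hi
    p4 = a_hi * b_hi
    q_low = p1 & 0xFF                               # Q(7-0)
    s1, _ = kogge_stone_add(p2, p1 >> 8)            # fits in 16 bits, no carry
    s2, c2 = kogge_stone_add(s1, p3)                # may produce a carry out
    q_mid = s2 & 0xFF                               # Q(15-8)
    upper = (c2 << 8) | (s2 >> 8)                   # carry plus upper byte
    q_high, _ = kogge_stone_add(p4, upper)          # Q(31-16)
    return (q_high << 16) | (q_mid << 8) | q_low
```

The loop in `kogge_stone_add` runs log2(width) times (four stages for 16 bits), which is the source of the short carry path that motivates the choice of this adder.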



Figure 1: Logic Diagram of Vedic Multiplier using Kogge Stone Adder




#### V. SIMULATION RESULT

All the design and experimental work for the algorithm described in this paper was carried out on Xilinx ISE 14.1i. The tool offers several notable features such as low memory requirement, fast debugging and low cost. The latest release of the ISE<sup>TM</sup> (Integrated Software Environment) design tool reduces the memory requirement by approximately 27 percent. ISE 14.1i provides advanced tools such as smart compile technology, which makes better use of the computing hardware and delivers faster timing closure and higher quality of results, shortening the time to a design solution. The ISE 14.1i tools permit greater flexibility for designs that leverage embedded processors. The ISE 14.1i design suite is accompanied by the release of the ChipScope Pro<sup>TM</sup> 14.1i debug and verification software, with which the design can be debugged easily. Also included is the newest release of the ChipScope Pro Serial IO Toolkit, providing simplified debugging of high-speed serial IO designs for Virtex-4 FX and Virtex-5 LXT and SXT FPGAs. With the help of this tool, designs can be developed in the areas of communication, signal processing and low-power VLSI. To simplify multi-rate DSP and DHT designs with the large number of clocks typically found in wireless and video applications, the ISE 14.1i software features advancements in place-and-route and clock algorithms offering up to a 15 percent performance advantage. Xilinx ISE 14.1i keeps the memory requirement low while providing expanded support for Microsoft Windows Vista, Microsoft Windows XP x64 and Red Hat Enterprise WS 5.0 32-bit operating systems.



Figure 2: RTL View of 8-bit Vedic Multiplier



Figure 3: RTL View of 4-bit Vedic Multiplier






Figure 4: RTL View of 2-bit Vedic Multiplier

#### VI. CONCLUSION

The high-speed implementation of such a multiplier has a wide range of applications in image processing, arithmetic logic units and VLSI signal processing. The proposed 8x8 Vedic multiplier architecture has been designed and synthesized using Xilinx software. The proposed Vedic multiplier with the modified Kogge Stone adder is compared with the existing Vedic multiplier using a carry select adder along with Common Boolean Logic, and it can be inferred that the proposed architecture is faster than the existing Vedic multiplier. In future, the performance of the proposed multiplier can be improved by high-level pipelining and applied in signal processing applications such as image and video processing.

### REFERENCES

- [1] Deepak Kumar Patel, Raksha Chouksey and Minal Saxena, "Design of Fast FIR Filter Using Compressor and Carry Select Adder", 3rd International Conference on Signal Processing and Integrated Networks (SPIN), 2016.
- [2] G. Gokhale and P. D. Bahirgonde, "Design of Vedic Multiplier using Area-Efficient Carry Select Adder", 4th IEEE International Conference on Advances in Computing, Communications and Informatics (ICACCI-2015), Kochi, India, August 10-13, 2015.
- [3] G. Gokhale and S. R. Gokhale, "Design of Area and Delay Efficient Vedic Multiplier Using Carry Select Adder", 4th IEEE International Conference on Advances in Computing, Communications and Informatics (ICACCI-2015), Kochi, India, August 10-13, 2015.
- [4] Pavan Kumar, Saiprasad Goud A and A Radhika, "FPGA Implementation of high speed 8-bit Vedic multiplier using barrel shifter", 978-1-4673-6150-7/13, IEEE.
- [5] B. Madhu Latha and B. Nageswar Rao, "Design and Implementation of High Speed 8-Bit Vedic Multiplier on FPGA", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 3, Issue 8, August 2014.
- [6] A Murali, G Vijaya Padma and T Saritha, "An Optimized Implementation of Vedic Multiplier Using Barrel Shifter in FPGA Technology", Journal of Innovative Engineering, 2(2), 2014.
- [7] Sweta Khatri and Ghanshyam Jangid, "FPGA Implementation of 64-bit fast multiplier using barrel shifter", Vol. 2, Issue VII, July 2014, ISSN: 2321-9653.
- [8] Toni J. Billore and D. R. Rotake, "FPGA implementation of high speed 8 bit Vedic Multiplier using Fast adders", Journal of VLSI and Signal Processing, Volume 4, Issue 3, Ver. II (May-Jun. 2014), pp. 54-59, e-ISSN: 2319-4200, p-ISSN: 2319-4197.
- [9] S. S. Kerur, Prakash Narchi, Jayashree C N, Harish M Kittur and Girish V A, "Implementation of Vedic Multiplier for Digital Signal processing", International Conference on VLSI, Communication & Instrumentation (ICVCI), 2011.
- [10] Vaibhav Jindal, Navaid Zafar Rizvi and Dinesh Kumar Singh, "VHDL Code of Vedic Multiplier with Minimum Delay Architecture", National Conference on Synergetic Trends in Engineering and Technology (STET-2014), International Journal of Engineering and Technical Research, ISSN: 2321-0869, Special Issue.
- [11] Bhavin D Marul and Altaf Darvadiya, "VHDL Implementation of 8-Bit Vedic Multiplier Using Barrel Shifter", International Journal for Scientific Research & Development, Vol. 2, Issue 01, 2014, ISSN (online): 2321-0613.