





## INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH

IN COMPUTER & COMMUNICATION ENGINEERING

Volume 11, Issue 12, December 2023



**Impact Factor: 8.379** 












| DOI: 10.15680/IJIRCCE.2023.1112027 |

# Study of Discrete Cosine Transform for Digital Signal Processing Application using CORDIC Algorithm

Srikant Bhusan<sup>1</sup>, Prof. Suresh S. Gawande<sup>2</sup>

M. Tech. Scholar, Department of Electronics and Communication, RKDF College of Engineering, Bhabha University, Bhopal, India<sup>1</sup>

Guide, Department of Electronics and Communication, RKDF College of Engineering, Bhabha University, Bhopal, India<sup>2</sup>

ABSTRACT:- Arithmetic coding compresses an array of data to a size very close to that implied by the number of all possible permutations, which explains why compression has a theoretical limit. The compression process reduces to narrowing the limits LOW and HIGH of a chosen interval at every step, and compression is achieved by choosing the shortest fraction. Low-power design is one of the most significant challenges in maximizing battery lifetime in portable devices and in saving energy during system operation. The Discrete Cosine Transform (DCT) is the most popular transform used today in image and video compression systems. In this paper, we review low-power DCT architectures built using various methods. A number of algorithms have been proposed for implementing the DCT. Loeffler (1989) specified a new class of 1D-DCT using only 29 additions and 11 multiplications. Implementing such an algorithm requires integrating one or more multipliers, which occupy a large silicon area. Distributed arithmetic is generally used for such algorithms. The reconfigurable 8-point DCT has been coded in VHDL and targeted to a Xilinx FPGA.

**KEYWORDS**: - Discrete Cosine Transform (DCT), Inverse Discrete Cosine Transform (IDCT), Very High Speed Integrated Circuit Hardware Description Language (VHDL)

#### I. INTRODUCTION

The Discrete Cosine Transform (DCT) is the most widely used transformation technique in image and video compression standards. Major applications of the DCT are teleconferencing, remote medical consultation, facsimile transmission, satellite imaging and picture phones. It is used in the compression (redundancy reduction) of a wide range of signals such as speech, TV signals, colour print images, infrared images, synthetic aperture radar (SAR) images and surface texture [1, 2]. An image contains a large amount of data, which requires a large amount of memory for storage and is inconvenient to transmit over bandwidth-limited channels. To solve this, the image data must be compressed. Image compression is a technique that reduces the data in either a lossless or a lossy way. Lossy compression produces images of different sizes depending on the image quality needed for a particular application. Many standards have been framed for image compression. For still images, the Joint Photographic Experts Group (JPEG) standards are used; JPEG is the most widely used image compression standard and uses the DCT to transform the image from the spatial domain to the frequency domain [3]. For motion video, the Moving Picture Experts Group (MPEG) standards are used. All of these compression standards rely on the DCT, so there is a need to design a DCT architecture that can be used for image compression. The DCT is computationally intensive, so its architecture requires a large area, which leads to higher power dissipation.



e-ISSN: 2320-9801, p-ISSN: 2320-9798 | www.ijircce.com | Impact Factor: 8.379 | Monthly Peer Reviewed & Referred Journal |




Figure 1: DCT

Much work has been carried out to minimize the number of multiplications in the DCT architecture, because multiplication consumes more power and area than addition. In addition, all of these applications need low power and high speed without quality degradation [4, 5].

Such application areas include video on demand and broadcasting. Meeting their requirements is a challenge for traditional video coding paradigms, so there is a need for a low-cost, low-power encoding device, possibly at the expense of a slightly more complex decoder. An additional challenge is to match the efficiency of traditional coding techniques, such as MPEG-x or H.26x, when the complexity shifts from the encoder to the decoder.

#### II. LITERATURE REVIEW

Rih-Lung Chung et al. [1], recently, much research has focused on the design of biped robots with stable and smooth walking ability, identical to human beings, and thus, in the coming years, biped robots will accomplish rescue or exploration tasks in challenging environments. To achieve this goal, one of the important problems is to design a chip for real-time calculation of moving length and rotation angle of the biped robot. This paper presents an efficient and accurate coordinate rotation digital computer (CORDIC)-based efficient chip design to calculate the moving length and rotation angle for each step of the biped robot. In a previous work, the hardware cost of the accurate CORDIC-based algorithm of biped robots was primarily limited by the scale-factor architecture. To solve this problem, a binomial approximation was carefully employed for computing the scale-factor. In doing so, the CORDIC-based architecture can achieve similar accuracy but with fewer iterations, thus reducing hardware cost. Hence, incorporating CORDIC-based architecture with binomial approximation, pipelined architecture, and hardware sharing machines, this paper proposes a novel efficient and accurate CORDIC-based chip design by using an iterative pipelining architecture for biped robots. In this design, only low-complexity shift and add operators were used for realizing efficient hardware architecture and achieving the real-time computation of lengths and angles for biped robots. Compared with current designs, this work reduced hardware cost by 7.2%, decreased average errors by 94.5%, and improved average executing performance by 31.5%, when computing ten angles of biped robots.

**Neha K. Nawandar et al. [2],** Most digital signal processing applications perform operations such as multiplication, addition, square-root calculation and the solution of linear equations. A physical implementation of these operations consumes a lot of hardware, while a software implementation consumes a large amount of memory. Even when implemented in hardware, they do not provide high speed, and for this reason software implementations still dominate hardware ones today. For realizing operations from basic to very complex ones with less hardware, a Co-ordinate Rotation Digital Computer (CORDIC) proves beneficial. It can perform mathematical operations ranging from addition to highly complex functions with the help of an arithmetic unit and shifters only. This paper gives a brief overview of various existing CORDIC architectures, their working principles and application domains, and a comparison of these architectures. Different designs are available depending on the target, i.e. high accuracy and precision, low area, low latency, hardware efficiency, low power, reconfigurability, etc., and can be chosen according to the application in which the architecture is to be employed.

Choudhary Sadhana et al. [3], in FPGA systems, memory is one of the major limiting components for handling large amounts of data. Because FPGAs have limited on-chip memory, resources must be used efficiently in order to handle system constraints such as power, size and on-demand performance. Several methods of on-chip data compression have been studied by researchers, and this work continues to develop. Minimizing power consumption and resource utilization contributes towards the realization of power-efficient high-level data processing on FPGAs. In this paper, the main intention is to show the effectiveness of ABRC in terms of FPGA implementation aspects. ABRC does not use any look-up table, which reduces the memory utilization of the entropy encoder and provides a regular mechanism for handling the trade-off between the accuracy of probability estimation and the speed of probability adaptation. In the result analysis section, the authors compare against existing coding techniques such as JPEG and the MQ-coder (i.e., the JPEG2000 standard). On a Xilinx FPGA, the considered ABRC architecture gives reduced memory, consumes less power and has a comparable operating frequency.

Tsounis et al. [4], in this paper, we assess the error resilience of an image data compression IP core, an FPGA-based accelerator of the CCSDS 121.0-B-2 algorithm used to compress the ESA PROBA-3 ASPIICS Coronagraph System Payload image data. We have enhanced a fault injection platform previously proposed for the SEU evaluation of FPGA soft processor cores so that it interfaces with the target image data compression IP core and computes the image quality metrics needed for failure analysis. Through an extensive fault injection campaign, we analyze the vulnerability of the image compression core to Single Event Upsets (SEUs) in the configuration memory of an SRAM FPGA. The soft errors are classified and evaluated according to their effects on the operation of the compression core and on the quality of the reconstructed images, based on the structural similarity index metric (SSIM). The experimental fault injection results demonstrate an error resilience intrinsic to the image compression algorithm implementation, which can be exploited to trade off an acceptable lossless compression performance degradation, or an insignificant effect on compression fidelity, for significant savings in FPGA resource usage (23% LUTs and 17% FFs) using selective protection of the compression core modules.

IS Morina et al. [5], audio files are generally larger than text files. Large files cause various obstacles, such as large storage space requirements and long transmission times. File compression is one solution to the problem of large file sizes. Arithmetic coding is one algorithm that can be used to compress audio files. The arithmetic coding algorithm encodes the audio file by replacing a row of input symbols with a single floating-point number, so that the output of the encoding is a value greater than 0 and smaller than 1. The compression and decompression of audio files in this study is carried out on several wave files. Wave files are a standard audio file format developed by Microsoft and IBM and are stored using PCM (Pulse Code Modulation) coding. The wave file compression ratio obtained in this study was 16.12 percent, with an average compression time of 45.89 seconds and an average decompression time of 0.32 seconds.

Jiajia Chen et al. [6], with the introduction of the high efficiency video coding (HEVC) standard, which provides superior compression efficiency, there has been a lot of research on integer transform matrices that closely approximate the discrete cosine transform (DCT) used in HEVC. Besides maintaining the coding performance, the hardware and power of the circuit that implements the derived integer DCT (Int-DCT) need to be minimized. To address these multiple design considerations, a new multi-objective optimization algorithm is proposed in this paper to search for an efficient Int-DCT matrix whose coding performance is as close as possible to that of the transform in HEVC but which is implemented with reduced hardware and power. Experimental results show that the approximated Int-DCT matrix generated by the proposed algorithm can achieve almost the same coding performance as the transforms in HEVC, measured in terms of Bjøntegaard Delta rate. Meanwhile, the experiments demonstrate that the proposed 16-point Int-DCT has at least 15.5% and 26.8% lower circuit area in FPGA and ASIC respectively, compared with other state-of-the-art Int-DCT realizations that provide similar coding performance.

S. U. Uvaysov et al. [7], the paper considers a method for lossless data compression with preliminary sorting in real time. The compression method is based on analysing the frequency distribution of the incoming data stream, identifying regularity by sorting and, on this basis, subsequent compression. The combination of sorting and data compression proposed in the article saves processing time and allows the network controller load to be managed dynamically. A hardware implementation of the algorithm on an FPGA for sorting and compressing data during stream processing is considered. The solution makes it possible to implement the algorithm as an IP core that can be adapted to the characteristics of the data compression task at hand, thereby increasing system performance. The algorithm can be used to build embedded applications with limited computing resources and time-critical requirements. A device based on the method showed stable operation when processing data from a multichannel system of high-speed sensors. The algorithm can be applied to building telecommunication networks of distributed control systems and data processing subsystems of the Internet of Things and the Internet of Robotic Things.

Linbin Chen, Jie Han et al. [8], CORDIC, or CO-ordinate Rotation Digital Computer, is a fast, simple, well-understood and capable algorithm used in a wide range of Digital Signal Processing applications. To meet the speed and accuracy requirements of today's applications, the authors put forward a variable-iteration CORDIC algorithm, in which the number of iterations is reduced for a specified accuracy in order to increase speed. This improves the efficiency of the conventional CORDIC algorithm, which is then used to compute the Discrete Cosine Transform for image processing. The one-dimensional Discrete Cosine Transform is implemented using only 6 CORDIC blocks, which need only 6 multipliers. Owing to the simplicity of the hardware, the speed of image processing on the FPGA is increased. A further increase in speed can be achieved by processing several macroblocks concurrently during the DCT.

Mamatha I et al. [9], the Discrete Fourier Transform (DFT) is widely used in signal processing for spectral analysis, filtering, image enhancement, OFDM and so on. The cyclic convolution based approach is one of the techniques used for computing the DFT. Using this approach, an N-point DFT can be computed using four sets of [(M-1)/2]-point cyclic convolutions, where M is an odd number and N = 4M. This work proposes an architecture for convolution-based DFT and its FPGA implementation. The proposed architecture comprises a pre-processing element, a systolic array and a post-processing stage. The processing element of the systolic array uses a tag bit to select the type of operation (addition/subtraction) applied to the input signals. The proposed architecture is simulated for a 28-point DFT using ModelSim 6.5 and synthesized using Xilinx ISE 10.1 with a Virtex-5 xc5vfx100t-3ff1738 FPGA as the target device, and it can operate at a maximum frequency of 224.9 MHz. The performance analysis is carried out in terms of hardware utilization and computation time and is compared with existing similar architectures. Further, as the convolution-based DCT has two systolic arrays like that of the DFT, a unified architecture is proposed for the 1D DFT/1D DCT.

**L. Chen et al. [10],** Low-power design is one of the most important challenges in extending battery life in portable devices and in saving energy during system operation. In this paper, we propose a low-power DCT architecture using a modified multiplier-less CORDIC arithmetic. The proposed architecture does not perform arithmetic operations on unnecessary bits during the CORDIC calculation. The experimental results show that power dissipation can be reduced by up to 26.1% without compromising the final DCT results. Furthermore, the speed of the proposed architecture is increased by about 10%. The proposed low-power DCT architecture can be applied to consumer electronics and portable multimedia systems requiring high throughput and low power.

#### **Problem Formulation**

The main goal of this thesis is to present a new high-performance, resource-efficient CORDIC algorithm which has an optimum trade-off between the computation time and hardware area for DSP applications. Modifications in the architecture of CORDIC algorithm will be suggested to reduce the hardware area and number of iterations needed for the algorithm to converge. The main requirements of real-time applications are high-throughput and low-latency. The fully-pipelined unrolled architecture will be proposed to achieve the high-throughput. The high-radix CORDIC algorithm will be proposed to reduce the iterations and computational time of the CORDIC algorithm. A scaling-free rotation will be employed in high-radix CORDIC algorithm to overcome the problem of a variable scale factor. Finally, the CORDIC algorithm will be configured in a circular rotation mode to perform the twiddle multiplication. A variable-length DCT algorithm will be implemented on FPGA using our proposed CORDIC algorithm.

#### III. DISCRETE COSINE TRANSFORM

A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. DCTs are important to numerous applications in science and engineering, from lossy compression of audio (e.g. MP3) and images (e.g. JPEG), where small high-frequency components can be discarded, to spectral methods for the numerical solution of partial differential equations. The use of cosine rather than sine functions is critical for compression, since fewer cosine functions are needed to approximate a typical signal, whereas for differential equations the cosines express a particular choice of boundary conditions.
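For reference, the 8-point case can be written compactly. The form below is the standard DCT-II definition with the normalization that matches the $F(0)$ expression given in this section; the symbol $c(k)$ is introduced here only for illustration and does not appear in the original text.

$$F(k) = \frac{1}{2}\,c(k)\sum_{n=0}^{7} f(n)\cos\frac{(2n+1)k\pi}{16}, \qquad c(0) = \cos\frac{\pi}{4},\quad c(k) = 1 \ \text{for}\ k = 1,\dots,7$$

Each output $F(k)$ is therefore a weighted sum of the eight inputs, and the symmetry of the cosine weights is what allows inputs to be paired as sums and differences before multiplication.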




Figure 2: 8-point Discrete Cosine Transform

#### DCT output:

$$F(0) = 0.5\left[\{f(0) + f(1) + f(2) + f(3) + f(4) + f(5) + f(6) + f(7)\}\cos\frac{\pi}{4}\right]$$

$$F(1) = 0.5\left[\{f(0) - f(7)\}\cos\frac{\pi}{16} + \{f(1) - f(6)\}\cos\frac{3\pi}{16} + \{f(2) - f(5)\}\cos\frac{5\pi}{16} + \{f(3) - f(4)\}\cos\frac{7\pi}{16}\right]$$

$$F(2) = 0.5\left[\{f(0) - f(3) - f(4) + f(7)\}\cos\frac{2\pi}{16} + \{f(1) - f(2) - f(5) + f(6)\}\cos\frac{6\pi}{16}\right]$$

$$F(3) = 0.5\left[\{f(0) - f(7)\}\cos\frac{3\pi}{16} + \{f(6) - f(1)\}\cos\frac{7\pi}{16} + \{f(5) - f(2)\}\cos\frac{\pi}{16} + \{f(4) - f(3)\}\cos\frac{5\pi}{16}\right]$$

$$F(4) = 0.5\left[\{f(0) + f(3) + f(4) + f(7) - f(1) - f(2) - f(5) - f(6)\}\cos\frac{\pi}{4}\right]$$

$$F(5) = 0.5\left[\{f(0) - f(7)\}\cos\frac{5\pi}{16} + \{f(6) - f(1)\}\cos\frac{\pi}{16} + \{f(2) - f(5)\}\cos\frac{7\pi}{16} + \{f(3) - f(4)\}\cos\frac{3\pi}{16}\right]$$

$$F(6) = 0.5\left[\{f(0) - f(3) - f(4) + f(7)\}\cos\frac{6\pi}{16} + \{f(2) + f(5) - f(1) - f(6)\}\cos\frac{2\pi}{16}\right]$$

$$F(7) = 0.5\left[\{f(0) - f(7)\}\cos\frac{7\pi}{16} + \{f(6) - f(1)\}\cos\frac{5\pi}{16} + \{f(2) - f(5)\}\cos\frac{3\pi}{16} + \{f(4) - f(3)\}\cos\frac{\pi}{16}\right]$$
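The sum and difference groupings used for the DCT outputs can be checked numerically. The following is a minimal Python sketch, assuming the standard 8-point DCT-II definition $F(k) = 0.5\,c(k)\sum_n f(n)\cos\frac{(2n+1)k\pi}{16}$ with $c(0) = \cos\frac{\pi}{4}$ and $c(k) = 1$ otherwise. The function names `dct8_direct` and `dct8_butterfly` are hypothetical, and the signs in the grouped form are derived from that definition rather than taken from any particular hardware design.

```python
import math

def dct8_direct(f):
    """Direct 8-point DCT-II: 64 multiply-accumulate steps."""
    out = []
    for k in range(8):
        c = math.cos(math.pi / 4) if k == 0 else 1.0
        s = sum(f[n] * math.cos((2 * n + 1) * k * math.pi / 16) for n in range(8))
        out.append(0.5 * c * s)
    return out

def dct8_butterfly(f):
    """Same transform using sums and differences of mirrored inputs."""
    def C(m):
        # Shorthand for cos(m*pi/16); hardware would hold these as constants.
        return math.cos(m * math.pi / 16)

    # Sums of mirrored inputs feed the even outputs F(0), F(2), F(4), F(6).
    s0, s1, s2, s3 = f[0] + f[7], f[1] + f[6], f[2] + f[5], f[3] + f[4]
    # Differences of mirrored inputs feed the odd outputs F(1), F(3), F(5), F(7).
    d0, d1, d2, d3 = f[0] - f[7], f[1] - f[6], f[2] - f[5], f[3] - f[4]

    F = [0.0] * 8
    F[0] = 0.5 * (s0 + s1 + s2 + s3) * C(4)
    F[4] = 0.5 * (s0 + s3 - s1 - s2) * C(4)
    F[2] = 0.5 * ((s0 - s3) * C(2) + (s1 - s2) * C(6))
    F[6] = 0.5 * ((s0 - s3) * C(6) - (s1 - s2) * C(2))
    F[1] = 0.5 * (d0 * C(1) + d1 * C(3) + d2 * C(5) + d3 * C(7))
    F[3] = 0.5 * (d0 * C(3) - d1 * C(7) - d2 * C(1) - d3 * C(5))
    F[5] = 0.5 * (d0 * C(5) - d1 * C(1) + d2 * C(7) + d3 * C(3))
    F[7] = 0.5 * (d0 * C(7) - d1 * C(5) + d2 * C(3) - d3 * C(1))
    return F

sample = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, 6.0]
max_diff = max(abs(a - b) for a, b in zip(dct8_direct(sample), dct8_butterfly(sample)))
```

In this sketch the grouped form needs 22 coefficient multiplications instead of 64, which is the kind of saving that motivates the butterfly structure before any CORDIC-based rotation is introduced.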

#### IV. CORDIC ALGORITHM

The proposed radix-8 CORDIC algorithm takes n/3 iterations to compute the total rotation and three additional iterations to compensate for the scale factor. The following points are discussed.

- Iteration equations are proposed for the radix-8 CORDIC algorithm in circular rotation mode. The proposed equations process three bits of the input angle in each iteration.
- The SRT division method is adapted to the radix-8 CORDIC algorithm to determine the selection function $\sigma_i$ and to establish convergence for an input angle $z_0 \in [-\pi/2, \pi/2]$. The convergence of the radix-8 CORDIC algorithm is proved by induction in two parts, for $i = 0$ and $i > 0$.


- A traditional way to compensate for the scale factor is to precompute all possible scale factors, store them in a ROM and apply the compensation later using shift-and-add operations. A new method is presented in which the scale factor of each iteration is approximated using 5 bits, and shift-and-add operations are then performed directly based on the value of the scale factor. This method does not require any ROM to store precomputed scale factors.
- Fully pipelined, unfolded architectures, in redundant-arithmetic and area-efficient variants, are proposed to compute the radix-8 CORDIC iterations. Redundant arithmetic is used to achieve high throughput, whereas the area-efficient architecture is proposed to reduce the total area. The unfolded pipelined architecture has separate hardware for each iteration/stage, so all stages can compute in parallel, which increases the throughput of the proposed CORDIC algorithm.
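The text does not reproduce the radix-8 iteration equations, so they are not sketched here. As a point of comparison only, the following Python sketch shows the conventional radix-2 CORDIC in circular rotation mode, which processes one bit of the angle per iteration and has a constant scale factor. The function name `cordic_rotate` and the iteration count are assumptions made for illustration, and floating-point arithmetic stands in for the fixed-point shifts and adds a hardware design would use.

```python
import math

def cordic_rotate(x, y, z, n_iter=16):
    """Rotate (x, y) by angle z in radians (|z| <= pi/2), radix-2 CORDIC."""
    # Elementary rotation angles atan(2^-i); hardware keeps these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(n_iter)]
    # Constant gain of the radix-2 iterations, compensated once at the end.
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    for i in range(n_iter):
        # Selection function: rotate towards zero residual angle.
        sigma = 1.0 if z >= 0 else -1.0
        # Multiplying by 2^-i is a right shift in fixed-point hardware.
        x, y = x - sigma * y * 2.0 ** -i, y + sigma * x * 2.0 ** -i
        z -= sigma * angles[i]
    return x / gain, y / gain

# Rotating the unit vector (1, 0) by pi/3 approximates (cos(pi/3), sin(pi/3)).
cos_approx, sin_approx = cordic_rotate(1.0, 0.0, math.pi / 3)
```

Because each radix-2 iteration resolves roughly one bit of the angle, an n-bit result needs about n iterations. A radix-8 scheme that resolves three bits per iteration shortens this to about n/3, at the cost of a more involved selection function and scale factor.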

#### V. CONCLUSION

The radix-2 CORDIC algorithm exhibits a constant scale factor that can be treated as a system gain in many applications. However, for the high-radix CORDIC algorithm and the advanced hybrid CORDIC algorithm [18], the estimation and compensation of the scale factor is the main problem. Each iteration scales up the rotating vector by a factor that must be compensated to obtain the correct value of the rotated vector. High-radix and advanced hybrid CORDIC algorithms have a scale factor that depends on the selection function, and therefore on the input angle. A large ROM and several adders are required to store and compensate for the precomputed variable scale factor, which results in considerable overhead for many real-time applications. Additional iterations may be required to compensate for such a complicated scale factor.
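The difference between a constant and a variable scale factor can be illustrated with a simplified model. The Python sketch below assumes the per-iteration gain $\sqrt{1 + \sigma_i^2\,2^{-2i}}$ of a radix-2 style iteration and compares digit sequences with and without zero digits. The function name `scale_factor` and the digit sequences are hypothetical, and this is not the radix-8 scale factor of the proposed algorithm.

```python
import math

def scale_factor(sigmas):
    """Accumulated gain for a digit sequence sigma_i (simplified radix-2 model)."""
    k = 1.0
    for i, s in enumerate(sigmas):
        k *= math.sqrt(1.0 + (s * 2.0 ** -i) ** 2)
    return k

n = 16
# Digits restricted to {-1, +1}: the gain is the same for every input angle.
k_all_plus = scale_factor([+1] * n)
k_alternating = scale_factor([+1, -1] * (n // 2))
# Digit set that also allows 0: skipped iterations add no gain,
# so the total gain now depends on the digits chosen for the angle.
k_with_zeros = scale_factor([+1, 0, -1, 0] * (n // 4))
```

In this model the first two sequences give the same gain of about 1.6468, while the sequence containing zeros gives a clearly different value, which is why a data-dependent scale factor has to be tracked and compensated for each input angle.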

#### REFERENCES

- [1] Rih-Lung Chung, Yen Hsueh, Shih-Lun Chen and Patricia Angela R. Abu, "Efficient and Accurate CORDIC Pipelined Architecture Chip Design Based on Binomial Approximation for Biped Robot", *Electronics*, Vol. 11, Issue 11, 2022.
- [2] Neha K. Nawandar and Vishal R. Satpute, "A study and comparison of Co-ordinate Rotation DIgital Computer (CORDIC) architectures", arXiv:2211.04053v1, 2022.
- [3] Choudhary Sadhana and Sarika Raga, "A Comparative Analysis at Binary Arithmetic Coders on FPGA System", International Conference on Industry 4.0 Technology, pp. 136-140, IEEE, 2020.
- [4] Tsounis and M. Psarakis, "Analyzing the Resilience to SEUs of an 'Image-Data' Compression Core in a COTS SRAM FPGA", NASA/ESA Conference, Colchester, UK, pp. 17-24, 2019.
- [5] IS Morina and PDP Silitonga, "Compression and Decompression of Audio Files Using the Arithmetic Coding Method", Scientific Journal of Informatics, 6-1, 2019.
- [6] Jiajia Chen, Shumin Liu, Gelei Deng and Susanto Rahardja, "Hardware Efficient Integer Discrete Cosine Transform for Efficient Image/Video Compression", IEEE Access, Vol. 07, 2019.
- [7] S. U. Uvaysov, V. A. Kokovin and S. S. Uvaysova, "Real-time sorting and lossless compression of data on FPGA", 2018 MWENT, Moscow, pp. 1-5, 2018.
- [8] Linbin Chen, Jie Han, Weiqiang Liu and Fabrizio Lombardi, "Algorithm and Design of a Fully Parallel Approximate Coordinate Rotation Digital Computer (CORDIC)", IEEE Transactions on Multi-Scale Computing Systems, Vol. 3, Issue 3, pp. 139-151, IEEE, 2017.
- [9] Mamatha I, Nikhita Raj J, Shikha Tripathi and Sudarshan TSB, "Systolic Architecture Implementation of 1D DFT and 1D DCT", International Conference on IEEE, 2016.
- [10] L. Chen, J. Han, W. Liu and F. Lombardi, "On the design of approximate restoring dividers for error-tolerant applications", IEEE Trans. Comput., Vol. 65, No. 8, pp. 2522-2533, Aug. 2016.
- [11] H. Jiang, J. Han and F. Lombardi, "A comparative review and evaluation of approximate adders", Proc. 25th Great Lakes Symp. VLSI, pp. 343-348, IEEE, 2015.
- [12] Teena Susan Elias and Dhanusha P B, "Area Efficient Fully Parallel Distributed Arithmetic Architecture for One-Dimensional Discrete Cosine Transform", 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT), IEEE, 2014.
- [13] Uma Sadhvi Potluri and Arjuna Madanayake, "Improved 8-Point Approximate DCT for Image and Video Compression Requiring Only 14 Additions", IEEE Transactions on Circuits and Systems I: Regular Papers, 2014.
- [14] E. Jebamalar Leavline, S. Megala and D. Asir Antony Gnana Singh, "CORDIC Iterations Based Architecture for Low Power and High Quality DCT", 2014 International Conference on Recent Trends in Information Technology, IEEE, 2014.
- [15] K. Kalyani, D. Sellathambi and S. Rajaram, "Reconfigurable FFT using CORDIC based architecture for MIMO-OFDM receivers", Thiagarajar College of Engineering, Madurai, 2014.














