# Design of Low Power 2-D Discrete Wavelet Transform using Hybrid Encoded Booth Multiplier

## S.Saravanan

Professor, Department of Electrical and Electronics Engineering, Muthayammal Engineering College, Tamilnadu, India

Abstract- The design of a low power, high performance 2-D Discrete Wavelet Transform (DWT) unit is presented in this paper. A low power multiplier with a hybrid encoding scheme is proposed to reduce power consumption compared with other common multipliers. Multiplication is the main arithmetic operation in the lifting scheme of the DWT, so the proposed method reduces the total power requirement. The lifting step and the multiplier are designed and synthesized on a Xilinx Spartan 3E Field Programmable Gate Array (FPGA). The power consumption of the DWT with the hybrid encoded Booth multiplier is compared with that of the DWT built on existing array and Booth multipliers. The simulation results show that the DWT with the hybrid multiplier saves 90% and 76% of the total power dissipation compared with the array and Booth multiplier designs, respectively.

*Index terms*- Discrete Wavelet Transform, Low power, VLSI Design, Booth Multiplier, Hybrid encoding.

## I INTRODUCTION

The growing popularity of portable and multimedia devices such as video phones and notebooks has motivated research into low power VLSI circuit design in the recent past. Real-time image processing systems demand high computational power and a high data throughput rate, which limits the use of general purpose processors. Recently, wavelet-based video coding techniques have gained much attention, and the Discrete Wavelet Transform (DWT) has been widely applied in many fields of audio and video signal processing. DWT supports features such as progressive image transmission, ease of compressed image manipulation and region-of-interest coding. In general, DWT can be implemented by direct convolution, and several DWT architectures based on filter convolution have been proposed. The high algorithmic performance of the 2-D DWT in image compression justifies its use as the kernel of both the JPEG2000 still image compression standard and the MPEG-4 texture coding standard. JPEG2000 can compress an image to as little as one hundredth of its original size.

With this compression ratio, the reconstructed JPEG2000 image still provides good visual quality. However, a convolution-based implementation suffers from a large number of computations and a large storage requirement. The lifting scheme, which reduces the number of computations, has been proposed for the DWT. It has several advantages, including in-place computation of the wavelet coefficients and integer-to-integer wavelet transforms. Several 2-D DWT VLSI implementations based on the lifting scheme have been proposed. Because of the computational complexity of the DWT, much work has focused on developing fast algorithms with high efficiency and low hardware cost. DSP algorithms have traditionally been implemented on dedicated DSP chips or Application Specific Integrated Circuits (ASICs). In the last few years, Field Programmable Gate Arrays (FPGAs) have become a viable alternative to their sequential counterparts for executing DSP algorithms. It is well known that dynamic power consumption increases with the number of switching activities. This has further motivated the design of low power VLSI circuits based on new concepts. Reducing power consumption while enhancing the circuit design poses challenges in implementing digital image processing systems, in which multiplication is the key computation. In the recent past, researchers have proposed various design methodologies that reduce dynamic power by minimizing switching activity.
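The dependence of dynamic power on switching activity can be made explicit with the standard first-order CMOS power model. The symbols below are not defined in the text and are introduced here only as assumptions for illustration:

$$P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f$$

where $\alpha$ is the switching activity factor (the average fraction of nodes toggling per clock cycle), $C_L$ is the switched load capacitance, $V_{DD}$ is the supply voltage and $f$ is the clock frequency. Because $\alpha$ enters linearly, halving the switching activity of a block halves its dynamic power to first order, with the other three quantities held fixed.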

## II TWO-DIMENSIONAL DISCRETE WAVELET TRANSFORM

A 2-D DWT can be seen as a 1-D wavelet transform along the rows followed by a 1-D wavelet transform along the columns. It is realized in a straightforward manner by inserting an array transposition between the two 1-D DWTs. The rows of the array are processed first, with only one level of decomposition. This divides the array into two vertical halves: the first half stores the average coefficients and the second half stores the detail coefficients. The process is then repeated on the columns, resulting in four sub-bands within the array, each defined by its filter output.

The LL sub-band represents an approximation of the original image; the LL1 sub-band can be considered a 2:1 sub-sampled version of the original image, both horizontally and vertically. The other three sub-bands, HL1, LH1 and HH1, contain higher frequency detail information, mostly local discontinuities at the edges in the image. This process is repeated for as many levels of decomposition as desired, which makes the 2-D DWT a multilevel decomposition technique.
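The row-column decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not the architecture evaluated in this paper: it assumes an integer Haar lifting step (chosen because it needs no multiplier and is easy to check by hand), an image with even width and height, and hypothetical function names. The sub-band labels follow the convention that the first letter refers to the row (horizontal) filter.

```python
def haar_lift_1d(x):
    """One level of integer Haar lifting on an even-length sequence.

    Returns (approx, detail). Assumed wavelet, chosen for simplicity.
    """
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]            # predict step
    approx = [e + (d >> 1) for e, d in zip(even, detail)]  # update step
    return approx, detail


def haar_unlift_1d(approx, detail):
    """Exact inverse of haar_lift_1d (integer-to-integer)."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dwt2d_level(image):
    """One decomposition level: rows first, transpose, then columns."""
    # Row pass: each row becomes [average half | detail half].
    rows = []
    for row in image:
        approx, detail = haar_lift_1d(row)
        rows.append(approx + detail)
    # Column pass, implemented as a row pass on the transposed array.
    cols = []
    for col in transpose(rows):
        approx, detail = haar_lift_1d(col)
        cols.append(approx + detail)
    out = transpose(cols)
    h, w = len(out) // 2, len(out[0]) // 2
    ll = [r[:w] for r in out[:h]]   # low rows, low columns
    hl = [r[w:] for r in out[:h]]   # high along rows, low along columns
    lh = [r[:w] for r in out[h:]]   # low along rows, high along columns
    hh = [r[w:] for r in out[h:]]   # high in both directions
    return ll, hl, lh, hh
```

For a constant 4×4 image every detail coefficient is zero and the LL sub-band is a 2×2 copy of the constant, which matches the description of LL as a sub-sampled approximation. Applying `dwt2d_level` again to the returned `ll` gives the next decomposition level.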

A line-based architecture scans the input image row by row to produce the wavelet coefficients, whereas a block-based architecture scans the input image block by block and produces the wavelet coefficients for each block. The main difference between the two methods is therefore the image traversal order (complete image rows or blocks). The line-based architecture needs to store only a few lines of the image, whereas traditional methods need almost the whole image (or tile) to be held in memory. Thus, this technique does not require extra or external memory for the intermediate data. Instead, internal buffers store the intermediate data, and the required memory size is proportional to the image width or height. Our design targets high performance, low power consumption and small hardware size.

## III HYBRID ENCODED LOW POWER MULTIPLIER

The operation of the hybrid encoded multiplier is divided into encoder selection, partial product generation and partial product compression. The multiplier and the multiplicand are stored in registers M1 and M2, and the number of 1's in the multiplier is checked by the bit checker. Based on the number of 1's, the encoder selector selects either the proposed hybrid encoder or the modified encoder. A clock gating circuit prevents the two encoders from operating simultaneously. In the partial product compression stage, the partial products (PP) are compressed using 4:2 compressors, and row bypassing can be used when an entire row of the PP is zero.



This is done by freezing the adder while the above condition holds, which is expected to reduce the switching activity and hence the power consumption. A column bypassing provision at the final adder tree avoids unwanted addition operations. The detection logic circuit detects the effective data range. If part of the input data has no impact on the final result, the data controlling circuit freezes that portion to avoid unnecessary switching transitions. A glue circuit controls the carry and the sign extension unit, which manages the sign bit.
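The partial product generation and row bypassing steps can be illustrated with a small behavioural model. The sketch below uses standard radix-4 (modified) Booth recoding, not the hybrid encoder proposed here, whose selection rule is not specified in the text and is therefore not modelled. The function names and the 8-bit default width are assumptions for illustration. A Booth digit of zero produces an all-zero partial product row, which is the case in which a row-bypassing multiplier can freeze the corresponding adder row.

```python
def booth_radix4_digits(multiplier, width=8):
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; digit i has weight 4**i.
    """
    if width % 2:
        raise ValueError("width must be even for radix-4 recoding")
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    shifted = bits << 1                     # appends the implicit y[-1] = 0
    digits = []
    for i in range(width // 2):
        triplet = (shifted >> (2 * i)) & 0b111  # y[2i+1], y[2i], y[2i-1]
        y_prev = triplet & 1
        y_low = (triplet >> 1) & 1
        y_high = (triplet >> 2) & 1
        digits.append(-2 * y_high + y_low + y_prev)
    return digits


def booth_multiply(multiplicand, multiplier, width=8):
    """Multiply two signed integers via Booth partial products.

    Returns (product, bypassed_rows), where bypassed_rows counts the
    all-zero partial product rows that were skipped.
    """
    product = 0
    bypassed_rows = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, width)):
        if digit == 0:
            bypassed_rows += 1  # zero row: nothing to add
            continue
        product += (digit * multiplicand) << (2 * i)
    return product, bypassed_rows
```

For example, `booth_multiply(7, -3)` recodes -3 as the digits `[1, -1, 0, 0]` and returns `(-21, 2)`: two of the four partial product rows are zero and are skipped. In hardware the saving comes from the suppressed switching in those rows rather than from fewer loop iterations.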

## IV X-POWER OVERVIEW

XPower is the power-analysis software available for programmable logic design. It allows power consumption to be analyzed interactively and automatically for Xilinx FPGA and CPLD devices. XPower includes both interactive (xpower) and batch (xpwr) applications. Total device power and power per net can be analyzed early in the design flow for routed, partially routed or unrouted designs, either from a comprehensive graphical interface or in command-line batch mode. To set the estimation stimulus and reduce setup time, XPower also reads VCD simulation data from the ModelSim family of HDL simulators, as well as from the additional simulators listed in its simulator support. The XPower tool flow is shown in Fig. 1.

XPower files: XPower recognizes the following file types.

Design files: A design file is an NCD (FPGA device) or CXT (CPLD device) file that contains information about the design.

Physical constraints file: A Physical Constraints File (PCF) is a text file containing two separate sections: a section for those physical constraints created by the mapper and a section for physical constraints entered by the user. Temperature, voltage, max delay and time graphs are read from the PCF.

Settings file: Specifies a settings file (\*\_xpwr.xml) to be used by XPower. A settings file is an XML-based file that represents the current state of the power data, constrained by the reporting options that have already been specified. This file is generated by XPower when settings are saved and is used to restore the settings.

Simulation file: Specifies a simulation file (\*.vcd) to be used by XPower. This file is the output of a simulation run on the design. IEEE standard VCD files are accepted for input of simulation data.

## V RESULTS AND DISCUSSIONS

The hardware descriptions of the proposed architectures were written in the Verilog hardware description language and simulated with the ModelSim simulation tool. The architectures were synthesized with the Xilinx FPGA Express tools and evaluated on the Xilinx Spartan 3E FPGA. A typical performance evaluation of 2-D DWT architectures covers delay, power consumption and device utilization. The array multiplier, the Booth multiplier and the proposed hybrid multiplier were simulated and implemented. Fig. 9 and Fig. 10 show the device utilization and power consumption of the proposed hybrid encoded Booth multiplier.

The performance of the array multiplier, the Booth multiplier and the proposed hybrid multiplier is shown in Table 1.

Table 1. Performance analysis of various multipliers

| Multipliers    | Power in mW | Device utilization: number of slices in % | Device utilization: number of 4-input LUTs in % |
|----------------|-------------|-------------------------------------------|-------------------------------------------------|
| Array          | 88          | 36                                        | 33                                              |
| Booth          | 75          | 18                                        | 18                                              |
| Hybrid encoded | 69          | 9                                         | 8                                               |

The power consumption, delay and device utilization of the array multiplier, the Booth multiplier and the proposed hybrid encoded multiplier are compared. The proposed hybrid multiplier consumes 69 mW, against 88 mW for the array multiplier and 75 mW for the Booth multiplier, and occupies 9% of the slices, against 36% and 18%, so it performs better than the other multipliers on both measures.

The architecture of the 128×128 DWT was written in Verilog, synthesized with the Xilinx FPGA Express tools and implemented on the Xilinx Spartan 3E FPGA. The power consumption, delay and device utilization of the DWT with the array multiplier, the Booth multiplier and the proposed hybrid encoded multiplier are compared in Table 2. The results show that the DWT with the proposed hybrid multiplier performs better than the DWT with the other multipliers.

Table 2. Performance analysis of DWT with various multipliers

| Multipliers    | Power in mW | Device utilization: number of slices in % | Device utilization: number of 4-input LUTs in % |
|----------------|-------------|-------------------------------------------|-------------------------------------------------|
| Array          | 2904        | 66                                        | 59                                              |
| Booth          | 652         | 22                                        | 19                                              |
| Hybrid encoded | 292         | 7                                         | 6                                               |
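The relative power savings implied by the tabulated figures can be recomputed with a short script. This is a minimal sketch: the dictionaries simply transcribe the power columns of Table 1 and Table 2 as printed, and the function and variable names are hypothetical.

```python
def power_saving_percent(baseline_mw, proposed_mw):
    """Percentage reduction in power relative to the baseline design."""
    return 100.0 * (baseline_mw - proposed_mw) / baseline_mw


# Power in mW, transcribed from Table 1 (multipliers) and Table 2 (DWT).
TABLE1_MW = {"array": 88, "booth": 75, "hybrid": 69}
TABLE2_MW = {"array": 2904, "booth": 652, "hybrid": 292}

for label, table in (("multiplier", TABLE1_MW), ("DWT", TABLE2_MW)):
    for baseline in ("array", "booth"):
        saving = power_saving_percent(table[baseline], table["hybrid"])
        print(f"{label}: hybrid vs {baseline}: {saving:.1f}% lower power")
```

With the Table 2 figures as printed, the saving of the hybrid-based DWT evaluates to about 90% relative to the array-based design and about 55% relative to the Booth-based design.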

## VI CONCLUSION

The design of a low power, high performance 2-D Discrete Wavelet Transform (DWT) unit is presented in this paper. A low power multiplier with a hybrid encoding scheme is proposed to reduce power dissipation compared with other common multipliers. This paper implements a lifting step function used in the second generation DWT. Multiplication is the main arithmetic operation in the lifting scheme, so the proposed method reduces the total power requirement. The lifting step and the multiplier were designed and synthesized on a Xilinx Spartan 3E Field Programmable Gate Array (FPGA). The power consumption of the DWT with the hybrid multiplier is compared with that of the DWT built on existing array and Booth multipliers. The simulation results show that the DWT with the hybrid multiplier saves 90% and 76% of the total power dissipation compared with the array and Booth multiplier designs, respectively.
