Hindawi Journal of Electrical and Computer Engineering Volume 2019, Article ID 9029526, 11 pages https://doi.org/10.1155/2019/9029526



## Research Article

# An Efficient Design of DCT Approximation Based on Quantum Dot Cellular Automata (QCA) Technology

# Ismail Gassoumi, Lamjed Touil, Bouraoui Ouni, and Abdellatif Mtibaa

<sup>1</sup>Laboratory of Electronics and Microelectronics, University of Monastir, Monastir, Tunisia

Correspondence should be addressed to Ismail Gassoumi; gassoumiismail@gmail.com

Received 29 November 2018; Accepted 5 September 2019; Published 2 October 2019

Academic Editor: Amir Sabbagh Molahosseini

Copyright © 2019 Ismail Gassoumi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Power optimization is one of the most important design objectives in modern digital image processing applications. The DCT is one of the most essential techniques in image and video compression systems, and consequently researchers have carried out extensive work on its power optimization. Quantum-dot cellular automata (QCA), in turn, present a novel opportunity for designing highly parallel architectures and algorithms that improve the performance of image and video processing systems. QCA also has considerable advantages over CMOS technology, such as extremely low power dissipation, high operating frequency, and small size. Therefore, in this study, the authors propose a multiplier-less DCT architecture in QCA technology. The proposed design provides high circuit performance, very low power consumption, and a very small footprint, outperforming existing conventional structures. The QCADesigner tool has been used for QCA circuit design and functional verification of all designs in this work. QCAPro, a widely used power estimation tool, is applied to estimate the power dissipation of the proposed circuit. The suggested design achieves a 53% improvement in power over the conventional solution. The outcome of this work can open up a new window of opportunity for low-power image processing systems.

## 1. Introduction

In recent years, considerable research has been devoted to transform techniques such as the fast Fourier transform (FFT), the discrete cosine transform (DCT), and the discrete wavelet transform (DWT), which are extensively used in various digital signal processing (DSP) applications [1, 2]. The FFT is an essential transform in DSP with applications in signal filtering, frequency analysis, and compression. The DWT is a widely used time-frequency method for the analysis of nonstationary signals. The DCT has been widely exploited for real-life data compression: its energy compaction and decorrelation properties make it very close to the Karhunen–Loève transform (KLT), and it is therefore preferable for data compression applications. It is an essential conversion between the time and frequency domains in various applications of speech and image processing, communication systems, and signal processing [3]. It is therefore used to map an image from the spatial domain into the frequency domain. The DCT is extensively used in several image and video compression standards such as JPEG [4], MPEG-1 [5], MPEG-2 [6], H.261 [7], H.263 [8], and others [9, 10]. A direct implementation of the DCT algorithm is not efficient because of its floating-point calculations and complex loops. In fact, floating-point algorithms are slow in software and require more silicon area in hardware implementations [11]. However, the DCT must be calculated in a very short time. In this context, a large number of DCT approximations have been proposed over the last few years to decrease the complexity of this transform [12–14]. Indeed, the demand for higher quality video has increased because of the enormous number of electronic

<sup>2</sup>Higher Institute of Technological Studies of Sousse, Monastir, Tunisia

<sup>3</sup>Networked Objects Control & Communication Systems Lab, University of Sousse, Sousse, Tunisia

devices that process digital video at ever higher resolutions. Thus, power optimization and area minimization are the two principal research areas in very large-scale integration (VLSI) design for embedded and handheld devices that employ image processing algorithms. Up to now, complementary metal-oxide-semiconductor (CMOS) based VLSI technology has been extensively used to improve the quality of image processing systems. However, traditional transistors cannot shrink much below their current size, which strongly affects the speed, performance, and power consumption of future designs. The challenges created by this trend could be partially met by innovative technologies proposed as alternatives to classic CMOS. Presently, the single-electron transistor (SET), tunnel field-effect transistor (TFET), carbon nanotube (CNT), and silicon nanowire transistor are being explored as alternatives to conventional VLSI technology [15, 16]. Among such alternatives, quantum-dot cellular automata (QCA) is one of the most promising solutions for designing ultra-low-power and very high-speed digital circuits [17, 18]. QCA technology offers a revolutionary approach to computing at the nanoscale, with a promising future because of its ability to achieve high performance in terms of device density, clock frequency, and power consumption. It is expected to achieve a very high device density of $10^{12}$ devices/cm<sup>2</sup>, switching speeds of 10 ps, and a power dissipation of 100 W/cm<sup>2</sup> [19]. Consequently, efficient circuit designs based on this technology would reduce computational complexity and power consumption. These benefits make the proposed QCA approach useful for image processing applications on portable communication devices, where low power consumption is in demand.

Recently, some efforts have been made towards the design of QCA logic circuits for image processing applications such as the MAC operation [20], BinDCT [21], image steganography [22], morphological edge detection [23], thresholding [24], noise removal [25], and morphological erosion and dilation [26]. This context motivates us to investigate a new low-power DCT architecture based on QCA technology.

In this paper, we first present an optimized adder circuit built from a three-input XOR gate and a three-input majority gate, which is then used to design an eight-bit ripple carry adder (RCA). Furthermore, an efficient QCA D flip-flop (DFF) is designed and used as the building block of a parallel-in parallel-out (PIPO) shift register. The designed RCA and PIPO shift register are then combined to realize the QCA DCT architecture. The power dissipation of the proposed DCT design is estimated, and the reliability of the proposed QCA circuit is also explored.

The remainder of this paper is organized as follows: Section 2 provides the background of DCT algorithm. Section 3 presents an overview of the QCA. Section 4 discusses the DCT power optimization by QCA technology. Section 5 shows the discussions and results of the proposed DCT architecture. Finally, conclusions are drawn in Section 6.

## 2. DCT Algorithm

The discrete cosine transform (DCT) plays a critical role in image and video compression owing to its near-optimal decorrelation efficiency [3]. The DCT is similar to the discrete Fourier transform (DFT) and is used to compress both colour and grayscale images. The main advantage of image transformation using the DCT is the removal of redundancy between neighbouring pixels. DCT approximations that combine low bit rates with low computational complexity are preferred, and significant research has been devoted to reducing the computational complexity of the DCT [13, 27–34]. In [13], a low-power DCT architecture requiring only sixteen additions is proposed. A low-complexity orthogonal 8 × 8 transform matrix for fast image compression, requiring only fourteen additions and two shift operations, is proposed in [33]. A new DCT matrix that requires only 12 additions is reported in [34]; it achieves low power consumption when implemented in hardware. Several other studies have sought to improve the performance of the DCT module and reduce the processing complexity [35, 36]. Power consumption nevertheless remains a fundamental problem when designing embedded video applications, since embedded and handheld devices face energy constraints imposed by their size and weight. This stimulates designers to search for new solutions that provide low power consumption for video processing applications. QCA technology, which is motivated by low-power electronic design, has attracted considerable attention in this respect. In this paper, we use the digital architecture (Figure 1) proposed in [34]. It can be implemented quite easily using adders and parallel-in parallel-out (PIPO) shift registers.
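For orientation, the exact transform that such approximations target is the orthonormal 8-point DCT-II. The following minimal sketch, in pure Python, builds that exact matrix and applies it to a smooth row of sample values to illustrate energy compaction. The helper names `dct_matrix` and `dct` and the sample row are illustrative assumptions; the sketch does not reproduce the 12-addition approximate matrix of [34].

```python
import math

def dct_matrix(n=8):
    """Return the orthonormal n-point DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        # The DC row uses a different scale factor so that the matrix is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def dct(x):
    """Apply the exact DCT-II to the sequence x."""
    n = len(x)
    c = dct_matrix(n)
    return [sum(c[k][i] * x[i] for i in range(n)) for k in range(n)]

# Hypothetical slowly varying row of pixel values (not data from the paper).
row = [52, 55, 61, 66, 70, 61, 64, 73]
coeffs = dct(row)

# Share of the signal energy captured by the two lowest-frequency coefficients.
energy = sum(v * v for v in coeffs)
low_freq_share = sum(v * v for v in coeffs[:2]) / energy
```

For smooth inputs such as this one, most of the energy lands in the first few coefficients, which is the property that compression standards exploit and that multiplier-less approximations try to preserve.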

## 3. QCA Fundamentals

The QCA approach, introduced in 1993 by Lent et al. [18], is a candidate to replace field-effect transistor (FET) based devices at the nanoscale. QCA cells are generally classified into four implementation types: metal-island, nanomagnetic, semiconductor, and molecular. In QCA technology, binary information is encoded in the polarization of quantum-dot cells and transmitted from cell to cell. This nanotechnology was conceived based on some of Landauer's ideas regarding energy-efficient and robust digital devices [37]. A QCA circuit consists of an array of cells. Each cell contains four quantum dots at the corners of a square, and each dot can hold a single electron. Only two electrons are injected into a cell, and Coulomb repulsion forces them into diametrically opposite dots [38]. Two polarizations (labelled −1 and +1) are therefore possible; they represent binary "0" and binary "1", respectively, as shown in Figure 2. Figure 3 shows the propagation of logic "0" and logic "1" from input to output in QCA binary wires due to Coulombic repulsion. In general, the Coulombic interaction between electrons in neighbouring cells is used to implement many logic functions, which are controlled by the clocking mechanism [39].

FIGURE 1: Structure of DCT [34].

FIGURE 2: Two different polarizations of the quantum-dot cell.

FIGURE 3: QCA binary wires.

3.1. Logic Gates. The majority gate and the inverter are the fundamental logic gates in QCA implementations, and each is composed of a few QCA cells. Several types of inverter and majority gates are shown in Figure 4. In the inverter gate, the output is the inverse of the input. The majority gate acts as an AND gate or an OR gate when one input is permanently set to 0 or 1, respectively. Its logical function can be expressed by the following equation:

$$M(A, B, C) = AB + BC + AC. \tag{1}$$
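The reduction of the majority gate to AND and OR can be checked with a short behavioural sketch. The function names below are illustrative; the code models only the Boolean function of equation (1), not the QCA cell layout.

```python
def majority(a, b, c):
    """Three-input majority vote: AB + BC + AC, as in equation (1)."""
    return (a & b) | (b & c) | (a & c)

def and_gate(a, b):
    # Fixing the third input to 0 reduces the majority gate to AND.
    return majority(a, b, 0)

def or_gate(a, b):
    # Fixing the third input to 1 reduces the majority gate to OR.
    return majority(a, b, 1)

# Each tuple is (a, b, a AND b, a OR b).
truth_table = [(a, b, and_gate(a, b), or_gate(a, b))
               for a in (0, 1) for b in (0, 1)]
```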

3.2. QCA Clocking. The clocking system is an important factor in the dynamics of QCA. Its principal functions are the synchronization of data flow and the implementation of adiabatic cell operation, which gives QCA circuits high energy efficiency [40]. QCA clocking generally has four phases, namely switch, hold, release, and relax, as illustrated in Figure 5. During the switch phase, in which the actual computation occurs, the barriers are raised, and the cell is influenced by the polarization of its adjacent cells and acquires a definite polarization. During the hold phase, the barriers are high and the polarization of the cell is retained. During the release phase, the barriers are lowered and the cell loses its polarization. During the relax phase, the cell is unpolarized [41].
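The interplay of the four phases across clock zones can be sketched as a simple schedule. This is a minimal model that assumes the common arrangement of four zones, each lagging the previous one by a quarter of the clock period; `zone_phase` and `schedule` are hypothetical names introduced here for illustration.

```python
PHASES = ("switch", "hold", "release", "relax")

def zone_phase(zone, step):
    """Phase of clock zone `zone` (0 to 3) at quarter-period time step `step`.

    Assumption: zone k lags zone k-1 by one quarter of the clock period.
    """
    return PHASES[(step - zone) % 4]

# Rows are time steps, columns are clock zones 0..3.
schedule = [[zone_phase(zone, step) for zone in range(4)] for step in range(4)]
```

In this model, whenever a zone is in its switch phase, the zone feeding it is in its hold phase, so the switching cells see stable inputs. This is what lets data advance from one zone to the next.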

3.3. Crossovers in QCA. Two approaches are used to cross two wires in QCA: multilayer crossovers and coplanar crossings. Multilayer QCA circuits consume far less area than coplanar circuits; however, they may be expensive and difficult to manufacture. In this paper, we use the coplanar crossing approach in designing our DCT architecture, since the multilayer technique incurs a high cost due to fabrication issues. Coplanar crossing requires two cell types (regular and rotated cells), as shown in Figure 6(a). It has already been applied in several studies [37, 42].



FIGURE 4: Inverter gate (a) and majority gate (b, c).



FIGURE 5: QCA clock zones.



FIGURE 6: Signal crossover schemes: (a) coplanar crossing and (b) multilayer crossing.

## 4. QCA Implementation of the DCT

In this section, we present a new DCT architecture based on QCA technology that mitigates the computational complexity and power consumption issues. The architecture is composed of two stages (stage 1 and stage 2). The submodules used in designing our DCT architecture are eight-bit adders and PIPO shift registers, which store the results generated by the adders. Reducing the cell count and area of these components therefore contributes directly to achieving low power.

4.1. Study of Stage 1. This stage is composed of eight 8-bit full adders and eight 8-bit PIPO shift registers.

4.1.1. Eight-Bit Adder. The adder circuit plays an important role in arithmetic circuits. Recently, several attempts have been made to implement efficient adder circuits in QCA technology [43–50]. In addition, the XOR gate [51] can readily be used in the synthesis of adder designs. In this subsection, we propose a novel QCA full adder circuit based on a majority gate and a three-input XOR gate. The inputs are A, B, and $C_{\rm in}$. The outputs are the carry-out ($C_{\rm out}$) and the sum. They are given, respectively, by the following equations:

$$Carry = M(A, B, C_{in}), \tag{2}$$

$$Sum = XOR_3(A, B, C_{in}). \tag{3}$$
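Equations (2) and (3) can be verified exhaustively with a short behavioural sketch. The function names are illustrative, and the code checks only the logic, not the timing or layout of the QCA circuit.

```python
def majority(a, b, c):
    """Three-input majority gate, used for the carry in equation (2)."""
    return (a & b) | (b & c) | (a & c)

def xor3(a, b, c):
    """Three-input exclusive-OR gate, used for the sum in equation (3)."""
    return a ^ b ^ c

def full_adder(a, b, c_in):
    """Return (sum, carry) for one-bit inputs a, b and carry-in c_in."""
    return xor3(a, b, c_in), majority(a, b, c_in)

# Evaluate all eight input combinations.
results = {(a, b, c): full_adder(a, b, c)
           for a in (0, 1) for b in (0, 1) for c in (0, 1)}
```

For every input combination, `sum + 2 * carry` equals `a + b + c_in`, which confirms that one majority gate and one three-input XOR gate suffice for a full adder.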

The QCA layout of the proposed full adder is depicted in Figure 7. It consists of one majority gate and one three-input exclusive-OR gate. According to QCADesigner (version 2.0.3), the design consists of 45 cells and covers an area of $0.04\,\mu\text{m}^2$. The proposed design provides correct outputs after a delay of two clock phases, as depicted in the simulation waveform in Figure 8. The eight-bit adder performs the computing function of the proposed DCT architecture. An eight-bit ripple carry adder can be constructed by cascading eight copies of the proposed full adder circuit in series (Figure 9(a)). To perform a correct parallel addition, additional cells may be placed at the inputs and outputs in different clock zones for circuit synchronization. The layout of the eight-bit ripple carry adder (RCA) is shown in Figure 9(b). This design uses 526 cells and requires 9 clock phases to generate the final output.

FIGURE 7: Proposed QCA layout of the FA circuit.
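The cascading of eight full adders can be illustrated with a behavioural sketch in which the carry of each bit position feeds the next. The names `full_adder` and `ripple_carry_add` are illustrative, and the model deliberately ignores clock zones and the synchronization cells mentioned above.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: sum from a three-input XOR, carry from a majority gate."""
    s = a ^ b ^ c_in
    carry = (a & b) | (b & c_in) | (a & c_in)
    return s, carry

def ripple_carry_add(x, y, width=8):
    """Add two unsigned integers bit by bit; return (sum modulo 2**width, carry-out)."""
    carry = 0
    total = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)   # carry ripples to the next stage
        total |= s << i
    return total, carry

example = ripple_carry_add(200, 100)   # 300 does not fit in 8 bits
```

Here `example` is `(44, 1)`: the eight sum bits hold 300 − 256 = 44 and the final carry-out is set.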

4.1.2. QCA 8-Bit PIPO. In this subsection, the design of the proposed 8-bit PIPO shift register is explained. The basic building block of a PIPO shift register is the flip-flop, mainly a D-type flip-flop. Figure 10 illustrates the proposed QCA flip-flop. It can be built using majority and inverter gates. The logic equation of the D flip-flop is represented by the following equation:

$$Q_{(t)} = \text{CLK} \cdot D + \overline{\text{CLK}} \cdot Q_{(t-1)}. \tag{4}$$

Here, the input "D" is copied to the output "Q" only when the clock input is active; otherwise, the previous output is retained. The proposed design comprises 42 cells with an area of $0.04\,\mu\text{m}^2$. It takes five clock periods for the inputs to reach the output, and the first meaningful output appears on the sixth clock. Figure 11 presents the simulation results of the QCA D flip-flop.
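As a behavioural illustration of equation (4), the storage element can be expressed with two majority gates acting as AND, one acting as OR, and one inverter. This is only one possible gate-level reading of the equation; the actual arrangement of Figure 10 is not reproduced here, and `d_latch` is a hypothetical name.

```python
def majority(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (b & c) | (a & c)

def d_latch(clk, d, q_prev):
    """Next output according to equation (4)."""
    load = majority(clk, d, 0)            # CLK AND D
    hold = majority(1 - clk, q_prev, 0)   # (NOT CLK) AND previous Q
    return majority(load, hold, 1)        # OR of the two terms

# Drive the element with a short (clk, d) sequence, starting from Q = 0.
q = 0
trace = []
for clk, d in [(1, 1), (0, 0), (0, 1), (1, 0)]:
    q = d_latch(clk, d, q)
    trace.append(q)
```

The resulting `trace` is `[1, 1, 1, 0]`: the output follows D while the clock is active and keeps its previous value otherwise.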

Figures 12 and 13 show, respectively, the schematic and the QCA layout of the proposed eight-bit PIPO shift register. It consists of eight QCA D flip-flops that share a common clock signal. The input data D0, D1, ..., D7 are loaded into the register simultaneously, in parallel. The output data Q0, Q1, ..., Q7 are available in parallel at the output of each D flip-flop. The proposed QCA layout is composed of 407 cells with an area of $0.52\,\mu\text{m}^2$. It has a critical path length of 35 clock zones.
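The parallel load and hold behaviour of the register can be modelled by applying the flip-flop equation to all eight bits under one shared clock. The class below is a hypothetical behavioural sketch; it does not model the 35 clock zones of the layout.

```python
class PipoRegister:
    """Behavioural model of an n-bit parallel-in parallel-out register."""

    def __init__(self, width=8):
        self.q = [0] * width

    def step(self, clk, d):
        """Load every bit of d when clk is 1; otherwise keep the stored bits."""
        if len(d) != len(self.q):
            raise ValueError("input width does not match register width")
        self.q = [(clk & di) | ((1 - clk) & qi) for di, qi in zip(d, self.q)]
        return list(self.q)

reg = PipoRegister()
loaded = reg.step(1, [1, 0, 1, 1, 0, 0, 1, 0])   # clock active: D0..D7 are loaded
held = reg.step(0, [0, 0, 0, 0, 0, 0, 0, 0])     # clock inactive: Q0..Q7 are retained
```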

4.2. Study of Stage 2. This stage is composed of eight 8-bit full adders and four 8-bit PIPO shift registers. The same full adder and PIPO shift register proposed for the first stage are used in this stage.

## 5. Results and Discussions

The proposed designs are implemented and simulated using the QCADesigner 2.0.3 tool [52], targeting semiconductor QCA technology. The parameters used for the simulation are as follows: cell width = 18 nm, cell height = 18 nm, cell-to-cell spacing = 2 nm, dot diameter = 5 nm, number of samples = 12,800, convergence tolerance = 0.001, radius of effect = 80 nm, relative permittivity = 12.9, clock high = $9.8 \times 10^{-22}$ J, clock low = $3.8 \times 10^{-23}$ J, clock amplitude factor = 2, layer separation = 11.5 nm, and maximum iterations per sample = 100. The spacing between two wires is two cells wide, and each clock zone contains at least two cells. In this design, the coplanar wire crossing method has been used.

Tables 1–4 compare the proposed QCA submodules with previously reported designs in terms of circuit complexity.

The proposed subcircuits of the QCA DCT approximation have lower circuit complexity and better performance than the existing ones. As shown in Table 1, the designed full adder improves on the design in [53] by 78%, 85%, and 75% in terms of cell count, area, and delay, respectively. Compared with the design in [49], the proposed full adder improves the cell count and delay by 8.16% and 50%, respectively. Table 2 shows that the proposed 8-bit adder reduces the cell count by 33%, the area by 5.3%, and the delay by 65% compared with the circuit in [47]. In addition, the cell count, area, and delay of the designed QCA D flip-flop are considerably improved compared with the QCA circuits in [21, 56–58], as listed in Table 3. Table 4 summarizes the comparison for the eight-bit PIPO, which outperforms the design in [21] in cell count and area by 27% and 29%, respectively. The proposed submodules can therefore contribute directly to a low-power DCT design.

Since QCA computation involves no electrical current flow, the power consumption of the proposed design is much lower than that of the classical CMOS-based solution. We employed the QCAPro software [59] to estimate the power dissipation of the proposed DCT design. The power consumption of the entire system is 0.091 mW. This value is considerably lower than those reported in the literature for CMOS technology [34, 60, 61]. According to Table 5, the proposed architecture dissipates nearly 53% less power than the one presented in [34]. Moreover, the proposed design can operate at a higher frequency (above 1 GHz) than the conventional solution. These performance gains indicate that the proposed module could be a good candidate for numerous video and image applications. Consequently, this architecture can be useful for future high-definition video applications, as it helps meet the real-time constraints of the most recent high-resolution video formats.
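The percentage improvements quoted in this section follow from the relative reduction (reference − proposed) / reference applied to the values listed in Tables 1, 4, and 5. The short sketch below recomputes several of them; the helper `improvement` is an illustrative name, and the inputs are the tabulated values.

```python
def improvement(reference, proposed):
    """Percentage reduction of `proposed` relative to `reference`."""
    return 100.0 * (reference - proposed) / reference

# Full adder versus [53] (Table 1): cell count, area in square micrometres, delay in cycles.
fa_vs_53 = [improvement(206, 45), improvement(0.28, 0.04), improvement(2, 0.5)]

# Full adder versus [49] (Table 1): cell count and delay.
fa_vs_49 = [improvement(49, 45), improvement(1, 0.5)]

# Eight-bit PIPO versus [21] (Table 4): cell count and area.
pipo_vs_21 = [improvement(562, 407), improvement(0.74, 0.52)]

# DCT power versus [34] (Table 5), both in mW.
power_vs_34 = improvement(0.1954, 0.091)
```

The computed values are approximately 78.2%, 85.7%, and 75% for the full adder against [53], 8.16% and 50% against [49], 27.6% and 29.7% for the PIPO register, and 53.4% for the power dissipation.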



FIGURE 8: Simulated input-output waveform of the proposed FA circuit.



FIGURE 9: Proposed (a) logical diagrams and (b) QCA layouts of 8-bit parallel binary full adder.



FIGURE 10: Proposed (a) logical diagram and (b) QCA layout of D flip-flop.



FIGURE 11: Simulated input-output waveform of the proposed D flip-flop.



FIGURE 12: Block diagram of the proposed 8-bit PIPO shift register.



FIGURE 13: QCA layout of the 8-bit PIPO shift register.

TABLE 1: Comparison of the proposed adder with the previous works.

| Circuit         | Cell count | Area (μm²) | Delay (clock cycles) | Crossover type            |
|-----------------|------------|------------|----------------------|---------------------------|
| Full adder [43] | 135        | 0.14       | 1.25                 | Multilayer                |
| Full adder [44] | 93         | 0.087      | 1                    | Multilayer                |
| Full adder [45] | 73         | 0.080      | 0.75                 | Multilayer                |
| Full adder [46] | 220        | 0.36       | 3                    | Coplanar                  |
| Full adder [53] | 206        | 0.28       | 2                    | Not required              |
| Full adder [47] | 102        | 0.097      | 2                    | Coplanar                  |
| Full adder [43] | 59         | 0.043      | 1                    | Coplanar (clocking based) |
| Full adder [49] | 49         | 0.04       | 1                    | Coplanar (clocking based) |
| Proposed adder  | 45         | 0.04       | 0.5                  | Coplanar                  |

TABLE 2: Comparison of the proposed 8-bit adder with the previous works.

| Circuit          | Cell count | Area (μm²) | Delay (clock cycles) | Crossover type |
|------------------|------------|------------|----------------------|----------------|
| 8-bit adder [54] | 1782       | 1.49       | 10                   | Multilayer     |
| 8-bit adder [47] | 789        | 0.948      | 10                   | Multilayer     |
| 8-bit adder [55] | 517        | 0.59       | 10                   | Multilayer     |
| 8-bit adder [48] | 572        | 0.492      | 11                   | Coplanar       |
| Proposed adder   | 526        | 0.89       | 3.5                  | Coplanar       |

TABLE 3: Comparison of the proposed D flip-flop with the previous works.

| Circuit      | Cell count | Area (μm²) | Delay (clock cycles) | Crossover type |
|--------------|------------|------------|----------------------|----------------|
| DFF [56]     | 66         | 0.08       | 1.5                  | Coplanar       |
| DFF [57]     | 49         | 0.05       | 1                    | Not required   |
| DFF [58]     | 46         | 0.03       | 0.75                 | Not required   |
| DFF [21]     | 46         | 0.05       | 1.25                 | Not required   |
| Proposed DFF | 42         | 0.04       | 1.5                  | Not required   |

TABLE 4: Comparison of the proposed 8-bit PIPO with the previous works.

| Circuit       | Cell count | Area (μm²) | Delay (clock zones) | Crossover type |
|---------------|------------|------------|---------------------|----------------|
| PIPO [21]     | 562        | 0.74       | N.A.                | Not required   |
| Proposed PIPO | 407        | 0.52       | 35                  | Not required   |

TABLE 5: Comparison of the proposed DCT with the previous works.

| Transform          | Power (mW) |
|--------------------|------------|
| Transform in [60]  | 29.78      |
| Transform in [61]  | 12.4       |
| Transform in [34]  | 0.1954     |
| Proposed transform | 0.091      |

With the advances being made in QCA technology and the ever-increasing computational requirements of image processing, this work can open up a new window of opportunity in this field.

The effect of temperature variations on the polarization of the output cell in the proposed DCT design has also been investigated. The output polarization was evaluated at different temperatures, and the result is depicted in Figure 14. According to this figure, the DCT circuit works properly between 1 K and 6 K. Above 6 K, the output polarization drops dramatically and the design starts to malfunction.

FIGURE 14: The effect of temperature variations on polarization of the output cell in the proposed design.

## 6. Conclusion

Area minimization and low power are two indispensable requirements for portable multimedia devices that run image processing algorithms. QCA technology offers several advantages, such as very low power dissipation, high functional density, and improved computing speed (in the terahertz range), and facilitates further miniaturization at the nanoscale. In this paper, a novel design of a DCT approximation in QCA technology has been presented. The proposed design consumes 0.091 mW. The operating frequency of this architecture can exceed 1 THz. This work provides high circuit performance, very low power consumption, and a very small footprint compared with traditional VLSI technology. The outcome of this work can open up a new window of opportunity for low-power video designs. Future extensions, such as various applications based on this QCA DCT, could be investigated.

## **Data Availability**

The data used to support the findings of this study are available from the corresponding author upon request.

## **Conflicts of Interest**

The authors declare that they have no conflicts of interest.

## References

- [1] A. Gupta, S. D. Joshi, and P. Singh, "On the approximate discrete KLT of fractional Brownian motion and applications," *Journal of the Franklin Institute*, vol. 355, no. 17, pp. 8989–9016, 2018.
- [2] P. Singh, "Novel Fourier quadrature transforms and analytic signal representations for nonlinear and non-stationary time-series analysis," *Royal Society Open Science*, vol. 5, no. 11, Article ID 181131, 2018.
- [3] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," *IEEE Transactions on Computers*, vol. 23, no. 1, pp. 90–93, 1974.
- [4] W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, New York, NY, USA, 1992.
- [5] N. Roma and L. Sousa, "Efficient hybrid DCT-domain algorithm for video spatial downscaling," *EURASIP Journal on Advances in Signal Processing*, vol. 2007, no. 1, 2007.

- [6] International Organisation for Standardisation, ISO/IEC JTC1/SC29/WG11: Generic Coding of Moving Pictures and Associated Audio Information—Part 2: Video, International Organisation for Standardisation, Geneva, Switzerland, 1994.
- [7] International Telecommunication Union, ITU-T Recommendation H. 261 Version 1: Video Codec for Audiovisual Services at P X 64 kbits, International Telecommunication Union (ITU-T), Geneva, Switzerland, 1990.
- [8] International Telecommunication Union, ITU-T Recommendation H. 263 Version 1: Video Coding for Low Bit Rate Communication, International Telecommunication Union (ITU-T), Geneva, Switzerland, 1995.
- [9] International Telecommunication Union, ITU-T Recommendation H. 264 Version 1: Advanced Video Coding for Generic Audio-Visual Services, International Telecommunication Union (ITU-T), Geneva, Switzerland, 2003.
- [10] T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 13, no. 7, pp. 560–576, 2003.
- [11] A. Tumeo, M. Monchiero, G. Palermo, F. Ferrandi, and D. Sciuto, "A pipelined fast 2D-DCT accelerator for FPGA-based SoCs," in *Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI '07)*, pp. 331–336, Porto Alegre, Brazil, March 2007.
- [12] R. J. Cintra and F. M. Bayer, "A DCT approximation for image compression," *IEEE Signal Processing Letters*, vol. 18, no. 10, pp. 579–582, 2011.
- [13] N. Brahimi and S. Bouguezel, "An efficient fast integer DCT transform for images compression with 16 additions only," in *Proceedings of the International Workshop on Systems, Signal Processing and their Applications*, pp. 71–74, Tipaza, Algeria, May 2011.
- [14] K. Lengwehasatit and A. Ortega, "Scalable variable complexity approximate forward DCT," *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 14, no. 11, pp. 1236–1248, 2004.
- [15] K. Bernstein, R. K. Cavin, W. Porod, A. Seabaugh, and J. Welser, "Device and architecture outlook for beyond CMOS switches," *Proceedings of the IEEE*, vol. 98, no. 12, pp. 2169–2184, 2010.
- [16] D. Rairigh, Limits of Cmos Technology Scaling and Technologies Beyond-Cmos, IEEE, Piscataway, NJ, USA, 2006.
- [17] G. L. Snider, A. O. Orlov, I. Amlani et al., "Quantum-dot cellular automata: review and recent experiments (invited)," *Journal of Applied Physics*, vol. 85, no. 8, pp. 4283–4285, 1999.
- [18] C. S. Lent, P. D. Tougaw, W. Porod, and G. H. Bernstein, "Quantum cellular automata," *Nanotechnology*, vol. 4, no. 1, pp. 49–57, 1993.
- [19] K. Walus, A. Vetteth, G. Jullien, and V. Dimitrov, "RAM design using quantum-dot cellular automata," in *Proceedings of the Technical Proceedings of the 2003 Nanotechnology Conference and Trade Show*, vol. 2, pp. 160–163, Cambridge, MA, USA, February 2003.
- [20] G. Ismail, T. Lamjed, and O. Bouraoui, "Design of efficient quantum-dot cellular automata (QCA) multiply accumulate (MAC) unit with power dissipation analysis," *IET Circuits, Devices & Systems*, vol. 13, no. 4, pp. 534–543, 2019.
- [21] L. Touil, I. Gassoumi, R. Laajimi, and B. Ouni, "Efficient design of BinDCT in quantum-dot cellular automata (QCA) technology," *IET Image Processing*, vol. 12, no. 6, pp. 1020–1030, 2018.
- [22] D. Bikash, C. D. Jadav, and D. Debashis, "Reversible logic-based image steganography using quantum dot cellular automata for secure nanocommunication," *IET Circuits, Devices & Systems*, vol. 11, no. 1, pp. 1–10, 2017.
- [23] O. Liolis, V. S. Kalogeiton, D. P. Papadopoulos, G. C. Sirakoulis, V. Mardiris, and A. Gasteratos, "Morphological edge detector implemented in quantum cellular automata," in *Proceedings of the 2013 IEEE International Conference on Imaging Systems and Techniques (IST)*, pp. 406–409, Beijing, China, October 2013.
- [24] B. Sen, A. S. Anand, T. Adak, and B. K. Sikdar, "Thresholding using quantum-dot cellular automata," in *Proceedings of the 2011 International Conference on Innovations in Information Technology*, pp. 356–360, IIT, Abu Dhabi, UAE, April 2011.
- [25] P. Z. Qadir, S. J. Ahmad, and M. A. Peer, "Quantum-dot cellular automata: theory and application," in *Proceedings of the 2013 International Conference on Machine Intelligence Research and Advancement*, pp. 540–544, Katra, India, December 2013.
- [26] V. Mardiris and V. Chatzis, Image Processing Algorithms Implementation Using Quantum Cellular Automata, Springer International Publishing, Berlin, Germany, 2014.
- [27] M. N. Haggag, M. El-Sharkawy, and G. Fahmy, "Efficient fast multiplication-free integer transformation for the 2-D DCT H.265 standard," in *Proceedings of the 2010 IEEE International Conference on Image Processing*, pp. 3769–3772, Hong Kong, China, September 2010.
- [28] F. M. Bayer, U. S. Potluri, A. Madanayake, and R. J. Cintra, "Multiplierless approximate 4-point DCT VLSI architectures for transform block coding," *Electronics Letters*, vol. 49, no. 24, pp. 1532–1534, 2013.
- [29] K. A. Wahid, M. Martuza, M. Das, and C. McCrosky, "Efficient hardware implementation of 8×8 integer cosine transforms for multiple video codecs," *Journal of Real-Time Image Processing*, vol. 8, no. 4, pp. 403–410, 2013.
- [30] P. K. Meher, S. Y. Park, B. K. Mohanty, K. S. Lim, and C. Yeo, "Efficient integer DCT architectures for HEVC," *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 24, no. 1, pp. 168–178, 2014.
- [31] F. M. Bayer, R. J. Cintra, A. Edirisuriya, and A. Madanayake, "A digital hardware fast algorithm and FPGA-based prototype for a novel 16-point approximate DCT for image compression applications," *Measurement Science and Technology*, vol. 23, no. 11, Article ID 114010, 2012.
- [32] D. Vaithiyanathan and R. Seshasayanan, "Low power DCT architecture for image compression," in *Proceeding of the International Conference on Advanced Computing and Communication Systems (ICACCS)*, pp. 1–6, Coimbatore, Tamil Nadu, India, December 2013.
- [33] R. K. Senapati, U. C. Pati, and K. K. Mahapatra, "A low complexity orthogonal 8 × 8 transform matrix for fast image compression," in *Proceedings of the Annual IEEE India Conference*, pp. 1–4, Kolkata, India, December 2010.
- [34] V. Dhandapani and S. Ramachandran, "Area and power efficient DCT architecture for image compression," *EURASIP Journal on Advances in Signal Processing*, vol. 2014, no. 1, 2014.
- [35] M. Jridi, A. Alfalou, and P. K. Meher, "A generalized algorithm and reconfigurable architecture for efficient and scalable orthogonal approximation of DCT," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 2, pp. 449–457, 2015.
- [36] S. Bouguezel, M. O. Ahmad, and M. N. S. Swamy, "Binary discrete cosine and Hartley transforms," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 4, pp. 989–1002, 2013.

- [37] C. S. Lent and G. L. Snider, "The development of quantum-dot cellular automata," in *Field-Coupled Nanocomputing: Paradigms, Progress, and Perspectives*, N. G. Anderson and S. Bhanja, Eds., pp. 3–20, Springer Berlin Heidelberg, Berlin, Germany, 2014.
- [38] P. D. Tougaw and C. S. Lent, "Logical devices implemented using quantum cellular automata," *Journal of Applied Physics*, vol. 75, no. 3, pp. 1818–1825, 1994.
- [39] C. S. Lent and B. Isaksen, "Clocked molecular quantum-dot cellular automata," *IEEE Transactions on Electron Devices*, vol. 50, no. 9, pp. 1890–1896, 2003.
- [40] K. Walus and G. A. Jullien, "Design tools for an emerging SoC technology: quantum-dot cellular automata," *Proceedings of the IEEE*, vol. 94, no. 6, pp. 1225–1244, 2006.
- [41] C. S. Lent, M. Liu, and Y. Lu, "Bennett clocking of quantum-dot cellular automata and the limits to binary logic scaling," *Nanotechnology*, vol. 17, no. 16, pp. 4240–4251, 2006.
- [42] L. Lu, W. Liu, M. O'Neill, and E. E. Swartzlander, "QCA systolic array design," *IEEE Transactions on Computers*, vol. 62, no. 3, pp. 548–560, 2013.
- [43] H. Cho and E. E. Swartzlander, "Adder designs and analyses for quantum-dot cellular automata," *IEEE Transactions on Nanotechnology*, vol. 6, no. 3, pp. 374–383, 2007.
- [44] R. Zhang, K. Walus, W. Wang, and G. A. Jullien, "Performance comparison of quantum-dot cellular automata adders," in *Proceedings of the 2005 IEEE International Symposium on Circuits and Systems*, pp. 2522–2526, Kobe, Japan, May 2005.
- [45] H. Cho and E. E. Swartzlander, "Adder and multiplier design in quantum-dot cellular automata," *IEEE Transactions on Computers*, vol. 58, no. 6, pp. 721–727, 2009.
- [46] K. Kim, K. Wu, and R. Karri, "The robust QCA adder designs using composable QCA building blocks," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 26, no. 1, pp. 176–183, 2007.
- [47] I. Hänninen and J. Takala, "Binary adders on quantum-dot cellular automata," *Journal of Signal Processing Systems*, vol. 58, no. 1, pp. 87–103, 2010.
- [48] D. Abedi, G. Jaberipur, and M. Sangsefidi, "Coplanar full adder in quantum-dot cellular automata via clock-zone-based crossover," *IEEE Transactions on Nanotechnology*, vol. 14, no. 3, pp. 497–504, 2015.
- [49] T. N. Sasamal, A. K. Singh, and A. Mohan, "An optimal design of full adder based on 5-input majority gate in coplanar quantum-dot cellular automata," *Optik*, vol. 127, no. 20, pp. 8576–8591, 2016.
- [50] G. Singh, B. Raj, and R. K. Sarin, "Design and performance analysis of a new efficient coplanar quantum-dot cellular automata adder," *Indian Journal of Pure & Applied Physics*, vol. 55, pp. 97–103, 2017.
- [51] G. Singh, R. K. Sarin, and B. Raj, "A novel robust exclusive-or function implementation in QCA nanotechnology with energy dissipation analysis," *Journal of Computational Electronics*, vol. 15, no. 2, pp. 455–465, 2016.
- [52] K. Walus, T. J. Dysart, G. A. Jullien, and R. A. Budiman, "QCADesigner: a rapid design and simulation tool for quantum-dot cellular automata," *IEEE Transactions on Nanotechnology*, vol. 3, no. 1, pp. 26–31, 2004.
- [53] N. Kandasamy, F. Ahmad, and N. Telagam, "Shannon logic based novel QCA full adder design with energy dissipation analysis," *International Journal of Theoretical Physics*, vol. 57, no. 12, pp. 3702–3715, 2018.
- [54] V. Pudi and K. Sridharan, "Low complexity design of ripple carry and Brent-Kung adders in QCA," *IEEE Transactions on Nanotechnology*, vol. 11, no. 1, pp. 105–119, 2012.

- [55] M. Mohammadi, M. Mohammadi, and S. Gorgin, "An efficient design of full adder in quantum-dot cellular automata (QCA) technology," *Microelectronics Journal*, vol. 50, pp. 35–43, 2016.
- [56] A. Vetteth, K. Walus, V. S. Dimitrov, and G. A. Jullien, *Quantum-Dot Cellular Automata of Flip-Flops*, ATIPS Laboratory, 2500 University Drive, Calgary, Canada, 2003.
- [57] S. Hashemi and K. Navi, "New robust QCA D flip flop and memory structures," *Microelectronics Journal*, vol. 43, no. 12, pp. 929–940, 2012.
- [58] A. Rezaei and H. Saharkhiz, "Design of low power random number generators for quantum-dot cellular automata," *International Journal of Nano Dimension*, vol. 7, no. 4, pp. 308–320, 2016.
- [59] S. Srivastava, "QCAPro—an error-power estimation tool for QCA circuit design," in *Proceedings of the International Symposium of Circuits and Systems*, pp. 2377–2380, Rio de Janeiro, Brazil, May 2011.
- [60] P. K. Meher, S. Y. Park, B. K. Mohanty, K. S. Lim, and C. Yeo, "Efficient integer DCT architectures for HEVC," *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 24, no. 1, pp. 168–178, 2014.
- [61] C.-Y. Li, Y.-H. Chen, T.-Y. Chang, and J.-N. Chen, "A probabilistic estimation bias circuit for fixed-width Booth multiplier and its DCT applications," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 58, no. 4, pp. 215–219, 2011.






























