# Power Efficient, High Performance SRAM array in 90nm CMOS Process

Ajay Kumar Singh<sup>\*</sup>, Mah Meng Seong

Faculty of Engineering & Technology, Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia

\*aks\_1993@yahoo.co.uk

#### Abstract:

Memory arrays are an essential building block in any digital system. This paper presents the implementation of an SRAM array that avoids half-selected column disturbance when the cell has a separate write signal (data-aware 9T cell). Arrays of different sizes are simulated for power, delay and process variation, with and without peripheral circuits, and the results are compared with those of a conventional 6T cell array. The proposed array consumes less power than the 6T array during read, write and hold modes. The power reduction is due to the suppression of bit-line discharging during the write operation, control of the leakage current through proper array implementation, and a lower voltage drop on the read bit-line. The write delay is improved by the separate write signal. The read delay is larger than that of the 6T array; it can be reduced by independently optimizing the read path or by using a read/write multiplexer at the local bit-line due to signal HD in the array. During hold mode, a maximum power saving of 43% is achieved compared to the 6T array. The proposed array implementation also shows less variation with threshold voltage.

[Keywords: SRAM cell, Data aware, Memory array, Power consumption, Leakage current]

## 1. Introduction

Embedded memory is non-stand-alone memory integrated on-chip to accomplish intended functions. High-performance, low-power embedded memory is a key component in a System-on-Chip (SoC) because of its high speed, wide bus-width capability and reliability [1-3]. Since memory has become an important and essential IP requirement in the modern SoC era, several new on-chip memory implementations have been reported in the literature to improve the chip's area and energy efficiency [4-8]. Embedded SRAM is widely used due to its high speed and compatibility with standard processes.

As technology moves into the deep sub-micron region, leakage current in the SRAM cell becomes a dominant contributor to on-chip power consumption. Leakage power consumption also restricts the use of the SRAM cell in larger memory arrays. To control the leakage current, many new SRAM cell architectures have been proposed [9-15]. Besides the cell leakage, bit-line leakage is another dominant factor in power consumption, and the overall bit-line power consumption is data dependent. Many data-aware cells have been reported in the literature to control the bit-line power consumption. Yen-Jen Chang et al. [16] proposed a cell that reduces the power consumption during the write-0 operation. A differential data-aware power-supplied (D<sup>2</sup>AP) 8T SRAM cell was proposed by Meng-Fan Chang et al. [17]. Yi-Wei Chiu et al. [18] proposed an 8T single-ended cell with a cross-point data-aware write operation. This cell reduces bit-line power consumption, but because the write is single-ended, the write-1 operation is degraded. Meng-Fan Chang et al. [19] further proposed a 130 mV SRAM that performs the write operation based on the data to be written to the cell. The main drawback of this design is that it uses two control signals to perform the write operation, which imposes an extra hardware burden. Recently, Ming-Hsien Tu et al. [20] proposed a single-ended disturb-free 9T SRAM cell. This cell reduces power consumption, but again the write-1 operation is degraded. We have proposed [21] a data-aware 9T cell that reduces the power consumption during write, read and hold modes. The proposed cell enhances the write ability and gives a faster response.

In this paper, a new approach is adopted to avoid half-selected column disturbance in a data-aware cell in which the write operation is performed by a separate write signal (WS) instead of WL. Arrays of different sizes are implemented with and without peripheral circuits to study power consumption, delay and process variation. The proposed array consumes less power because of the power-efficient cell and the reduction in leakage current achieved by the array implementation. The write access delay improves because of the separate write signal and its effective implementation. The degraded read delay can be improved by individually optimizing the read path in the array. The comparison is made with a 6T array.

The remainder of the paper is organized as follows. Section 2 discusses the implementation of the array using the 9T cell. Section 3 gives the simulation results of the arrays in terms of power, delay, leakage current and process variation. Finally, Section 4 concludes the paper.

## 2. Architecture of the SRAM Array

The general architecture of the array is shown in Fig. 1. The array is built from our earlier data-aware 9T cell [21]. Because the cell uses a separate signal, WS, for the write operation, we adopt a column-based approach to avoid the half-selected column problem: the write signal (WS) is routed parallel to BL. Since each column has its own write signal, unselected columns are not disturbed during a write operation, and the separate read port avoids unnecessary reading. In the read operation WS does not toggle, so half-select disturbance does not arise in the array. The separate read and write ports of the cell avoid the need for a read/write multiplexer at the local bit-line level and simplify the local evaluation circuit. The read word-line (RWL) driver can be significantly reduced in size because of the single-ended cell read stack. Each block in the array is controlled by a local WS, which is connected to the global WS signal through tristate WS buffers or pass transistors. Because of the local WS, local bit-line control logic is not required. The global WS generator is placed in the column multiplexer block and can be used to control the write data. Its inputs are the write data and the column select signal, as shown in Fig. 1. For larger arrays, the divided bit-line and word-line approach can be used. Read and write paths can be independently optimized by sharing the read and write bit-lines across different numbers of bits.
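The effect of routing WS along the column can be illustrated with a small behavioural model. The sketch below is not the circuit of Fig. 1: it assumes, purely for illustration, that a cell receives the full write stimulus only when both its word line and its WS are asserted, and it compares a hypothetical row-routed WS with the column-routed WS described above. The function name `half_selected_cells` and the 4 x 4 array size are illustrative choices.

```python
from typing import List, Tuple


def half_selected_cells(rows: int, cols: int, sel_row: int, sel_col: int,
                        ws_per_column: bool) -> List[Tuple[int, int]]:
    """List the non-target cells that see both WL and WS asserted.

    Assumption (not taken from the paper): a cell is exposed to a write
    only when its word line and its write signal are both high.
    """
    exposed = []
    for r in range(rows):
        for c in range(cols):
            wl = (r == sel_row)                      # word line follows the row
            if ws_per_column:
                ws = (c == sel_col)                  # WS routed parallel to BL
            else:
                ws = (r == sel_row)                  # hypothetical WS routed along the row
            if wl and ws and (r, c) != (sel_row, sel_col):
                exposed.append((r, c))
    return exposed


row_based = half_selected_cells(4, 4, sel_row=1, sel_col=2, ws_per_column=False)
col_based = half_selected_cells(4, 4, sel_row=1, sel_col=2, ws_per_column=True)
print(len(row_based), len(col_based))  # prints: 3 0
```

Under these assumptions the row-routed scheme exposes the three unselected cells on the active row, while the column-routed scheme exposes none.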

## 3. Simulation Results and Discussion

The layouts of the data-aware 9T array and the 6T array, with and without peripheral circuits, were drawn using the Microwind 3 VLSI CAD tool and simulated using BSIM4 models for 90-nm CMOS technology. Fig. 2 shows the layout of the data-aware 9T and 6T arrays with peripheral devices. The simulated overall power consumption during write, read and hold modes is given in Table 3(a) with peripheral devices and in Table 3(b) without them. During the write operation, the power saving of the 9T data-aware array relative to the 6T array is 19% for the $2^4 \times 1$ array and 2% for the $2^7 \times 1$ array in the presence of peripheral devices, whereas without peripheral devices the saving is 35.5% for the $2^4 \times 1$ array and 10.4% for the $2^7 \times 1$ array. The saving arises because the bit-lines of the selected cell are prevented from discharging in the 9T data-aware array. During the read operation in the presence of peripheral devices, the maximum power saving of 9.7% is achieved for the $2^4 \times 1$ array. As the array size increases, the power saving falls: for the $2^7 \times 1$ array, the saving over the 6T array is only 1.7% because of the larger parasitic capacitance. The power saving varies from 42.45% to 61.3% when no peripheral devices are included in the array during hold mode, because of the lower leakage current from the write bit-lines and the lower discharging activity at RBL. The lower leakage current in the array shows the effectiveness of the proposed array implementation technique. The power consumption in the array can be reduced further by shortening the ON period of the write signal WS.
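The write and read savings quoted for the arrays with peripheral devices follow directly from the tabulated powers. As a minimal sketch, the snippet below recomputes them from the write and read columns of Table 3(a), taking the saving as (P_6T − P_9T) / P_6T; the dictionary layout and the helper name `saving_percent` are illustrative, not part of the original work.

```python
# Write and read power (µW) from Table 3(a): arrays with peripheral devices.
TABLE_3A = {
    "2^4x1": {"6T": {"write": 4.38, "read": 11.34},
              "9T": {"write": 3.55, "read": 10.24}},
    "2^6x1": {"6T": {"write": 7.77, "read": 28.18},
              "9T": {"write": 7.40, "read": 27.65}},
    "2^7x1": {"6T": {"write": 14.66, "read": 39.64},
              "9T": {"write": 14.37, "read": 38.97}},
}


def saving_percent(p_6t: float, p_9t: float) -> float:
    """Relative power saving of the 9T array with respect to the 6T array."""
    return 100.0 * (p_6t - p_9t) / p_6t


for size, power in TABLE_3A.items():
    for mode in ("write", "read"):
        saving = saving_percent(power["6T"][mode], power["9T"][mode])
        print(f"{size} {mode}: {saving:.1f}% saving")
```

For the $2^4 \times 1$ and $2^7 \times 1$ arrays, the computed values round to the 19% and 2% (write) and 9.7% and 1.7% (read) figures stated in the text.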



Fig. 1: Architecture of the SRAM array



Fig. 2: Layout of the arrays

Table 3(a): Overall power consumption (µW) for different array sizes including peripheral devices (at V<sub>th</sub> = 0.25 V)

| Array size     | 6T Write | 6T Read | 6T Hold | Data aware 9T Write | Data aware 9T Read | Data aware 9T Hold |
|----------------|----------|---------|---------|---------------------|--------------------|--------------------|
| $2^4 \times 1$ | 4.38     | 11.34   | 2.62    | 3.55                | 10.24              | 2.07               |
| $2^6 \times 1$ | 7.77     | 28.18   | 5.92    | 7.40                | 27.65              | 4.38               |
| $2^7 \times 1$ | 14.66    | 39.64   | 8.93    | 14.37               | 38.97              | 6.06               |

Table 3(b): Cell array power consumption (µW) for different array sizes without peripheral devices (at V<sub>th</sub> = 0.25 V)

| Array size | 6T Write | 6T Read | 6T Hold | Data aware 9T Write | Data aware 9T Read | Data aware 9T Hold |
|------------|----------|---------|---------|---------------------|--------------------|--------------------|
| 16-bit     | 0.866    | 4.323   | 1.041   | 0.365               | 2.488              | 0.894              |
| 64-bit     | 1.169    | 12.917  | 2.501   | 0.892               | 4.996              | 2.277              |
| 128-bit    | 1.917    | 21.790  | 3.544   | 1.717               | 10.297             | 3.244              |

Table 4 gives the power consumption of the 6T and 9T arrays during write and read operations for different values of threshold voltage. During the write operation, the power consumption of the 9T array falls by about 50% as the threshold voltage increases from 0.21 V to 0.27 V, compared to a 33.25% reduction in the 6T array.
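The two reductions can be checked from the write columns of Table 4. The sketch below is an illustrative calculation only: it takes the reduction as the relative drop in write power between the lowest and the highest tabulated threshold voltage, and the names `WRITE_POWER` and `reduction_percent` are hypothetical.

```python
# Write power (µW) from Table 4 at V_th = 0.21, 0.23, 0.25 and 0.27 V.
VTH = [0.21, 0.23, 0.25, 0.27]
WRITE_POWER = {
    "6T": [22.756, 17.157, 14.685, 15.190],
    "9T": [24.744, 18.114, 14.370, 12.264],
}


def reduction_percent(values: list) -> float:
    """Relative drop from the first to the last entry of a power sweep."""
    return 100.0 * (values[0] - values[-1]) / values[0]


for array, powers in WRITE_POWER.items():
    print(f"{array}: {reduction_percent(powers):.2f}% lower write power "
          f"at V_th = {VTH[-1]} V than at {VTH[0]} V")
```

The 6T sweep gives the quoted 33.25%, and the 9T sweep gives a value just above 50%.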

Table 4: Power consumption (µW) for different threshold voltages

| V<sub>th</sub> (V) | Write, 6T array | Write, 9T array | Read, 6T array | Read, 9T array |
|--------------------|-----------------|-----------------|----------------|----------------|
| 0.21               | 22.756          | 24.744          | 52.236         | 42.647         |
| 0.23               | 17.157          | 18.114          | 44.125         | 35.174         |
| 0.25               | 14.685          | 14.370          | 39.643         | 31.174         |
| 0.27               | 15.190          | 12.264          | 37.205         | 28.404         |

Fig. 3 shows the variation of the write delay in the array with threshold voltage. In the 9T data-aware array, the write delay changes only slightly with threshold voltage because the write is governed mainly by the write signal WS rather than WL, and because WS is properly distributed in the array. As the threshold voltage increases from 0.21 V to 0.27 V, the delay increases by only 40%, compared to 76% in the 6T array.



Fig.3: Write delay variation with threshold voltage

The read delay is slightly larger than that of the 6T array, as seen in Table 5, because of the extra wiring capacitance needed to connect the storage node to the read pull-down transistor. The lower write access delay in the array is due to the suppressed discharging activity at the write bit-line.

| Array               | $2^4 \times 1$ (ps) | $2^6 \times 1$ (ps) |
|---------------------|---------------------|---------------------|
| 9T data aware array | 14                  | 26                  |
| 6T array            | 12                  | 22                  |

Table 5: Read access delay for SRAM array.

The write operation in the data-aware 9T array can be performed successfully with a WL signal of $T_{ON}=0.01$ ns, compared to $T_{ON}=0.03$ ns in the 6T array, as given in Table 6. This better write ability is due to the dominant role played by the write signal WS instead of WL. Another interesting observation is that the active power consumption falls as the ON time of the WL signal is reduced. The reduction is about 19.15% in the data-aware 9T array compared to 0.32% in the 6T array.
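Both observations can be read off the Table 6 data. In the sketch below, which is an illustrative calculation rather than part of the simulation flow, entries for which no write power is reported are stored as `None` and treated as unsuccessful writes; the helper names are hypothetical.

```python
# Write power (µW) versus WL ON time (ns) from Table 6; None = no value reported.
T_ON = [1.0, 0.5, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01]
P_6T = [4.383, 4.379, 4.374, 4.372, 4.371, 4.369, None, None]
P_9T = [3.905, 3.815, 3.609, 3.577, 3.552, 3.510, 3.378, 3.157]


def shortest_working_ton(t_on: list, power: list) -> float:
    """Smallest ON time for which a write power is reported."""
    return min(t for t, p in zip(t_on, power) if p is not None)


def power_reduction_percent(power: list) -> float:
    """Relative drop from the longest ON time to the shortest working one."""
    reported = [p for p in power if p is not None]
    return 100.0 * (reported[0] - reported[-1]) / reported[0]


for name, power in (("6T", P_6T), ("9T", P_9T)):
    print(f"{name}: shortest T_ON = {shortest_working_ton(T_ON, power)} ns, "
          f"power reduction = {power_reduction_percent(power):.2f}%")
```

The result matches the 0.03 ns and 0.01 ns limits and the 0.32% and 19.15% reductions quoted above.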

Table 6: Write power consumption (µW) for different ON times of signal WL

| T<sub>ON</sub> (WL) (ns) | 6T array | Data aware 9T array |
|--------------------------|----------|---------------------|
| 1.0                      | 4.383    | 3.905               |
| 0.5                      | 4.379    | 3.815               |
| 0.1                      | 4.374    | 3.609               |
| 0.05                     | 4.372    | 3.577               |
| 0.04                     | 4.371    | 3.552               |
| 0.03                     | 4.369    | 3.510               |
| 0.02                     | -        | 3.378               |
| 0.01                     | -        | 3.157               |

Fig. 4 compares the gate tunneling current in the 6T SRAM array and the data-aware 9T SRAM array at different gate oxide thicknesses during hold mode. As the gate oxide thickness is reduced, the gate tunneling current increases in both arrays. The gate tunneling current in the data-aware array is approximately 40% lower than in the 6T array, which results in lower leakage power consumption. This is again due to the proper implementation of the array.



Ajay Kumar Singh, IJECS Volume 2 Issue 9 September, 2013 Page No. 2848-2855

Fig. 5 shows the power consumption of the arrays during hold mode at different temperatures. The power consumption of the 9T array increases less with temperature than that of the 6T array because of its lower leakage current.



Fig.5: Hold power consumption at different temperature

Table 7 gives the simulated delay of the 6T and 9T data-aware arrays for different process corners. For the fast-PMOS, slow-NMOS (FPSN) corner, no write operation is performed in either the 6T or the 9T array because of degraded writeability. The 9T data-aware array is also unable to write the selected cell for the slow-PMOS, slow-NMOS corner. For the SPFN and FPFN corners, the write operation in the selected cell is faster than in the 6T array. The read delay is also reduced drastically in the 9T array for the same combination of transistors.

| Process corner | Write delay (ps), 9T data aware | Write delay (ps), 6T array | Read delay (ps), 9T data aware | Read delay (ps), 6T array |
|----------------|---------------------------------|----------------------------|--------------------------------|---------------------------|
| Standard MOS   | 22                              | 22                         | 17                             | 10                        |
|                | 23                              | 23                         | 17                             | 19                        |
| S-PMOS, S-NMOS | -                               | 29                         | 23                             | 47                        |
| F-PMOS, S-NMOS | -                               | -                          | 21                             | 48                        |
| S-PMOS, F-NMOS | 7                               | 15                         | 14                             | 9                         |
| F-PMOS, F-NMOS | 9                               | 17                         | 15                             | 12                        |

Table 7: Delay for various process corners

## 4. Conclusion

A new array implementation is proposed to avoid the half-selected column problem when the array uses a cell in which the write operation is performed by a separate signal. The array consumes less power than the 6T array for read and write operations because of its lower leakage current and proper array implementation. The gate tunneling leakage current is approximately 40% lower than in the 6T array. The degraded read delay in the 9T array can be improved by optimizing the read path.

### **References:**

- 1. Lawrence T. Clark, Eric J. Hoffman, Jay Miller, Manish Biyani, Yuyun Liao, An Embedded 32b Microprocessor Core for Low-Power and High-Performance Applications, IEEE J. Solid-State Circuits, 36, [2001], 1599-1607.
- 2. Kangmin Lee, Se-Joong Lee and Hoi-Jun Yoo, Low-Power Network-on-Chip for High-Performance SoC Design, IEEE Trans. On Very Large Scale Integration (VLSI) Systems, 14, [2006], 148-160.
- 3. Zil Shoo, Meng Wong, Compiler-assisted high performance and low power optimizations for embedded systems, 2010 Doctoral Dissertation, Hong Kong Polytechnic University, ISBN: 978-1-124-11859-8.
- 4. Chih-Chi Cheng, Chao-Tsung Huang, Ching-Yeh Chen, Chung-Jr Lian, and Liang-Gee Chen, On-Chip Memory Optimization Scheme for VLSI Implementation of Line-Based Two-Dimensional Discrete Wavelet Transform, IEEE Trans. On Circuits and Systems for Video Technology, 17, [2007], 814-822.
- 5. Cai Xia Liu, Zhi Bin Zhang, Feng Qi Wei and Xiao Deong Xu, Design and Implement of Sharable Multi-Channel On-Chip Memory for Embedded CMP System, Advanced Materials Research, 217-218, [2011], 1147-1152.
- 6. Xiangyu Dong, Yuan Xie, Naveen Muralimanohar, and Norman P. Jouppi, Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support, SC10 [Nov. 2010], New Orleans, Louisiana, USA, 978-1-4244-7558-2.
- 7. T.S. Rajesh Kumar, R. Govindaranjan and C.P. RaviKumar, On-Chip memory architecture framework for DSP processor-based embedded system on chip, ACM Trans. On Embedded Computing Systems (TECS), 11, [2012], 5.1-5.25.
- 8. Chih-Hsun Chou, Fong Pong and Nian Feng Tzeng, Speedy FPGA-based packet classifiers with low on-chip memory requirements, Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA'12), New York, NY USA, 11-20, ISBN: 978-1-4503-1157-7.
- 9. Z. Liu and V. Kursun, Characterization of novel nine-transistor SRAM cell, IEEE Trans. Very Large Scale Integration (VLSI) Systems, 16, [2008], 488-492.
- 10. Sheng Lin, Yong Bin Kim and Fabrizio Lombardi, A Low Leakage 9T SRAM Cell for Ultra-Low Power Operation, Proceedings of GLSVLSI'08, [2008], Orlando, Florida USA.
- 11. M. H. Tu, J. Y. Lin, M. C. Tsai, S. J. Jou, C. T. Chuang, Single-Ended Subthreshold SRAM with Asymmetrical Write/Read-Assist, IEEE Transactions on CAS-I, 57, [2010], 3039-3047.
- 12. Adam Teman, Lidor Pergament, Omar Cohen and Alexander Fish, A 250mV 8Kb 40nm Ultra-Low Power 9T Supply Feedback SRAM (SF-SRAM), IEEE J. Solid-State Circuits, 46, [2011], 2713-2725.
- 13. Cheng-Hung Lo and Shi-Yu, P-P-N based 10T SRAM cell for low-leakage and resilient subthreshold operation, IEEE J. Solid-State Circuits, 46, [2011], 695-704.
- 14. C.M.R. Prabhu and Ajay Kumar Singh, Novel Eight-Transistor SRAM cell for write power reduction, IEICE Electronics Express (ELEX), 7, [2010], 1175-1181.
- 15. C.M.R. Prabhu and Ajay Kumar Singh, Low-Power Fast (LPF) SRAM cell for write/read operation, IEICE Electronics Express, 8, [2011], 1473-1478.
- 16. Yen-Jen Chang, Feipei Lai, Chia-Lin Yang, Zero-Aware asymmetric SRAM cell for reducing cache power in writing 0, IEEE Trans. On Very Large Scale Integration (VLSI) Systems, 12, [2004], 827-836.
- 17. Meng-Fan Chang, Ju-Jen Wu, Kuang-Ting Chen, Yung-Chi Chen, Yen-Hui Chen, Robin Lee, Hung-Jen Liao and Hiroyuki Yamauchi, A differential data-aware power-supplied (D<sup>2</sup>AP) 8T SRAM cell with expanded write/read stabilities for lower VDDmin applications, IEEE J. Solid-State Circuits, 45, [2010], 1234-1245.
- 18. Yi-Wei Chiu, Jihi-Yu Lin, Ming-Hsien Tu, Shyh-Jye Jou and Ching-Te Chuang, 8T Single-ended sub-threshold SRAM with cross-point data-aware write operation, 2011 International Symposium on Low Power Electronics and Design (ISLPED), [1-3 August 2011], 169-174.
- 19. Meng-Fan Chang, Shi-Wei Chang, Po-Wei Chou and Wei-Cheng Wu, A 130mV SRAM with expanded write and read margins for subthreshold applications, IEEE J. Solid-State Circuits, 46, [2011], 520-529.
- 20. Ming-Hsien Tu, Jihi-Yu Lin, Ming-Chien Tsai, Chien-Yu Lu, Yuh-Jiun Lin, Meng-Hsueh Wang, Huan-Shun Huang, Kuen-Di Lee, Wei-Chiang (Willis) Shih, Shyh-Jye Jou and Ching-Te Chuang, A single-ended disturb-free 9T subthreshold SRAM with cross-point data-aware write word-line structure, negative bit-line and adaptive read operation timing tracing, IEEE Trans. Of solid-state circuits, 47, [2012], 1469-1482.
- 21. Ajay Kumar Singh, Mah Meng Seong, C.M.R Prabhu, A Data Aware (DA) 9T SRAM cell for Low Power Consumption and Improved Stability, Under consideration in International Journal of Circuit Theory and Applications (Wiley Publication) published online 2012