PAPER

# A 5-bit 4.2-GS/s Flash ADC in 0.13-μm CMOS Process

Ying-Zu LIN<sup>†a)</sup>, Nonmember, Soon-Jyh CHANG<sup>†</sup>, Member, and Yen-Ting LIU<sup>†</sup>, Nonmember

**SUMMARY** This paper analyzes the resistive averaging network and the interpolation technique to estimate the power consumption of preamplifier arrays in a flash analog-to-digital converter (ADC). By comparing the relative power consumption of various configurations, flash ADC designers can select the most power-efficient architecture once the operating speed and resolution of a flash ADC are specified. Based on the quantitative analysis, a compact 5-bit flash ADC is designed and fabricated in a 0.13-μm CMOS process. The proposed ADC consumes 180 mW from a 1.2-V supply and occupies an active area of 0.16 mm². Operating at 3.2 GS/s, the ENOB is 4.44 bit and the ERBW is 1.65 GHz. This ADC achieves FOMs of 2.59 and 2.80 pJ/conversion-step at 3.2 and 4.2 GS/s, respectively.

**key words:** flash A/D converter, high-speed data converter, interpolation, resistive averaging network

## 1. Introduction

In GS/s analog-to-digital conversion, the flash (fully parallel) analog-to-digital converter (ADC) is one of the most preferred architectures owing to its high-speed potential and low latency. The resolution of a flash ADC is usually less than 8 bits because its component count grows exponentially with the resolution. In addition, random offsets induced by component mismatches limit the accuracy of a flash ADC. Together, the large number of components and the device mismatches constrain the performance of flash ADCs. Techniques such as interpolation [1]–[3], the resistive averaging network [4]–[8] and calibration [9], [10] have been developed either to reduce the component count or to enhance the accuracy. The basic concept of interpolation is to decrease the number of amplifiers. The resistive averaging network and calibration techniques reduce the random offsets of differential amplifiers.
Generally, calibration techniques complicate the design and increase hardware overhead. Moreover, additional calibration time is required if foreground calibration schemes are employed [9], [10]. The objective of this paper is to discuss the design principles of compact high-speed ADCs; calibration techniques are therefore beyond the scope of our discussion.

Previous publications [1]–[8] explain the characteristics of interpolation and the resistive averaging network well. This paper focuses on the influence the two techniques have on power consumption. The interpolation technique reduces the number of amplifiers but increases the power consumption of the interpolated amplifiers. On the other hand, the averaging network reduces the power consumption of an averaged amplifier but requires extra dummy amplifiers to maintain boundary conditions. Using the two techniques therefore involves tradeoffs that can confuse flash ADC designers. The following sections clarify the effects of the two techniques on the power consumption of a flash ADC. Based on quantitative analysis and simulations, designers can select the most suitable technique and the most power-efficient ADC architecture once the speed and resolution are specified.

The remainder of this paper is organized as follows. Section 2 analyzes the resistive averaging network and interpolation. Section 3 introduces modifications that make the estimation of power consumption more accurate. Section 4 compares the power consumption of different configurations. Section 5 presents the architecture of the proposed ADC and the implementation of its building blocks. Measurement results are shown in Sect. 6. Finally, Sect. 7 concludes the paper.

Manuscript received August 13, 2008. Manuscript revised October 12, 2008.

<sup>†</sup>The authors are with the Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan.

a) E-mail: tibrius@sscas.ee.ncku.edu.tw

DOI: 10.1587/transele.E92.C.258

## 2. Interpolation and Resistive Averaging Network in Flash ADC

In flash ADCs, preamplifiers are often placed in front of a latch-based comparator to overcome dynamic offsets and enhance regeneration speed [5]. The number of stages and the connection style of the preamplifiers determine much of the overall performance and power consumption of a flash ADC. The small-signal voltage gain and −3-dB bandwidth of the first preamplifier in Fig. 1 can be expressed as [11]

$$|A_{\text{v},1}| = \sqrt{\mu_n C_{ox} \left(\frac{W}{L}\right)_1 I_{ss,1}} \times R_{L,1} \quad \text{and} \quad \omega_{-3\,\text{dB}} = \frac{1}{R_{L,1} C_{\text{eq}}} \tag{1}$$

where

$$C_{\text{eq}} = C_{\text{dtot},1} + C_{\text{gs},2} + (1 + |A_{\text{v},2}|) C_{\text{gd},2}.$$

**Fig. 1** The schematic of cascaded amplifiers.

Equation (1) reveals that the voltage gain is proportional to the loading resistance and to the square root of the bias current. In cascaded amplifiers, the input and output common-mode voltages of each amplifier can be set equal to avoid level-shift circuits. Reducing the loading resistance while enlarging the bias current by the same factor extends the bandwidth without changing the output common-mode voltage. Because the loading resistance has a stronger influence on the voltage gain than the bias current, the voltage gain decreases as the bandwidth increases. Consequently, more preamplifier stages are necessary to achieve the required gain when the input frequency range increases. For high-speed operation, several stages of high-bandwidth preamplifiers are usually cascaded to provide sufficient gain.

In a preamplifier, the differential components such as input transistors and loading resistors must be large enough to overcome random offsets induced by component mismatches. Among all kinds of mismatches, the threshold voltage mismatch dominates [7], [12].
The standard deviation of the offset voltage induced by the threshold voltage mismatch is

$$\sigma V_{\rm VTH} = \frac{A_{\rm VTH}}{\sqrt{WL}} \tag{2}$$

where $A_{\rm VTH}$ is a process-dependent parameter [11], [12]. According to (1), the bandwidth of a preamplifier is inversely proportional to its output capacitive loading, which is roughly proportional to the transistor size. If the outputs of a preamplifier are connected to the inputs of another preamplifier, the input-referred offset caused by the succeeding stage is reduced by the gain of the preceding one. If each differential pair is designed to contribute the same input-referred offset voltage, the device sizes of the succeeding preamplifiers can be scaled down stage by stage to reduce capacitive loading. As a result, the overall power consumption becomes smaller.

For example, Fig. 2 illustrates the scaling of cascaded preamplifiers, where $\sigma(V_{\rm os,1-3})_{\rm \Delta VTH}$ are the standard deviations of the input-referred threshold voltage offsets of the preamplifiers. Assume the voltage gain of each amplifier is 2 and the transistor channel length is fixed. Because the 1st-stage gain halves the input-referred offset of the 2nd stage, the 2nd stage can tolerate twice the offset; by (2), its gate area, and hence the width of its input transistors, is scaled down by a factor of 4. According to (1), when the loading capacitance becomes smaller, enlarging the loading resistance keeps the bandwidth of each stage the same. Reducing the bias current by the same factor then keeps the output common-mode voltage unchanged. If the power consumption of the 1st-stage preamplifier is $p_0$, those of the 2nd and 3rd stages are $p_0/4$ and $p_0/16$, respectively. Under this scaling scenario and neglecting parasitic effects, the total power consumption is reduced while the gain, bandwidth and accuracy of the three preamplifiers are kept the same.
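The scaling rule above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes the ideal scaling described in the text (equal offset contribution per stage, parasitics ignored), and the function name `stage_powers` is hypothetical rather than anything defined in the paper.

```python
def stage_powers(gains, p0=1.0):
    """Relative power of each cascaded preamplifier stage.

    Assumes ideal scaling: every stage contributes the same
    input-referred offset, so a stage can shrink by the squared gain
    of the stage in front of it (parasitic effects are ignored).
    `gains` lists the voltage gain of every stage except the last.
    """
    powers = [p0]
    for gain in gains:
        # A preceding gain A relaxes the offset budget by A, so the
        # device area (and with it the power) drops by A**2.
        powers.append(powers[-1] / gain ** 2)
    return powers


# Three stages with a gain of 2 per stage, as in the Fig. 2 example.
print(stage_powers([2.0, 2.0]))  # [1.0, 0.25, 0.0625]
```

With these numbers the three stages together consume 1.3125 $p_0$, compared with 3 $p_0$ if all stages were sized like the first one.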
### 2.1 The Case with Conventional Connection

Figure 3 shows four configurations of preamplifiers, each with three stages. The gray blocks denote preamplifiers and the white blocks represent dummies. The arrows indicate signal flows. The sizes of the blocks represent the relative device dimensions of the preamplifiers. Assume the power consumption of the 1st-stage preamplifier is $p_0$ and the gains of the 1st and 2nd stages are $A_1$ and $A_2$, respectively. Theoretically, the minimum power consumption of the 2nd-stage preamplifier is $p_0/A_1^2$ and that of the 3rd stage is $p_0/A_1^2A_2^2$. Figure 3(a) illustrates the conventional connection style of preamplifiers. The minimum power consumption of this configuration is

$$P_{\text{CON}} = np_0 \left( 1 + \frac{1}{A_1^2} + \frac{1}{A_1^2 A_2^2} \right) \tag{3}$$

where $n$ equals $2^N - 1$ for an $N$-bit ADC.

**Fig. 2** The power scaling of cascaded preamplifiers.

**Fig. 3** Configurations of preamplifier arrays: (a) Conventional configuration. (b) Averaging case. (c) Interpolation case. (d) Interpolation + averaging.

### 2.2 The Case with Resistive Averaging Network

The resistive averaging network is extensively used in flash ADCs to reduce random offsets [4]–[8]. When the output nodes of a preamplifier are connected in an averaging network, as shown in Fig. 4, its output voltages are defined not only by the preamplifier itself but also by all the amplifiers nearby. Therefore, the random offsets are averaged out, and the accuracy of the averaged preamplifier is enhanced. An averaged preamplifier with small input transistors has the same accuracy as an isolated preamplifier with large ones. From this point of view, the resistive averaging network enhances the speed of a preamplifier because a smaller transistor size means smaller capacitive loading.

**Fig. 4** An array of preamplifiers connected in the resistive averaging network.
The major penalty of this technique is the extra dummy amplifiers used to maintain boundary conditions. Without dummies, the zero crossing point of a preamplifier near the boundaries shifts inward, which generates a systematic offset. Dummies are added to compensate for the shift. The number of required dummies depends on the tolerable systematic offset. This technique therefore reduces the power consumption of each preamplifier but increases the number of preamplifiers. As depicted in Fig. 3(b), the 1st-stage and 2nd-stage preamplifiers are averaged. The total power consumption of this configuration is

$$P_{\text{AVG}} = np_0 \left( \frac{1}{k_{\text{a1}}^2} + \frac{1}{k_{\text{a2}}^2 A_{\text{a1}}^2} + \frac{1}{A_{\text{a1}}^2 A_{\text{a2}}^2} \right) + \frac{m_{\text{a1}} p_0}{k_{\text{a1}}^2} + \frac{m_{\text{a2}} p_0}{k_{\text{a2}}^2 A_{\text{a1}}^2} \tag{4}$$

where $k_{\rm a1,2}$ are the averaging efficiency factors and $m_{\rm a1,2}$ are the numbers of dummies. Note that $A_{\rm a1,2}$ represent averaged gains, which are equal to or smaller than the values without averaging [7]. The averaging efficiency mainly depends on the ratio of the averaging resistance $R_1$ to the loading resistance $R_0$ and on the number of non-saturated amplifiers [5]. Generally, the smaller the resistance ratio, the larger the efficiency factor. However, more dummies are required when the resistance ratio becomes smaller. For a net power reduction, the power consumed by the dummies cannot exceed the power saved by the averaging technique. Section 3 discusses the detailed calculation of the averaging efficiency factor.

### 2.3 The Case with Interpolation

Figure 3(c) depicts the interpolation case. The interpolation technique decreases the number of components [1]–[3]. Take a $2\times$ interpolation case as an example. If $2L+1$ zero crossing points are required, only $L+1$ amplifiers are necessary. The other $L$ points are interpolated between adjacent amplifiers.
This technique saves both chip area and routing effort. The number of preamplifiers in a $2\times$ interpolation array is roughly half of the original value. The capacitive loading of each interpolated amplifier, however, becomes larger, because each amplifier equivalently drives two succeeding ones in a $2\times$ interpolation case. Interpolation thus reduces the number of preamplifiers but increases the power consumption of each one. The total power consumption of the $2\times$ interpolation case in Fig. 3(c) can be expressed as

$$P_{\text{INTP}} = q_{\text{i1}} \left( \frac{n'+3}{4} \right) p_0 + q_{\text{i2}} \left( \frac{n'+1}{2} \right) \left( \frac{p_0}{A_1^2} \right) + n' \left( \frac{p_0}{A_1^2 A_2^2} \right) \tag{5}$$

where $q_{\rm i1,2}$ are the loading factors of the 1st-stage and 2nd-stage preamplifiers, respectively. The loading factor is defined in (12). In this case, $n'$ is $2^N + 1$ for an $N$-bit ADC. In a loading-dominant case, the loading factor is roughly equal to 2. On the other hand, it is roughly 1 when the input capacitances of the succeeding amplifiers can be ignored. Likewise, for a net power reduction, the additional power consumed by the interpolated amplifiers cannot exceed the power saved by decreasing the number of amplifiers.

### 2.4 The Case with Averaging Plus Interpolation

Figure 3(d) depicts the connection style where the resistive averaging network and interpolation are used simultaneously. In this case, the power consumption of the dummies becomes more dominant because there are fewer regular components.
The total power consumption is

$$P_{\text{HYBD}} = \left(\frac{n'+3}{4} + m_{\text{h}1}\right) \frac{q_{\text{h}1}p_0}{k_{\text{h}1}^2} + \left(\frac{n'+1}{2} + m_{\text{h}2}\right) \frac{q_{\text{h}2}p_0}{k_{\text{h}2}^2 A_{\text{h}1}^2} + \frac{n'p_0}{A_{\text{h}1}^2 A_{\text{h}2}^2} \tag{6}$$

where $k_{\rm h1,2}$ are the averaging efficiency factors, $m_{\rm h1,2}$ are the numbers of dummies and $q_{\rm h1,2}$ are the loading factors in this case. Compared to (3)–(5), there are more variables in (6), and therefore more tradeoffs in this configuration. Although three-stage cases are analyzed for explanation, these equations can be modified to estimate the power consumption of preamplifier arrays with any number of stages.

The values of all the variables in the equations can be obtained from circuit simulations. Nevertheless, the values of some variables are not available in the early design stage. The following section provides an approach for initial estimation. First, Sect. 3 discusses the derivation of the averaging related variables, namely the averaging efficiency factor and the number of dummies. Then a simple estimate of the loading factor is given for hand calculation. Furthermore, a new variable named the maximum scaling factor is introduced to bring the equations closer to practical cases.

## 3. Averaging Related Variables, Loading Factor and Maximum Scaling Factor

### 3.1 Resistive Averaging Related Variables

The analysis in [7] is used to estimate the power consumption of the averaging related cases. To determine the number of dummies and the value of the averaging efficiency factor, define $\gamma$ as the ratio between the unit reference voltage step $V_{\rm R}$ and the input voltage range $V_{\rm Sat}$ beyond which the differential pair is saturated. We have

$$\gamma = \frac{V_{\rm R}}{V_{\rm Sat}} = \frac{V_{\rm R}}{\sqrt{2}V_{\rm OV}} \tag{7}$$

where $V_{\rm OV}$ is the overdrive voltage of the input transistors.
$N_{\rm NS}$, the integer value immediately below $1/\gamma$, is the number of non-saturated differential pairs on each side when the amplifier is balanced. If $N_{\rm NS}$ dummies are added on each side, the systematic offset errors can be completely eliminated. The gain of the differential pair at position 0, as depicted in Fig. 4, can be expressed as [7]

$$A_{\rm a,h} = g_{m0} B \sum_{t=-N_{\rm NS}}^{N_{\rm NS}} C^{|t|-1} \frac{1 - (t\gamma)^2}{\sqrt{1 - \frac{(t\gamma)^2}{2}}} \tag{8}$$

where

$$g_{m0} = \frac{I_{SS}}{V_{\rm OV}}, \quad B = \frac{R_1}{2} \left( \frac{1 + 2\frac{R_0}{R_1}}{\sqrt{1 + 4\frac{R_0}{R_1}}} - 1 \right) \quad \text{and} \quad C = \frac{2\frac{R_0}{R_1}}{1 + 2\frac{R_0}{R_1}}.$$

Note that $g_{m0}$ represents the balanced transconductance of the differential pair. Only the $2N_{\rm NS} + 1$ non-saturated differential pairs determine the gain of the amplifier. If only the dominant mismatch source, the threshold voltage, is considered, the averaging efficiency factor is given as [7]

$$k_{\text{a,h}} = \frac{\sigma_{\text{CON}} (V_{\text{OS}})_{\Delta \text{VTH}}}{\sigma_{\text{AVG}} (V_{\text{OS}})_{\Delta \text{VTH}}} = \frac{\displaystyle\sum_{t=-N_{\text{NS}}}^{N_{\text{NS}}} C^{|t|-1} \frac{1 - (t\gamma)^2}{\sqrt{1 - \frac{(t\gamma)^2}{2}}}}{\sqrt{\displaystyle\sum_{t=-N_{\text{NS}}}^{N_{\text{NS}}} C^{2|t|-2} \left(\frac{1 - (t\gamma)^2}{\sqrt{1 - \frac{(t\gamma)^2}{2}}}\right)^2}} \tag{9}$$

which is defined as the ratio of the standard deviation of the threshold voltage offset without averaging, $\sigma_{\text{CON}}(V_{\text{OS}})_{\Delta\text{VTH}}$, to the value with averaging, $\sigma_{\text{AVG}}(V_{\text{OS}})_{\Delta\text{VTH}}$. Going through (7)–(9), the maximal number of dummies $2N_{\text{NS}}$, the averaged gain $A_{\text{a,h}}$ and the averaging efficiency factor $k_{\text{a,h}}$ are obtained. These calculation procedures can also be applied to cascaded preamplifiers.
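The procedure in (7)–(9) is easy to mechanize for early estimates. The following is a minimal sketch under the square-law model, not the authors' code: it reads (9) as the sum of the per-amplifier weights divided by the square root of the sum of their squares (an assumption where the typeset equation is ambiguous), and the function name `averaging_terms` and its argument names are hypothetical.

```python
import math


def averaging_terms(v_r, v_ov, r1_over_r0):
    """Return (gamma, n_ns, k) following Eqs. (7)-(9), square-law model.

    v_r        : unit reference voltage step (V)
    v_ov       : overdrive voltage of the input pair (V)
    r1_over_r0 : averaging resistance R1 divided by loading resistance R0
    """
    gamma = v_r / (math.sqrt(2.0) * v_ov)              # Eq. (7)
    # "Integer immediately below 1/gamma", i.e. strictly less than it.
    n_ns = max(math.ceil(1.0 / gamma) - 1, 0)
    r0_over_r1 = 1.0 / r1_over_r0
    c = 2.0 * r0_over_r1 / (1.0 + 2.0 * r0_over_r1)    # C in Eq. (8)

    weight_sum = 0.0      # numerator of Eq. (9)
    weight_sq_sum = 0.0   # quantity under the root in Eq. (9)
    for t in range(-n_ns, n_ns + 1):
        tg2 = (t * gamma) ** 2
        w = c ** (abs(t) - 1) * (1.0 - tg2) / math.sqrt(1.0 - tg2 / 2.0)
        weight_sum += w
        weight_sq_sum += w * w
    return gamma, n_ns, weight_sum / math.sqrt(weight_sq_sum)


# 5-bit case with the Table 1 values: V_R = 0.6 / 2**5 V, V_OV = 0.15 V.
gamma, n_ns, k = averaging_terms(0.6 / 2 ** 5, 0.15, r1_over_r0=1.0)
print(n_ns)  # 11 non-saturated pairs on each side
```

Because all weights are positive, the resulting factor always lies between 1 (no neighbors contribute) and $\sqrt{2N_{\rm NS}+1}$ (all neighbors contribute equally), which is a convenient sanity check on any implementation.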
Generally, the calculation procedures of the averaging related variables are the same in all stages. However, the value of the reference voltage step $V_{\rm R}$ changes when the interpolation technique is utilized. In a $2\times$ interpolation case, as depicted in Fig. 3(c), the reference voltage step of the 2nd-stage preamplifier is half that of the 1st-stage one.

To obtain closed-form equations, the preamplifier in [7] is modeled with the square law, which is not accurate for short-channel transistors. Considering the velocity saturation effect caused by the high electric field in advanced processes, the drain current of an NMOS transistor in the saturation region is expressed as [11]

$$I_{\rm D} = \frac{1}{2} \mu_{\rm n} C_{\rm ox} \left(\frac{W}{L}\right) \frac{V_{\rm OV}^2}{1 + \left(\frac{\mu_{\rm n}}{2v_{\rm Sat}L} + \theta\right) V_{\rm OV}} \tag{10}$$

where the saturation velocity $v_{\rm Sat}$ is approximately equal to $10^5$ m/s at 300 K, and the fitting parameter $\theta$ is roughly $10^{-9}/t_{\rm ox}$. The typical oxide thickness $t_{\rm ox}$ is 2.81 nm in the utilized 0.13-μm CMOS process. From (10), the peak value of the balanced transconductance and the input voltage range of a differential pair at which the transconductance falls to zero are calculated as

$$g'_{m0} = \frac{I_{SS}}{2 V_{\rm OV}} \left( 1 + \frac{1}{1 + x} \right) \quad \text{and} \quad V'_{\rm Sat} = \frac{x + \sqrt{1 + (1 + x)^2}}{1 + x} V_{\rm OV}, \quad \text{where } x = \left(\frac{\mu_{\rm n}}{2 v_{\rm Sat} L} + \theta\right) V_{\rm OV}. \tag{11}$$

In the utilized process, $g'_{m0}$ is roughly equal to $0.85\,g_{m0}$, and $V'_{\rm Sat}$ is around $1.1\,V_{\rm Sat}$. Substituting $V'_{\rm Sat}$ for $V_{\rm Sat}$ in (7) gives a new value of $\gamma$. Likewise, $g'_{m0}$ and $V'_{\rm Sat}$ are substituted into (8) and (9) to obtain a more accurate averaged gain and averaging efficiency factor.
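The two correction ratios follow directly from (11). The sketch below is an illustration only: the helper names are hypothetical, the electron mobility is left as an input because the text does not state it, and the value `x = 0.43` in the usage lines is chosen here simply because it lands near the quoted ratios.

```python
import math


def velocity_saturation_ratios(x):
    """Correction ratios from Eq. (11) for a given dimensionless x.

    x = (mu_n / (2 * v_sat * L) + theta) * V_OV.
    Returns (g'_m0 / g_m0, V'_Sat / V_Sat), using the square-law
    references g_m0 = I_SS / V_OV and V_Sat = sqrt(2) * V_OV.
    """
    gm_ratio = 0.5 * (1.0 + 1.0 / (1.0 + x))
    vsat_ratio = (x + math.sqrt(1.0 + (1.0 + x) ** 2)) / (
        (1.0 + x) * math.sqrt(2.0))
    return gm_ratio, vsat_ratio


def x_from_process(mu_n, length, v_ov, t_ox, v_sat=1e5):
    """Build x from device values in SI units.

    theta is taken as 1e-9 / t_ox, as stated in the text; mu_n is an
    input the caller must supply (it is not given in the paper).
    """
    theta = 1e-9 / t_ox
    return (mu_n / (2.0 * v_sat * length) + theta) * v_ov


# Illustrative x only; it gives roughly 0.85 and 1.08 for the two ratios.
gm_ratio, vsat_ratio = velocity_saturation_ratios(0.43)
print(round(gm_ratio, 2), round(vsat_ratio, 2))
```

Setting `x = 0` recovers the square-law values (both ratios equal 1), which confirms that (11) reduces to the model behind (7)–(9) when velocity saturation is negligible.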
Although (7)–(9) are derived from the square-law model, substituting $g'_{m0}$ and $V'_{\rm Sat}$ enhances the accuracy of the averaging related variables.

### 3.2 Loading Factor

To calculate the loading factor of an $i$th-stage preamplifier, the total drain capacitance of the preamplifier, $C_{\text{dtot},i}$, and the total gate capacitance of the subsequent stage, $C_{\text{gtot},i+1}$, must be known. Take a $2\times$ interpolation case as an example. The loading factor of an $i$th-stage preamplifier is defined as

$$q_{i,h} = \frac{C_{\text{dtot},i} + 2C_{\text{gtot},i+1}}{C_{\text{dtot},i} + C_{\text{gtot},i+1}}. \tag{12}$$

For an accurate value of the loading factor, $C_{\text{dtot},i}$ and $C_{\text{gtot},i+1}$ can be extracted directly from circuit simulation. For hand calculation, there is another approach. When the bias conditions of the preamplifiers in all stages are the same and the size scaling follows the rules described in Sect. 2, (12) can be rewritten as

$$q_{i,h} = \frac{C_{\text{dtot},i} + \frac{2C_{\text{gtot},i}}{A_i^2}}{C_{\text{dtot},i} + \frac{C_{\text{gtot},i}}{A_i^2}} \tag{13}$$

where $A_i$ is the gain of the $i$th-stage preamplifier. Theoretically, the ratio of the gate capacitance to the drain capacitance is 2 when the transistor is in the strong inversion region [11]. As a result, the loading factor is around

$$q_{i,h} = \frac{1 + 4/A_i^2}{1 + 2/A_i^2}. \tag{14}$$

### 3.3 Maximum Scaling Factor

There is a physical limit on the transistor size, and therefore the power consumption of an amplifier cannot be scaled down indefinitely. The physical limit here does not mean the minimum available size. Although $A_{\rm VTH}$ is usually treated as a constant for simplicity, its value depends on the transistor size in practice. When the transistor becomes extremely small, $A_{\rm VTH}$ becomes larger than the typical value. Therefore, a lower bound is imposed on the power consumption of each preamplifier. Take the averaging case as an example.
With this modification, (4) is rewritten as

$$P_{\text{AVG}} = (n + m_{\text{a1}}) p_0 \times \max \left[ \frac{1}{k_{\text{a1}}^2}, \frac{1}{s_1} \right] + (n + m_{\text{a2}}) p_0 \times \max \left[ \frac{1}{k_{\text{a2}}^2 A_{\text{a1}}^2}, \frac{1}{s_2} \right] + n p_0 \times \max \left[ \frac{1}{A_{\text{a1}}^2 A_{\text{a2}}^2}, \frac{1}{s_3} \right] \tag{15}$$

where $s_{1-3}$ are the maximum scaling factors of the three stages. These scaling factors prevent an overoptimistic estimate of the total power consumption. Flash ADC designers should set the scaling factors so that they lead to the minimum realistic power consumption of the preamplifiers. If the preamplifiers in all stages are of the same type, the values of the scaling factors are identical.

It is difficult to gain insight directly from these equations. To apply all the rules above to a practical case, the selection of the preamplifier configuration of the proposed ADC is given as an example. In the next section, the values of the variables are extracted and calculated, and then substituted into the equations. Finally, the most power-efficient configuration is selected according to the quantitative comparisons.

## 4. Comparisons of Power Consumption

In the proposed ADC, the cascaded preamplifiers and the first differential pair of the current-mode flip-flop are designed to provide an overall voltage gain of 10 V/V and a bandwidth of 3 GHz. In this 0.13-μm CMOS process, four stages of preamplifiers are necessary to meet the gain and bandwidth requirements. Therefore, the equations are extended to four-stage cases. In the averaging related cases, the first two preamplifier arrays are averaged. In the interpolation cases, the first three arrays are interpolated. Table 1 lists basic terms such as the unit reference voltage step, overdrive voltage and gain. The values of these terms are obtained either from specifications or from circuit simulations.
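As a rough illustration of how (3), (5) and (14) combine with the Table 1 gains, the sketch below evaluates the three-stage conventional and interpolation cases. It is a simplification made here, not the four-stage model used for the figures: the lower bound of (15) is omitted, only the 1st stage is assumed to be a 4-input preamplifier with doubled drain capacitance, and all function names are hypothetical.

```python
def loading_factor(gain, drain_scale=1.0, cg_over_cd=2.0):
    """Loading factor in the style of Eqs. (13)-(14) for 2x interpolation.

    drain_scale = 2 models the doubled drain capacitance of a 4-input
    preamplifier; cg_over_cd = 2 is the strong-inversion ratio quoted
    in the text. With drain_scale = 1 this reduces to Eq. (14).
    """
    load = cg_over_cd / gain ** 2
    return (drain_scale + 2.0 * load) / (drain_scale + load)


def p_conventional(n_bits, a1, a2, p0=1.0):
    """Eq. (3): three-stage array without averaging or interpolation."""
    n = 2 ** n_bits - 1
    return n * p0 * (1.0 + 1.0 / a1 ** 2 + 1.0 / (a1 ** 2 * a2 ** 2))


def p_interpolated(n_bits, a1, a2, q1, q2, p0=1.0):
    """Eq. (5): 2x interpolation in the first two stages."""
    n_prime = 2 ** n_bits + 1
    stage1 = q1 * (n_prime + 3) / 4.0 * p0
    stage2 = q2 * (n_prime + 1) / 2.0 * p0 / a1 ** 2
    stage3 = n_prime * p0 / (a1 ** 2 * a2 ** 2)
    return stage1 + stage2 + stage3


# 5-bit example with the Table 1 gains A1 = 1.4 and A2 = 2.0.
q1 = loading_factor(1.4, drain_scale=2.0)  # assumed 4-input 1st stage
q2 = loading_factor(2.0)
print(round(q1, 2), round(q2, 2))
print(p_conventional(5, 1.4, 2.0) > p_interpolated(5, 1.4, 2.0, q1, q2))  # True
```

Both loading factors come out close to the 1.33 listed in Table 1, and for 5 bits the interpolated array is estimated to consume less than the conventional one under these simplified assumptions.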
Substituting the required terms into (7)–(9) gives the values of the averaging related variables such as the number of dummies and the averaging efficiency factor. Although the number of dummies is not necessarily $2N_{\rm NS}$ in practical cases, the maximal value $2N_{\rm NS}$ is used in the following simulations for simplicity. When $2N_{\rm NS}$ dummies are added, the systematic offsets are completely eliminated.

**Table 1** List of variables and specified values.

| Parameter | Variable | Value (Unit) |
|---|---|---|
| Resolution | $N$ | 4 to 8 (bit) |
| Overdrive voltage | $V_{\rm OV}$ | 0.15 (V) |
| Unit reference voltage | $V_{\rm R}$ | $0.6/2^{N}$ (V) |
| Loading factor | $q$ | 1.33 |
| Max. scaling factor | $s$ | 5 |
| Gain of the 1st preamp | $A_1$ | 1.4 (V/V) |
| Gain of the 2nd preamp | $A_2$ | 2.0 (V/V) |
| Gain of the 3rd preamp | $A_3$ | 2.0 (V/V) |

Substituting the data in Table 1 into (8), Fig. 5 shows the normalized averaged gains versus the resistance ratios for different sizes of the reference step $V_{\rm R}$, where the numbers are the ratios of the reference steps to the full-scale range. When the resistance ratio grows, the averaged gains approach 1. The gains also approach 1 when the size of the reference step decreases. According to (9), Fig. 6 displays the averaging efficiency factors versus the resistance ratios for different sizes of the reference step. This figure indicates that the averaging efficiency grows with the decrease of the resistance ratio.

**Fig. 5** Averaged gains versus resistance ratios with different sizes of the reference step (the numbers indicate the ratios of the reference steps to the full scale range).

**Fig. 6** Averaging efficiency factors versus resistance ratios with different sizes of the reference step (the numbers indicate the ratios of the reference steps to the full scale range).

**Fig. 7** Comparison of conventional and averaging cases.
A smaller reference step gives a stronger averaging efficiency since it increases the interaction among non-saturated amplifiers.

According to (14), the gains listed in the table are used to calculate the loading factor of each stage. Note that the drain capacitance doubles in a 4-input preamplifier since there are two differential pairs. The loading factors of the three stages are all equal to 1.33. In the first design stage, (14) helps designers see the trend. When the actual design parameters of the preamplifiers are available, designers can use (12) for a more accurate estimate. The value of the maximum scaling factor is set depending on the employed process.

The comparisons of the relative power consumption of different configurations are depicted in Figs. 7 to 9, where the values in parentheses are the ratios of the averaging resistance to the loading resistance. The first figure compares the conventional connection style with averaging cases of different strengths. The second figure compares the interpolation case with the interpolation plus averaging cases. In the third figure, the most power-efficient case of each configuration is plotted to help make the final decision. In the three figures, the power consumption on the y axis is a relative quantity and is therefore unitless. In these cases, the input voltage range is fixed. As a result, the value of an LSB halves with each 1-bit increase in resolution. According to (2), the device area of a preamplifier must then become four times larger to halve the offset. As a result, the value of $p_0$ becomes four times larger with each 1-bit increase in resolution.

The comparison of the conventional case with the averaging ones is depicted in Fig. 7. This figure shows that the strong averaging case, averaging (0.1), is the most power efficient. The power consumption of the weak averaging case, averaging (10), is larger than that of the conventional one because of its small averaging efficiency factor and the dummy preamplifiers.
In practice, not so many dummies are required in the weak averaging case. If the number of dummies is adjusted according to the tolerable systematic offset, the weak averaging case performs slightly better than the conventional one. Overall, this figure shows the trend that stronger averaging leads to lower power consumption. However, if a very small averaging resistance ratio is used, the gain of an averaged amplifier becomes very small. According to (4), a small gain hinders the power scaling of the subsequent preamplifiers.

Figure 8 shows the comparison of the interpolation and interpolation plus averaging cases. Compared to the conventional case, the reference voltage step of a front-stage preamplifier equivalently becomes larger in the interpolation cases. In other words, the value of $\gamma$ becomes larger. According to (8), the gain of the front-stage preamplifier becomes smaller. When the resolution is 4 bits, the gain of the 1st-stage preamplifiers becomes very small, around 0.1 V/V, in the strong averaging case, averaging (0.1). As a result, the subsequent preamplifiers cannot be scaled down and instead become larger. Therefore, the power consumption becomes extremely large in the 4-bit case, as shown in Fig. 8. Because the interpolation technique equivalently makes the averaging stronger, the benefit obtained from a larger averaging efficiency factor is outweighed by the smaller averaged gain. The figure shows that combining averaging and interpolation is very power inefficient when the resolution is less than 6 bits.

**Fig. 8** Comparison of interpolation and interpolation + averaging cases.

The comparison of the four configurations is depicted in Fig. 9, where the most power-efficient case of each configuration is selected. As the resolution increases, the combination of interpolation and resistive averaging becomes more and more power efficient.
However, for a 5-bit ADC, the comparison shows that interpolation alone is sufficient. Note that these simulation results are only given as a design example. According to the number of required gain stages and the resolution, (3)–(6) can be modified to estimate the power consumption of all the configurations. Flash ADC designers can substitute their own design parameters, such as gain and loading factor, into these equations. The most power-efficient configuration can then be selected based on the simulation results.

**Fig. 9** Comparison of power consumption of the four configurations.

## 5. ADC Architecture and Building Block

### 5.1 The ADC Architecture

Based on the analysis in Sect. 4, the interpolation technique alone makes the design of a high-speed 5-bit ADC most power efficient. Figure 10 shows the simplified block diagram of the proposed ADC. A differential track-and-hold amplifier (THA) and two resistor ladders provide the input signals and references for the 1st-stage preamplifiers. The three stages of preamplifiers are interpolated. Current-mode flip-flops are placed behind the 3rd-stage preamplifiers to periodically sample the data. A current-mode logic-based encoder directly translates thermometer codes into Gray codes. For measurement, the output data are sampled by flip-flops clocked at 1/32 of the sampling frequency. A three-level clock tree consisting of CMOS inverter buffers distributes the clock in this ADC. In the clock tree, all the buffers are of the same size and each buffer drives three succeeding ones.

### 5.2 The Input Matching Network and THA

A resistive input termination in front of a high-speed ADC keeps the signal amplitude stable over the desired frequency range. When the input frequency increases, the input impedance decreases because of the sampling and parasitic capacitances. At high input frequencies, the degradation of the signal amplitude at the input ports limits the input bandwidth.
In this work, adding a network consisting of two resistors and an inductor [13], as depicted in Fig. 11, extends the input bandwidth. The inductor compensates for the decrease of the input impedance and therefore extends the bandwidth. Figure 11 depicts the simulated frequency response of the input matching network plus THA, where the resistances are 75 and 150 Ω and the inductances of ideal inductors range from 0.5 to 3.5 nH with a step of 0.5 nH. The simulation shows that the frequency response at 2-nH inductance is flat. Therefore, two 2-nH spiral inductors are employed to keep the input impedance stable from DC to the Nyquist frequency.

**Fig. 10** Simplified block diagram of proposed ADC.

**Fig. 11** Schematic and frequency response of input matching network + THA.

### 5.3 The Preamplifiers and Flip-Flop

Figure 12 shows the schematic of the employed preamplifiers, together with the input transistor sizes, loading resistances and bias currents. Since four stages of preamplification are necessary to meet the required gain in this process, three stages of preamplifiers are used, and the first differential pair of the current-mode flip-flop, shown in Fig. 13(a), serves as the 4th-stage preamplifier. To suppress transistor random offsets, the input gate sizes of the preamplifiers must be large enough to guarantee yield. To save power, the input gate sizes of the preamplifiers are scaled down stage by stage. The scaling of the input transistor sizes generally follows the rules described in Sect. 2. However, the loading resistance of the 1st-stage preamplifier does not follow the rule. Because two gate voltages of a 1st-stage preamplifier are fixed, two of its input transistors do not provide any transconductance. As a result, the gain of a 1st-stage preamplifier is quite small. A larger resistance, 0.27 kΩ, is used instead of the theoretical value, 0.22 kΩ, to achieve a larger gain.
Figure 13(a) shows the schematic of the current-mode flip-flop, which is composed of two identical current-mode latches. In this design, current-mode flip-flops are used as comparators to perform high-speed sampling. After pre-amplification by the preamplifiers, the current-mode flip-flops amplify the signals to robust digital levels. Because the transistor widths of the 3rd-stage preamplifiers and the current-mode flip-flops are already quite small, they are kept at $5\,\mu\mathrm{m}$ without further scaling.

**Fig. 12** Schematic of the employed preamplifiers.

**Fig. 13** (a) Schematic of the current-mode flip-flop. (b) Schematic of the current-mode logic gate.

#### 5.4 The Current-Mode Logic Gate and Encoder

Because a static ROM-based encoder is not fast enough for $4\,\mathrm{GS/s}$ operation in this $0.13\text{-}\mu\mathrm{m}$ process, a current-mode logic-based thermometer-to-Gray encoder is used to keep up with the high-speed front end. The encoder consists of only one kind of differential current-mode logic gate. Figure 13(b) shows the schematic of the logic gate. To make the logic gate function properly under all conditions, a transistor with its gate biased at the common-mode voltage $V_{\mathrm{CM}}$ is added so that the outputs of the logic gate remain balanced when A and B are both equal to $V_{\mathrm{CM}}$. To save hardware, the encoder directly translates the thermometer codes into Gray codes. Current-mode flip-flops are inserted into the critical paths to pipeline the encoder and make it fast enough for 4-GS/s operation.

#### 6. Experimental Results

This work is fabricated in a $0.13\text{-}\mu\mathrm{m}$ CMOS 1P8M process. An ultra-thick $3.3\text{-}\mu\mathrm{m}$ top metal is available for inductor fabrication. Figure 14 shows the chip microphotograph and floorplan. Excluding the input termination network, this ADC occupies an active area of $0.16\,\mathrm{mm}^2$. The bare die is directly mounted on a PCB and connected to the board by bonding wires to minimize parasitic inductances. The average power consumption excluding the output buffers is 180 mW.
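
The direct thermometer-to-Gray translation described in Sect. 5.4 can be illustrated behaviorally. In a Gray code, bit $j$ toggles only at input levels that are odd multiples of $2^j$, so each Gray bit equals the XOR of the thermometer bits at those positions. The sketch below is a generic software model for checking this mapping. It is not the authors' gate-level netlist, and the function names are hypothetical.

```python
def thermometer(n, bits=5):
    """Thermometer code for level n: element i-1 is 1 when n >= i, i = 1 .. 2**bits - 1."""
    return [1 if n >= i else 0 for i in range(1, 2 ** bits)]


def thermometer_to_gray(t, bits=5):
    """Translate a thermometer code directly into a Gray-coded integer."""
    gray = 0
    for j in range(bits):
        step = 1 << j
        bit = 0
        # Visit the odd multiples of 2**j: step, 3*step, 5*step, ...
        for i in range(step, 1 << bits, 2 * step):
            bit ^= t[i - 1]
        gray |= bit << j
    return gray


# Compare against the textbook binary-to-Gray formula n ^ (n >> 1).
for level in range(32):
    g = thermometer_to_gray(thermometer(level))
    assert g == level ^ (level >> 1)
    print(f"{level:2d} -> {g:05b}")
```

In the 5-bit case, Gray bits 0 through 4 use 16, 8, 4, 2 and 1 thermometer bits, respectively, so each of the 31 thermometer bits drives exactly one Gray bit.
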
Figure 15 exhibits the distribution of the power consumption. With proper scaling, each preamplifier stage consumes roughly the same power, around 10% of the total. Because the encoder is composed of current-mode logic gates and flip-flops, it consumes 23% of the total power.

At 2.4 and 3.2 GS/s, the differential clock signals and the single-ended synchronization signal for the logic analyzer are provided by an Agilent 81250 pattern generator, which generates clock signals up to 3.4 GHz. For the measurement at 4.2 GS/s, a custom balun transforms the single-ended output of an Agilent E4438C RF signal generator into the differential clocks of the ADC. The output of the E4438C is also fed into a frequency divider to generate the low-frequency signal for the logic analyzer. The clock generation for the 4.2-GS/s measurement is complicated because many external components and wires are involved. The clock jitter at 4.2 GS/s is therefore larger than that at 2.4 and 3.2 GS/s.

**Fig. 14** Chip microphotograph and floorplan.

**Fig. 15** The pie chart of power consumption of the proposed ADC.

**Fig. 16** Measured DNL and INL.

Figure 16 shows the measured DNL and INL; the peak values are 0.60 and 0.65 LSB, respectively. Figure 17 shows the measured 32k-point FFT result when the input frequency is 1315 MHz and the sampling rate is 3.2 GS/s. The resultant SFDR and SNDR are 35.80 and 27.12 dB, respectively. Figure 17 also shows the power spectrum when the input frequency is 130 MHz and the sampling rate is 4.2 GS/s. The resultant SFDR and SNDR are 34.70 and 27.07 dB, respectively. Figure 18 depicts the measured SNDR versus input frequency at sampling rates of 2.4, 3.2 and 4.2 GS/s. There is no obvious difference between the curves at 2.4 and 3.2 GS/s. At 4.2 GS/s, the SNDR drops by 1 to 3 dB. The ADC achieves an ENOB of 4.44 bit and an ERBW of 1.65 GHz at 3.2 GS/s. At 4.2 GS/s,
the ENOB is 4.20 bit and the ERBW is 1.75 GHz.

**Fig. 17** FFT spectrum at $f_{\text{in}} = 1315 \text{ MHz}$ and $f_{\text{s}} = 3.2 \text{ GHz}$ (upper), and FFT spectrum at $f_{\text{in}} = 130 \text{ MHz}$ and $f_{\text{s}} = 4.2 \text{ GHz}$ (lower).

To evaluate the overall performance of the ADC, we use an FOM defined as

$$\mathrm{FOM} = \frac{\mathrm{Power}}{2 \times 2^{\mathrm{ENOB}} \times \min(\mathrm{ERBW}, f_s/2)} \tag{16}$$

where $f_s$ is the sampling frequency, ERBW is the effective resolution bandwidth and ENOB is the effective number of bits at low input frequencies. The FOMs of this ADC are 2.59 and 2.80 pJ/conversion-step at 3.2 and 4.2 GS/s, respectively. Table 2 shows the specification summary.

**Fig. 18** SNDR versus input frequency at 2.4, 3.2 and 4.2 GS/s.

**Table 2** Specification summary.

| Specification (Unit) | Experimental Result | |
|---|---|---|
| Supply Voltage (V) | 1.2 | |
| Input CM Voltage (V) | 0.2 | |
| Input Range (V<sub>pp</sub>) | 0.6 | |
| DNL (LSB) | 0.60 | |
| INL (LSB) | 0.65 | |
| Power (mW) | 180 | |
| Sampling Rate (GS/s) | 3.2 | 4.2 |
| ENOB (bit) | 4.44 | 4.20 |
| ERBW (GHz) | 1.65 | 1.75 |
| FOM (pJ/conversion-step) | 2.59 | 2.80 |
| Resolution (bit) | 5 | |
| Active Area (mm<sup>2</sup>) | 0.16 (ADC) / 0.28 (Total) | |
| Technology | TSMC 0.13-μm CMOS 1P8M | |

**Fig. 19** Comparison to recent state-of-the-art high-speed flash ADCs (*I*: Interpolation, *A*: Averaging, *C*: Calibration).

Figure 19 compares recent GS/s non-interleaved CMOS flash ADCs. The authors surveyed recent low-resolution flash ADCs and selected those with FOMs smaller than 10 pJ/conversion-step. The proposed ADC operates at the highest sampling rate and achieves FOMs comparable to those of the other state-of-the-art works.

#### 7. Conclusion

A design methodology for reducing the power consumption of flash ADCs is proposed.
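
As a numerical cross-check, the FOM values reported in Sect. 6 can be reproduced from Eq. (16) using the power, ENOB and ERBW values listed in Table 2. The function name below is hypothetical; the inputs are the measured values reported in this paper.

```python
def fom_pj_per_step(power_w, enob_bits, erbw_hz, fs_hz):
    """Evaluate Eq. (16) and return the FOM in pJ/conversion-step."""
    bandwidth_hz = min(erbw_hz, fs_hz / 2.0)
    return power_w / (2.0 * 2.0 ** enob_bits * bandwidth_hz) * 1e12


POWER_W = 0.180  # 180 mW, excluding output buffers

fom_32 = fom_pj_per_step(POWER_W, 4.44, 1.65e9, 3.2e9)
fom_42 = fom_pj_per_step(POWER_W, 4.20, 1.75e9, 4.2e9)

print(f"FOM at 3.2 GS/s: {fom_32:.2f} pJ/conversion-step")
print(f"FOM at 4.2 GS/s: {fom_42:.2f} pJ/conversion-step")
```

At 3.2 GS/s the ERBW (1.65 GHz) exceeds the Nyquist frequency (1.6 GHz), so the $\min$ term selects $f_s/2$; at 4.2 GS/s it selects the ERBW. The results round to 2.59 and 2.80 pJ/conversion-step, matching the reported values.
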
With this methodology, flash ADC designers can evaluate the relative power consumption of different configurations and select the most power-efficient one when the resolution and operation speed are specified. Based on the analysis, an interpolated 5-bit high-speed ADC is designed and implemented. The proposed design methodology can also be applied to ADCs of higher resolutions. The proposed ADC operates at a sampling rate up to 4.2 GS/s and achieves FOMs comparable to those of the other state-of-the-art works.

#### Acknowledgments

The authors would like to acknowledge the fabrication support of the National Chip Implementation Center (CIC), Taiwan, and sincerely thank the CIC engineers for their help with the measurements.

#### References

- [1] C. Lane, "A 10-bit 60 MSPS flash ADC," Proc. BCTM, pp.44–47, Sept. 1989.
- [2] R.E.J. van de Grift, I.W.J.M. Rutten, and M. van der Veen, "An 8-bit video ADC incorporating folding and interpolation techniques," IEEE J. Solid-State Circuits, vol.22, no.6, pp.944–953, Dec. 1987.
- [3] H. Kimura, A. Matsuzawa, T. Nakamura, and S. Sawada, "A 10-b 300-MHz interpolated parallel A/D converter," IEEE J. Solid-State Circuits, vol.28, no.4, pp.438–446, Dec. 1992.
- [4] K. Kattmann and J. Barrow, "A technique for reducing differential non-linearity errors in flash A/D converters," IEEE ISSCC Dig. Tech. Papers, vol.1, pp.170–171, Feb. 1991.
- [5] M. Choi and A.A. Abidi, "A 6-b 1.3-Gsample/s A/D converter in 0.35-μm CMOS," IEEE J. Solid-State Circuits, vol.36, no.12, pp.1847–1858, Dec. 2001.
- [6] P.C.S. Scholtens and M. Vertregt, "A 6-b 1.6-Gsample/s flash ADC in 0.18-μm CMOS using averaging termination," IEEE J. Solid-State Circuits, vol.37, no.12, pp.1599–1609, Dec. 2002.
- [7] P. Figueiredo and J.C. Vital, "Averaging technique in flash analog-to-digital converters," IEEE Trans. Circuits Syst. I, vol.51, no.2, pp.233–253, Feb. 2004.
- [8] P.M. Figueiredo and J.C. Vital, "Termination of averaging networks in flash ADCs," IEEE Proc. Int. Symp. Circuits and Syst., vol.1, pp.121–124, May 2004.
- [9] G. Van der Plas, S. Decoutere, and S. Donnay, "A 0.16 pJ/conversion-step 2.5 mW 1.25 GS/s 4b ADC in a 90 nm digital CMOS process," IEEE ISSCC Dig. Tech. Papers, vol.1, pp.566–567, Feb. 2006.
- [10] S. Park, Y. Palaskas, and M.P. Flynn, "A 4 GS/s 4 bit flash ADC in 0.18 μm CMOS," IEEE ISSCC Dig. Tech. Papers, vol.1, pp.570–571, Feb. 2006.
- [11] B. Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill, New York, 2001.
- [12] M.J.M. Pelgrom, A.C.J. Duinmaijer, and A.P.G. Welbers, "Matching properties of MOS transistors," IEEE J. Solid-State Circuits, vol.24, no.5, pp.1433–1440, Oct. 1989.
- [13] K. Poulton, R. Neff, B. Setterberg, B. Wuppermann, T. Kopley, R. Jewett, J. Pernillo, C. Tan, and A. Montijo, "A 20 GS/s 8 b ADC with a 1 MB memory in 0.18 μm CMOS," IEEE ISSCC Dig. Tech. Papers, vol.1, pp.318–319, Feb. 2003.
- [14] G. Geelen, "A 6 b 1.1 GSample/s CMOS A/D converter," IEEE ISSCC Dig. Tech. Papers, vol.1, pp.128–129, Feb. 2001.
- [15] C. Sandner, M. Clara, A. Santner, T. Hartig, and F. Kuttner, "A 6-bit 1.2-GS/s low-power flash-ADC in 0.13-μm digital CMOS," IEEE J. Solid-State Circuits, vol.40, no.7, pp.1499–1505, July 2005.
- [16] S. Park, Y. Palaskas, A. Ravi, R.E. Bishop, and M.P. Flynn, "A 3.5 GS/s 5-b flash ADC in 90 nm CMOS," IEEE Custom Integrated Circuits Conf., vol.1, pp.489–492, Sept. 2006.
- [17] O. Viitala, S. Lindfors, and K. Halonen, "A 5-bit 1-GS/s flash-ADC in 0.13-μm CMOS using active interpolation," IEEE European Solid-State Circuits Conf., vol.1, pp.412–415, Sept. 2006.
- [18] K. Deguchi, N. Suwa, M. Ito, T. Kumamoto, and T. Miki, "A 6-bit 3.5-GS/s 0.9-V 98-mW flash ADC in 90 nm CMOS," IEEE Symp. on VLSI Circuits, vol.1, pp.64–65, June 2007.
- [20] Y.-Z. Lin, Y.-T. Liu, and S.-J. Chang, "A 5-bit 4.2-GS/s flash ADC in 0.13-μm CMOS," IEEE Custom Integrated Circuits Conf., vol.1, pp.213–216, Sept. 2007.

Ying-Zu Lin received B.S. and M.S.
degrees in Electrical Engineering from National Cheng-Kung University, Taiwan, in 2003 and 2005, respectively. He is now working toward his Ph.D. degree at the same university. His research interests include analog/mixed-signal circuits and comparator-based high-speed analog-to-digital converters. In 2005, he won the Excellent Award in the master's thesis contest held by the Mixed-Signal and RF (MSR) Consortium, Taiwan. In 2007, he received the Best Paper Award of the VLSI Design/CAD Symposium, Taiwan, and the TSMC Outstanding Student Research Award.

Soon-Jyh Chang received the B.S. degree in electrical engineering from National Central University, Taiwan, in 1991. He obtained his M.S. and Ph.D. degrees in Electronic Engineering from National Chiao-Tung University, Taiwan, in 1996 and 2002, respectively. Presently, he is an assistant professor of Electrical Engineering at National Cheng-Kung University in Taiwan. His research interests include design, testing and design automation for analog and mixed-signal circuits.

Yen-Ting Liu received the B.S. and M.S. degrees in electrical engineering from National Cheng Kung University, Tainan, Taiwan, in 2004 and 2006, respectively. His area of research is mixed-signal circuit design with emphasis on data converters in scaled CMOS technologies.