# Diagnosis of Signaling and Power Noise Using In-Place Waveform Capturing for 3D Chip Stacking

Satoshi TAKAYA<sup>†a)</sup>, Student Member, Hiroaki IKEDA<sup>†b)</sup>, Nonmember, and Makoto NAGATA<sup>†c)</sup>, Senior Member

SUMMARY A three-dimensional (3D) chip stack featuring a 4096-bit wide I/O demonstrator incorporates an in-place waveform capturer on an intermediate interposer within the stack. The capturer includes probing channels on paths of signaling as well as in power delivery and collects analog waveforms for diagnosing circuits within 3D integration. The collection of in-place waveforms on vertical channels with through silicon vias (TSVs) is demonstrated among 128 vertical I/O channels distributed in 8 banks in a 9.9 mm $\times$ 9.9 mm die area. The analog waveforms confirm a full 1.2-V swing of signaling at the maximum data transmission bandwidth of 100 GByte/sec with sufficiently small deviations of signal skews and slews among the vertical channels. It is also experimentally confirmed that the signal swing can be reduced to 0.75 V for error-free data transfer at 100 GByte/sec, achieving the energy efficiency of 0.21 pJ/bit.

key words: wide I/O bus, through silicon via, signal integrity, power integrity

#### 1. Introduction

Low power electronics will continue to evolve with three-dimensional (3D) integration technology. Through silicon vias (TSVs), as a key enabler of vertical digital data transmission between stacked dice, reduce the length of signal routing as well as the associated parasitic capacitance. This will lead to significant improvements in the data bandwidth as well as the power efficiency of highly active data communication between memory and logic chips, over conventional planar bus structures on an FR-4 board or an interposer [1]. The 3D stacking of memory chips advances memory systems in mobile applications [2] as well as for low-cost high performance computation [3]. Heterogeneous 3D integration of analog/RF and digital/$\mu$P chips will reduce the footprint, attenuate background noise couplings [4], and extend processing capabilities. Vertical signaling essentially characterizes the system-level performance of 3D integrated circuits and systems.

Since it is not possible within a 3D chip stack to probe circuit nodes with physical needles or to magnify structures on silicon surfaces by optical or electron-beam microscopy, only on-chip electronic diagnosis measures can provide solutions to see internal signals on wires or to verify the operation of circuits. A variety of test procedures have been proposed or standardized for a 3D chip stack, mostly based on digital signaling through wrappers or JTAG protocols [5]. These structures are partly included in a built-in self-test (BIST) mechanism. On the other hand, there is a need for observing analog waveforms on paths of signaling or in power delivery within a 3D stack, for characterizing propagation in vertical channels, unexpected weak opens in vertical connections, multi-tier background noise couplings, and so forth. The primary focus of this paper is therefore on the study of in-place waveform capturing and its applications to on-chip diagnosis of vertical signaling through TSVs in a 3D chip stack. Analog quantities are extracted from time domain waveforms, providing internal diagnostic observations that complement digital go/no-go test results.

The remainder of this paper is organized as follows. The wide I/O test vehicle is outlined in Sect. 2. In-place waveform capturers in 2D and 3D test samples are evaluated in Sect. 3. Diagnosis of vertical channels in the wide I/O test vehicle is demonstrated in Sect. 4. A brief summary is then given in Sect. 5.

## 2. Wide I/O Test Vehicle

The wide I/O test vehicle of Fig. 1 incorporates a three-dimensional stack of a memory chip (MEM), an active silicon interposer (ASI), and a logic interface chip (LOGIC), mounted on an organic substrate of a 527 pin ball grid array (BGA) package. The entire 3D chip stack including the organic substrate is plastic molded and assembled on a system printed circuit board (PCB) for performance evaluation. As one of the first demonstrators of wide bandwidth vertical signaling, a wide input/output bus was embodied in the structure with 4096-bit TSV data channels and accomplished the power efficiency of 0.56 mW/Gb/s with 1.2 V power supply, at the data transfer of 100 GByte/s [6].

Fig. 1 Wide I/O test vehicle.

Manuscript received October 21, 2013.

Manuscript revised January 31, 2014.

<sup>†</sup>The authors are with the Graduate School of System Informatics, Kobe University, Kobe-shi, 657-8501 Japan.

a) E-mail: takaya@cs26.scitec.kobe-u.ac.jp

b) E-mail: h.ikeda@cs26.scitec.kobe-u.ac.jp

c) E-mail: nagata@cs.kobe-u.ac.jp

DOI: 10.1587/transele.E97.C.557

Copyright © 2014 The Institute of Electronics, Information and Communication Engineers

**Fig. 2** (a) Placement overview of wide I/O banks and in-place waveform capturer, in physical layout of ASI. (b) Magnified view of a single I/O bank.

A memory bank of 800 kByte SRAM on the top die is connected with logic interface circuits on the bottom die, through a 4096 bit wide I/O data bus with vertical channels based on TSVs. The bottom die includes TSVs for connections with 527 pin BGA formed on an organic substrate. BIST mechanisms are equipped in the top and bottom dice for at-speed generation and transfer of data bits as well as comparison of the received data bits with expected ones. The number of fail bits is continuously stored in a fail register during persistent repeat of operation. A fail bit map can also be created by the analysis of registered fail bits. The ASI incorporates an in-place waveform capturer for the diagnosis of signaling in vertical channels, in cooperation with the BIST.

The horizontal placement of the wide I/O vertical channels is shown in Fig. 2, represented by the physical layout of the ASI. The 4096 bits are divided into eight banks evenly distributed in the chip area. Each bank has 832 TSV pins divided into two TSV sub arrays ($64 \times 7$ and $64 \times 6$), incorporating vertical I/O channels of 512 bits with an additional 16 bits for 32:1 redundancy, and a pair of $V_{DD}$ and $V_{SS}$ pins for every 5 columns (Fig. 2(b)).
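The bank organization quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, and the bandwidth estimate assumes that each channel transfers one bit per cycle of the 200 MHz wide I/O clock listed in Fig. 3, which the text does not state explicitly.

```python
# Figures quoted in the text for the wide I/O bank organization.
NUM_BANKS = 8
DATA_BITS_PER_BANK = 512
REDUNDANT_BITS_PER_BANK = 16
SUB_ARRAYS = [(64, 7), (64, 6)]  # (rows, columns) of the two TSV sub arrays


def tsv_pins_per_bank(sub_arrays):
    """Total number of TSV pins in one bank, summed over its sub arrays."""
    return sum(rows * cols for rows, cols in sub_arrays)


def total_data_bits(num_banks, bits_per_bank):
    """Width of the whole bus in bits."""
    return num_banks * bits_per_bank


def redundancy_ratio(data_bits, redundant_bits):
    """Number of data bits covered by one redundant bit."""
    return data_bits // redundant_bits


def raw_bandwidth_gbyte_per_s(bus_bits, clock_hz, bits_per_cycle=1):
    """Raw bus bandwidth in GByte/s.

    ASSUMPTION: every channel moves `bits_per_cycle` bits per clock cycle.
    """
    return bus_bits * clock_hz * bits_per_cycle / 8 / 1e9


if __name__ == "__main__":
    print(tsv_pins_per_bank(SUB_ARRAYS))
    print(total_data_bits(NUM_BANKS, DATA_BITS_PER_BANK))
    print(redundancy_ratio(DATA_BITS_PER_BANK, REDUNDANT_BITS_PER_BANK))
    print(raw_bandwidth_gbyte_per_s(4096, 200e6))
```

Under these assumptions the sub arrays add up to the 832 pins per bank, the eight banks give the 4096-bit bus width, and the raw bandwidth comes out slightly above the 100 GByte/s figure used throughout the paper.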

The chips and interposer used a 90 nm CMOS technology. The wide I/O circuits, memory cores, and related digital circuits nominally use 1.2 V transistors. The area of each chip is $9.9 \text{ mm} \times 9.9 \text{ mm}$. Vertical connections with more than 7.3k TSVs as well as the same number of $\mu$Bumps are densely formed with via-last 50 $\mu$m pitch Cu TSV and chip-to-chip stacking processes [7]. The process specifications of the vehicle are summarized in Fig. 3.

|                                     | Logic Chip                                                            | Active Si Interposer | Memory Chip   |
|-------------------------------------|-----------------------------------------------------------------------|----------------------|---------------|
| Supply Voltage                      | Core & Internal I/O = 1.2 V, External I/O = 3.3 V, Analog = 1.2/3.3 V |                      |               |
| Clock Frequency                     | Wide I/O = Up to 200 MHz, Core = Up to 100 MHz                        |                      |               |
| SRAM                                | -                                                                     | -                    | 800 kByte     |
| Device design rule                  | 90 nm CMOS                                                            |                      |               |
| Die Size                            | 9.932 mm × 9.932 mm                                                   |                      |               |
| Metal Layers                        | M1-7 (Cu), M8 (Al)                                                    |                      |               |
| TSV Depth                           | 50 µm                                                                 | 50 µm                | -             |
| TSV Diameter                        | 20 µm                                                                 | 20 µm                | -             |
| TSV Pitch                           | 200 µm                                                                | 50 µm                | -             |
| TSV Process                         | Via Last - Cu                                                         | Via Last - Cu        | -             |
| # of Front-side µBump<br>(Material) | 7,328 (SnAg/Cu)                                                       | 7,326 (SnAg/Cu)      | 7,326 (Ni/Au) |
| # of Back-side µBump<br>(Material)  | 729 (Ni/Au)                                                           | 7,328 (Ni/Au)        | -             |
| Stacking Method                     | Chip to Chip                                                          |                      |               |
| Organic Substrate                   | 26 mm × 26 mm × 0.676 mm, 8 Metal Layer                               |                      |               |
| Package                             | 527pin BGA                                                            |                      |               |

Fig. 3 Wide I/O process specifications [6]. (©2013 IEEE)

Fig. 4 Wide I/O test vehicle.

The photo of the wide I/O test vehicle is given in Fig. 4. The 3D chip stack is mounted in a BGA socket and then connected with the system PCB for evaluation. The system consists of a field programmable gate array (FPGA) board, interface modules of USB and I2C bus protocols, a crystal oscillator for system clocking and drivers for distribution, and a variety of electronic components including decoupling capacitors. The FPGA is programmed to concurrently control the wide I/O operation and waveform capture functions. The entire system is governed by a personal computer (PC).

#### 3. In-Place Waveform Capturer

#### 3.1 Overview

An in-place waveform capturer of Fig. 5 incorporates arrays of probing front end (PFE) circuits and a common waveform acquisition kernel (WAK) [8]. The PFE senses a voltage at the point of probe wiring and digitizes a waveform with the series of sampling timings provided by the WAK. The PFEs are prepared for the power supply voltage ($V_{DDIO}$), ground voltage ($V_{SS}$), and voltages in the middle of power rails for the upper or lower half of signals ($V_{SIG\_H}$, $V_{SIG\_L}$, respectively). The four PFE channel types of $V_{DD}$, $V_{SS}$, $V_{SIG\_H}$, and $V_{SIG\_L}$ form a single monitor block. Each PFE channel selects a single probing point of interest from 16 probe wires through switches at its front. The capturer needs to be embedded within a 3D stack for the sake of diagnosing channels in a vertical data link that is normally unreachable from the outside of the stack.

Fig. 5 Block diagram of in-place waveform capturer [8]. (©2013 IEEE)

Fig. 6 Probing front end circuit schematic.

The PFE in Fig. 6 consists of a source follower (SF) and a latch comparator (LC) at the front and back end, respectively. The SF replicates the input voltage ($V_{\rm IN}$) at its output node with an offset DC voltage tuned for each voltage domain of interest. The LC then digitizes in place the output voltage of the SF ($V_{\rm SFO}$) through successive comparison with a reference voltage ($V_{\rm REF}$) given by an on-chip voltage generator (VG), at the sample timing ($T_{\rm SAMP}$) defined by an on-chip timing generator (TG). The comparisons are iteratively made by the LC at the same sample timing to search for the $V_{\rm REF}$ closest to $V_{\rm SFO}$. The sequential bit stream output from the LC is first converted to the probabilities of comparison ($P_{\rm CMP}$) by a data processing unit (DPU) and then used in the backend digital control procedures embodied in the FPGA for the nearest search of $V_{\rm REF}$ to $V_{\rm SFO}$. The $P_{\rm CMP}$ is computed as in Eq. (1). The output of the LC, $D_{\rm OUT}$, is accumulated by the DPU over the number of iterations ($N_{\rm CMP}$) in comparing $V_{\rm SFO}$ and $V_{\rm REF}$ at $T_{\rm SAMP}$ and then divided by $N_{\rm CMP}$ to obtain $P_{\rm CMP}$.

Fig. 7 Nearest search algorithm.

$$P_{\rm CMP} = \frac{\sum D_{\rm OUT}}{N_{\rm CMP}} = \begin{cases} 1 & (V_{\rm SFO} > V_{\rm REF}) \\ 0 & (V_{\rm SFO} < V_{\rm REF}) \end{cases} \tag{1}$$
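Eq. (1) can be illustrated with a small behavioral model. The Python sketch below accumulates $D_{\rm OUT}$ over $N_{\rm CMP}$ comparisons and divides by $N_{\rm CMP}$. The Gaussian input noise, the function names, and the seed parameter are assumptions added for illustration only; they are not taken from the circuit described here. With zero noise the model reproduces the two cases of Eq. (1), while with noise $P_{\rm CMP}$ takes intermediate values when $V_{\rm REF}$ is close to $V_{\rm SFO}$.

```python
import random


def latch_comparator(v_sfo, v_ref, noise_rms, rng):
    """One comparison: D_OUT = 1 if the (optionally noisy) input exceeds V_REF.

    ASSUMPTION: noise is modeled as Gaussian and added to V_SFO.
    """
    return 1 if v_sfo + rng.gauss(0.0, noise_rms) > v_ref else 0


def comparison_probability(v_sfo, v_ref, n_cmp, noise_rms=0.0, seed=0):
    """P_CMP = sum(D_OUT) / N_CMP for n_cmp comparisons at one sample timing."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    total = sum(latch_comparator(v_sfo, v_ref, noise_rms, rng) for _ in range(n_cmp))
    return total / n_cmp


if __name__ == "__main__":
    print(comparison_probability(0.60, 0.50, 64))  # V_SFO above V_REF
    print(comparison_probability(0.40, 0.50, 64))  # V_SFO below V_REF
    print(comparison_probability(0.50, 0.50, 1000, noise_rms=0.01))
```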

The nearest search algorithm is outlined in Fig. 7, based on an essential fact that the voltage difference between adjacent digitized points becomes small if the sampling interval is small, namely oversampling is applied. The initial value of  $V_{\text{REF}}$  for the search at  $T_{\text{SAMP}} = T_n$  is chosen as the last selected value of  $V_{\text{REF}}$  as the voltage nearest to  $V_{\text{SFO}}$  determined in the previous search at  $T_{\text{SAMP}} = T_{n-1}$ , as shown in the inset of Fig. 7. The calculation of  $P_{\text{CMP}}$  decides the next value of  $V_{\text{REF}}$  to increase or to decrease, and  $V_{\text{REF}}$  is iteratively updated in the same way. Finally, the nearest  $V_{\text{REF}}$  to  $V_{\text{SFO}}$  is determined. The next initial value of  $V_{\text{REF}}$  at  $T_{n+1}$  is then set to the last selected value of  $V_{\text{REF}}$  at  $T_n$  in the same way.

This algorithm reduces the number of voltage search steps needed for $V_{\text{REF}}$ to reach a voltage approximately equal to $V_{\text{SFO}}$, in comparison with a search that always starts from the bottom of the voltage range. The acceleration of waveform capturing by this search algorithm was originally discussed in [9], [10]. In the present paper, however, the size of the voltage step of $V_{\text{REF}}$ was kept fixed to simplify the measurement program.
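The benefit of starting each search from the previously selected $V_{\text{REF}}$ can be sketched in Python as follows. This is a simplified model under stated assumptions, not the measurement program used in this work: the comparator is noiseless (so $P_{\text{CMP}}$ is exactly 0 or 1), $V_{\text{REF}}$ moves in fixed steps expressed as integer codes, the search at each sample timing stops when the comparison result flips, and all names are hypothetical.

```python
import math


def compare(v_sfo, v_ref):
    """Idealized latch comparator: True when V_SFO exceeds V_REF."""
    return v_sfo > v_ref


def nearest_search(samples, v_step, reuse_previous=True):
    """Track each sample with V_REF = code * v_step using fixed-size steps.

    reuse_previous=True  : start at the code selected for the previous sample.
    reuse_previous=False : always restart from the bottom of the range (code 0).
    Returns (selected codes, number of comparisons per sample).
    """
    codes, steps = [], []
    code = 0
    for v_sfo in samples:
        if not reuse_previous:
            code = 0
        first = compare(v_sfo, code * v_step)  # decides the search direction
        direction = 1 if first else -1
        n = 1
        while code + direction >= 0:
            code += direction
            n += 1
            if compare(v_sfo, code * v_step) != first:
                break  # V_REF has just crossed V_SFO
        codes.append(code)
        steps.append(n)
    return codes, steps


def oversampled_sine(n_points, offset=0.6, amplitude=0.5):
    """Test waveform whose adjacent samples differ only slightly."""
    return [offset + amplitude * math.sin(2 * math.pi * k / n_points)
            for k in range(n_points)]


if __name__ == "__main__":
    wave = oversampled_sine(200)
    _, fast = nearest_search(wave, 0.005, reuse_previous=True)
    _, slow = nearest_search(wave, 0.005, reuse_previous=False)
    print(sum(fast), sum(slow))
```

In this model the selected $V_{\text{REF}}$ always ends within one step of the sample, and the total number of comparisons is much smaller when the previous result is reused, as long as the waveform is oversampled.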

The waveform capturer uses 3.3 V I/O devices for covering the whole voltage range of 1.2 V for wide I/O operation. The voltage and timing resolutions of the orders of  $300 \,\mu\text{V}$  and  $10 \,\text{ps}$ , respectively, are realized. The resolutions are manually tunable to the characteristics of waveforms to capture. The WAK includes TG, VG, DPU, and interface logic circuits. The details of PFE and WAK have been reported in [9]–[11] with respect to their circuit designs and digitizing algorithms.

#### 3.2 Integration

The ASI involves an in-place waveform capturing system for snooping waveforms passing through a vertical channel, as conceptually depicted in Fig. 8. A pair of transceivers (mini I/O circuits) shown in Fig. 8(b) has 4-level driving strengths ($I_{drv}$) and drives a full-swing binary logic signal in a time-division duplexed, bi-directional way. A vertical channel is formed by a single TSV, a single $\mu$bump, and metal wires in the back end of line (BEOL) as well as in the backside redistribution layer (RDL). The PFEs for the signal ($V_{SIG\_H}$ and $V_{SIG\_L}$) sense and digitize the waveforms in a vertical channel. The other PFEs for power and ground nodes ($V_{DD}$ and $V_{SS}$, respectively) capture waveforms in the three-dimensional power delivery network with power and ground TSVs.

**Fig. 8** (a) Snooping waveforms through vertical channels with (b) mini I/O transceiver circuits.

Fig. 9 Probe wiring of in-place waveform capture in wide I/O bank, in physical layout of ASI.

The waveform capturer has an oblong layout and is located in the center area of the interposer so as not to conflict with the placements of TSV banks, as shown in the physical layout of Fig. 2. The waveform capturer includes eight arrays of PFE circuits, corresponding to one PFE block per wide I/O bank.

Each PFE block faces 16 probe wires for the TSV channels of redundancy in a bank, as given in Fig. 9. The other 4 probe wires are for the $V_{DDIO}$ and $V_{SS}$ connections within the bank. The probe wires are then selected by the corresponding PFE. The total number of probing channels is 160 in the test vehicle.
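The stated total follows from the per-bank counts, taking 16 signal probe wires and 4 power and ground probe wires in each of the eight banks (the symbol $N_{\rm probe}$ is introduced here only for illustration):

$$N_{\rm probe} = 8 \times (16 + 4) = 160$$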

The power delivery network of the capturer is isolated from the wide I/O circuits. The control and data signals for the capturer also pass the vertical channels and then are connected to C4 bumps of BGA in the back side of the logic chip. These signals communicate with an external FPGA chip as a monitor controller, independently from the operation of wide I/O data channels. The power pins for the capturer are provided in both the top and the bottom areas of the die.

**Fig. 10** Measured static (DC) response of PFEs for voltage domains of power ($V_{DD}$), ground ($V_{SS}$), and signal ($V_{SIG\_H}$ and $V_{SIG\_L}$). The captures in 2D and 3D realization are compared.

#### 3.3 Performance

Input-output DC transfers of the PFE channels in a PFE block are measured as given in Fig. 10. The voltage range of  $\pm$  200 mV at the centers of 1.2 V( $V_{DD}$ ) and 0.0 V( $V_{SS}$ ) allows power noise measurements. The rail-to-rail voltage range from 0.0 V to 1.2 V is covered by the SIG\_L and SIG\_H channels for signal quality measurements.

The transfer characteristics are measured for the waveform capturer in the ASI chip as a stand-alone test sample (before 3D integration), and also for the capturer within the wide I/O test vehicle after 3D stacking. The measured DC responses are shown to be almost identical among these samples.

Dynamic transfers for AC signals are measured as in Fig. 11. The highest signal-to-noise-and-distortion ratio (SNDR) of 40 dB provides approximately 7-bit linearity for capturing waveforms with $\pm 100 \text{ mV}$ amplitudes. The demonstrated SNDR is not comparable to the numbers reported in [12], although the on-chip waveform capturers used a very similar architecture. This is partly because the capturer was not fully optimized for the given technology, which differs from that of [12], while the authors focused on the realization of in-place waveform capturing for the three-dimensional wide I/O demonstrator. The figure also shows the frequency components in the digitization of sinusoidal waveforms at 10 MHz. The highest spur at the 2nd harmonic of the sinusoid limits the largest SNDR similarly in both 2D and 3D samples. The AC performance is again not impacted by the 3D chip stacking. The performance of in-place waveform capturing is as expected even in the 3D test vehicle. This results naturally from the immediate digitization of signals within the PFE with the help of the WAK in the ASI. It also eliminates the undesired coupling or parasitic impedance in series among the signals of the waveform capturer through vertical and horizontal wiring paths between the ASI in the stack and peripheral electronic components on an evaluation board.

Fig. 11 Measured dynamic (AC) response. The captures in 2D and 3D realization are compared.
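As a rough cross-check of the quoted linearity, the standard relation between SNDR and the effective number of bits (ENOB) of an ideal quantizer with a full-scale sinusoidal input can be applied. Its applicability to this capturer is an assumption made here for illustration:

$$\mathrm{ENOB} = \frac{\mathrm{SNDR} - 1.76\ \mathrm{dB}}{6.02\ \mathrm{dB/bit}} = \frac{40 - 1.76}{6.02} \approx 6.35\ \mathrm{bit}$$

This places the 40 dB figure between 6 and 7 bits, in line with the approximate 7-bit linearity stated above.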

## 4. Diagnosis of Vertical Channel

#### 4.1 Signal and Power Integrity

The waveforms are captured in place in a 3D stack sample and provide in-depth observations of power and signal qualities in vertical conductive connections with TSVs. The power waveforms are given in Fig. 12 for $V_{\text{DDIO}}$ and $V_{\text{SS}}$ wiring within one bank of TSV channels. The amount of voltage variation depends on the driving strength of the mini I/O. However, it remains within 10% of the nominal supply voltage. The stability of the power supply is therefore confirmed in the wide I/O test vehicle.



Fig. 12 Noise waveforms of (a) power and (b) ground.

The signal waveforms captured at the 16 probe points among the eight I/O banks are given in Fig. 13, under the $V_{DDIO}$ at 1.2 V and the driving strength of 0.5 mA. The waveforms exhibit the operation with sufficient margins, having a full signal swing of 1.2 V. The eye diagram obtained in place by the capturer for a particular single bit in the vertical bus shows a wide eye opening in the voltage domain as well as in the timing window of the bit time of 2.5 ns (at 200 MHz). The BIST also confirms an error-free data transfer operation at 100 GByte/sec, even with the data pattern of "5-A-5-A-5-A-5-A" that has the highest bit activities and the densest bit flips among neighboring bits everywhere within the 4096 bits.

Signal skews are characterized from the waveforms as given in Fig. 13(a). The skew is defined as the time difference between the slowest and fastest captured signals in the middle of the signal swing in a rise transition, and is shown in this bank to be as small as 308 ps. The skew in each bank was derived from the signal timings among 16 channels, eliminating the static timing offset in the time base from the input terminal on the PCB to the input node of the mini I/O transceivers. The skews are compared among the banks, as given in Fig. 14(a). The skews among the 128 monitored TSV channels (16 per bank) are evenly distributed within 550 ps in each bank and consistent among the eight banks.

**Fig. 13** Wide I/O operation at $V_{\text{DDIO}} = 1.2$ V and $I_{\text{drv}} = 0.5$ mA. (a) Captured signal waveforms among 16 vertical channels and (b) eye diagram. Power supply and ground voltage waveforms are also shown [8]. (©2013 IEEE)

The signal slews are then derived from the waveforms, as also given in Fig. 13(a). The slew is defined as the time to swing from 10% to 90% of the signal amplitude of 1.2 V in a rise transition. The distribution of measured slew among the channels over banks is given in Fig. 14(b). The slews are concentrated around 2.5 ns/1.2 V without significant deviations. The skew and slew with such small deviations among the selected 128 channels reasonably represent all the vertical channels distributed among eight banks. The timing variations are sufficiently small in comparison with the half clock period of 2.5 ns for the 100 GByte/sec with 4096 bit width.
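The skew and slew definitions above translate directly into a short post-processing routine. The Python sketch below is a minimal illustration that assumes each captured waveform is available as lists of sample times and voltages on a rising edge, and that linear interpolation between samples is adequate; the function names are hypothetical and do not refer to the authors' analysis software.

```python
def crossing_time(times, volts, level):
    """First time the waveform rises through `level` (linear interpolation)."""
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if v0 < level <= v1:
            t0, t1 = times[i], times[i + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never rises through the requested level")


def skew(waveforms, swing):
    """Slowest minus fastest mid-swing crossing among (times, volts) pairs."""
    mids = [crossing_time(t, v, 0.5 * swing) for t, v in waveforms]
    return max(mids) - min(mids)


def slew(times, volts, swing):
    """Time to rise from 10% to 90% of the signal swing."""
    t10 = crossing_time(times, volts, 0.1 * swing)
    t90 = crossing_time(times, volts, 0.9 * swing)
    return t90 - t10


if __name__ == "__main__":
    # Two synthetic rising edges (times in ns, volts in V), one delayed by 1 ns.
    t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    early = [0.0, 0.0, 0.4, 0.8, 1.2, 1.2]
    late = [0.0, 0.0, 0.0, 0.4, 0.8, 1.2]
    print(skew([(t, early), (t, late)], swing=1.2))
    print(slew(t, early, swing=1.2))
```

The synthetic edges are chosen so that the results can be checked by hand: the delay between the two edges gives the skew, and 80% of the 3 ns linear rise gives the slew.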

The waveforms captured in the 3D stack therefore straightforwardly confirm the high stability in power delivery and timing margins for quality signaling. The in-place collection of waveforms primarily helps diagnose physical connections and potentially enhances the stability of operations by tuning parameters or configuring circuits like signal drivers. The selection of driving current in mini I/O circuits (Fig. 8) was demonstrated in [6], where the in-place waveforms were characterized in eye opening as the measure of stable signaling. The completeness of 3D process technologies is also indirectly proven by the stable waveforms.

**Fig. 14** Signal diagnosis with in-place captured waveforms. $V_{\text{DDIO}} = 1.2 \text{ V}$ and $I_{\text{drv}} = 0.5 \text{ mA}$. Distribution of (a) signal skew in each bank and (b) signal slews by occurrence over banks.

#### 4.2 Low Signal Swing Operation

The wide I/O test vehicle explores the energy efficiency of digital data transfer between memory and logic tiers through vertical channels with TSVs. A low-voltage signal swing is pursued for maximizing the energy efficiency of transferring bits. The stable operation at 100 GByte/sec is always confirmed with the evaluation of signal quality by using the in-place waveform capturer as well as the measurements of bit error rates by the on-chip BIST mechanism. The driver strength of the mini I/O circuits is fixed at 0.5 mA in the following experiments.

The power supply voltage of the mini I/O circuits, $V_{DDIO}$, is reduced to 0.75 V from the standard voltage of 1.2 V. The BIST has a separate power domain from the mini I/O circuits and remains supplied at the nominal voltage of 1.2 V. The waveform capturer is also isolated with its own power domain at 3.3 V.

**Fig. 15** Wide I/O operation at $V_{\text{DDIO}} = 0.75$ V and $I_{\text{drv}} = 0.5$ mA. (a) Captured signal waveforms and (b) eye diagram. Power supply and ground voltage waveforms are also shown [8]. (©2013 IEEE)

The signal waveforms and eye diagram for $V_{\text{DDIO}}$ of 0.75 V, the lowest supply voltage allowing error-free operation, are given in Fig. 15. The signal swing reduces to 30% of $V_{\text{DDIO}}$ and the skew enlarges to 2.1 ns. Fail bit maps of Fig. 16 show the number of erroneous bits integrated over the iteration of data transfer by the BIST, in every bin of 4096 bits among the banks. The operation at 100 GByte/sec is stable and error free under $V_{\text{DDIO}}$ of 0.75 V (Fig. 16(a)).

In comparison, the operation becomes totally erroneous when $V_{\text{DDIO}}$ is further lowered to 0.70 V (Fig. 16(b)). The error bits are not localized at all in the space of TSV channels. The error rate exponentially increases with the reduction of $V_{\text{DDIO}}$ as plotted in Fig. 17, for three measured samples. These results indicate that fail operations occur independently among the 4096 bits. It should be noted that no redundant bit is assigned in these experiments.

The energy efficiency per single bit transfer at 100 GByte/s is evaluated in response to $V_{\text{DDIO}}$, as in Fig. 18. The lowest energy per bit is measured as 0.21 pJ/bit with $V_{\text{DDIO}}$ of 0.75 V. The measured energy efficiency is highly competitive with more sophisticated I/O circuits [13], even with the simplest mini I/O circuit consisting of traditional inverter-based buffers.

Fig. 16 Fail bit map measured by BIST. (a) $V_{\text{DDIO}} = 0.75 \text{ V}$ and (b) $V_{\text{DDIO}} = 0.70 \text{ V}$.

**Fig. 17** Error rate versus $V_{\text{DDIO}}$ at $I_{\text{drv}} = 0.5 \text{ mA}$.

**Fig. 18** Measured energy efficiency versus $V_{\text{DDIO}}$ at $I_{\text{drv}} = 0.5 \text{ mA}$.
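The measured 0.21 pJ/bit can be related to the 0.56 mW/Gb/s reported for the 1.2 V supply in Sect. 2 by a first-order estimate. The Python sketch below assumes that the transfer energy is dominated by dynamic switching and scales with the square of the supply voltage; this scaling model and the function names are assumptions for illustration, not the authors' measurement procedure.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ; 1 mW / (1 Gb/s) equals 1 pJ/bit."""
    return power_mw / data_rate_gbps


def scale_with_supply(e_ref_pj, v_ref, v_new):
    """Scale an energy-per-bit figure to another supply voltage.

    ASSUMPTION: purely dynamic (C * V^2) energy with a rail-to-rail swing;
    leakage and any other supply-independent terms are ignored.
    """
    return e_ref_pj * (v_new / v_ref) ** 2


if __name__ == "__main__":
    e_nominal = energy_per_bit_pj(0.56, 1.0)  # 0.56 mW per Gb/s at 1.2 V
    print(e_nominal)
    print(scale_with_supply(e_nominal, 1.2, 0.75))
```

Under this assumption the estimate at 0.75 V is about 0.22 pJ/bit, close to the measured 0.21 pJ/bit.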

The size of dynamic voltage variations is measured by the waveform capturer on power supply and ground wires within the power domain of $V_{DDIO}$, as superimposed in the eye diagrams of Fig. 15(b). The voltage variations, often recognized as power noise, exhibit an amplitude that is negligible in comparison with the signal swing. This is due partly to the low impedance of the stacked power delivery network with distributed TSVs among the banks, and it helps the stable low-voltage operation at a high data rate.

## 5. Conclusion

The in-place waveform capturer was realized in a 3D chip stack and demonstrated the diagnosis of power delivery and signaling within a 4096-bit wide I/O bus structure running at 100 GByte/sec. The diagnosis was based on captured analog waveforms and evaluated the quality of signaling with quantities such as swing, skew, and slew. Power noise was also measured.

In-place waveform capturing functionality strongly helps in-depth characterization of electrical as well as mechanical properties of 3D integration, particularly with assembly technologies under active development. The runtime adjustment of circuit parameters will also rely on the in-place diagnosis and potentially enhance the yields of 3D integration. The deployment of waveform capturing needs further investigation.

## Acknowledgment

This work was partly supported by NEDO in the project of the development of Functionally Innovative 3D-Integrated Circuit (Dream Chip) Technology. The authors would like to thank Shiro Uchiyama, Harufumi Kobayashi, and Atsushi Sakai for scientific discussions.

#### References

- [1] J. Roullard, A. Farcy, S. Capraro, T. Lacrevaz, C. Bermond, G. Houzet, J. Charbonnier, C. Fuchs, C. Ferrandon, P. Leduc, and B. Flechet, "Evaluation of 3D interconnect routing and stacking strategy to optimize high speed signal transmission for memory on logic," Proc. 62nd IEEE Electronic Components and Technology Conference, pp.8–13, May 2012.
- [2] J.-S. Kim, C.S. Oh, H. Lee, D. Lee, H.-R. Hwang, S. Hwang, B. Na, J. Moon, J.-G. Kim, H. Park, J.-W. Ryu, K. Park, S.-K. Kang, S.-Y. Kim, H. Kim, J.-M. Bang, H. Cho, M. Jang, C. Han, J.-B. Lee, K. Kyung, J.-S. Choi, and Y.-H. Jun, "A 1.2 V 12.8 GB/s 2 Gb mobile wide-I/O DRAM with 4 × 128 I/Os using TSV-based stacking," Dig. Tech. Papers, 2011 IEEE Intl. Solid-State Circuits Conference, pp.496–497, Feb. 2011.
- [3] J. Jeddeloh and B. Keeth, "Hybrid memory cube new DRAM architecture increases density and performance," Dig. Tech. Papers, 2012 Symp. on VLSI Technology, pp.87–88, June 2012.
- [4] Y. Araga, M. Nagata, G. Van der Plas, J. Kim, N. Minas, P. Marchal, Y. Travaly, M. Libois, A. La Manna, W. Zhang, and E. Beyne, "In-tier diagnosis of power domains in 3D TSV ICs," Proc. 2012 IEEE International 3D Systems Integration Conference, pp.1–6, Feb. 2012.

- [5] E.J. Marinissen, J. Verbree, and M. Konijnenburg, "A structured and scalable test access architecture for TSV-based 3D stacked IC," Proc. 2010 28th IEEE VLSI Test Symposium, pp.269–274, April 2010.
- [6] S. Takaya, M. Nagata, A. Sakai, T. Kariya, S. Uchiyama, H. Kobayashi, and H. Ikeda, "A 100 GB/s wide I/O with 4096b TSVs through an active silicon interposer with In-place waveform capturing," ISSCC Dig. Tech. Papers, pp.434–435, 2013.
- [7] H. Takatani, Y. Tanaka, Y. Oizono, Y. Nabeshima, T. Okumura, T. Sudo, A. Sakai, S. Uchiyama, and H. Ikeda, "PDN impedance and noise simulation of 3D SiP with a widebus structure," Proc. 62nd IEEE Electronic Components and Technology Conference, pp.673–677, May 2012.
- [8] S. Takaya, M. Nagata, and H. Ikeda, "Very low-voltage swing while high-bandwidth data transmission through 4096 bit TSV," Proc. 2013 IEEE International 3D Systems Integration Conference, pp.III.1.1–III.1.4, Oct. 2013.
- [9] Y. Araga, T. Hashida, and M. Nagata, "An on-chip waveform capturing technique pursuing minimum cost of integration," Proc. Intl. Symp. on Circuits and Systems, pp.3557–3560, May 2010.
- [10] Y. Araga, N. Ueda, Y. Takagi, and M. Nagata, "Performance evaluation of probing front-end circuits for on-chip noise monitoring," IEICE Trans. Fundamentals, vol.E96-A, no.12, pp.2516–2523, Dec. 2013.
- [11] T. Hashida and M. Nagata, "An on-chip waveform capturer and application to diagnosis of power delivery in SoC integration," IEEE J. Solid-State Circuits, vol.46, no.4, pp.789–796, 2011.
- [12] T. Hashida, Y. Araga, and M. Nagata, "A diagnosis testbench of analog IP cores for characterization of substrate coupling strength," IEICE Trans. Electron., vol.E94-C, no.6, pp.1016–1023, June 2011.
- [13] Y. Liu, W. Luk, and D. Friedman, "A compact low-power 3D I/O in 45 nm CMOS," Dig. Tech. Papers, 2012 IEEE Intl. Solid-State Circuits Conference, pp.142–143, Feb. 2012.



Satoshi Takaya received B.E. and M.E. degrees from the Department of Computer and System Engineering, Faculty of Engineering, Kobe University in 2009 and 2011, respectively. He is currently a doctoral course student at Kobe University. His present research focus is design techniques of mixed-signal LSI. He is a student member of IEEE and IEICE.



Hiroaki Ikeda received B.E. and M.E. degrees from Hiroshima University in 1975 and 1977, respectively. He joined NEC in 1977 and started his career as a DRAM designer. In 1999, he moved to Elpida and drove 3D-IC/TSV technology developments from 2004, including NEDO (New Energy and Industrial Technology Development Organization) projects. He was assigned as the general manager of the 3-D integration technology research department of the Association of Super-Advanced Electronics Technologies (ASET) from October 2011. He is currently a visiting professor of the Graduate School of System Informatics, Kobe University, and also the CTO of Napra.



**Makoto Nagata** received the B.S. and M.S. degrees in physics from Gakushuin University, Tokyo, Japan, in 1991 and 1993, respectively, and the Ph.D. in electronics engineering from Hiroshima University, Japan, in 2001. He was a research associate at Hiroshima University, Japan, from 1994 to 2002, and then an associate professor of Kobe University, Japan, from 2002 to 2009. He is currently a professor of the Graduate School of System Informatics, Kobe University. His research interests include design techniques toward high performance mixed analog, RF, and digital VLSI systems with particular emphasis on power/signal/substrate integrity and electromagnetic compatibility, testing and diagnosis, three-dimensional system integration, as well as connectivity and security applications. Dr. Nagata has been a member of a variety of technical program committees of international conferences such as the Symposium on VLSI Circuits (2002–2009), Custom Integrated Circuits Conference (2007–2009), Asian Solid-State Circuits Conference (2005–2009), International Solid-State Circuits Conference (2014–), and many others. He was a technical program chair (2010–2011) and a symposium chair (2012–2013) for the Symposium on VLSI Circuits. He also served as an associate editor of the IEICE Transactions on Electronics (2002–2005).