## A Monolithically-Integrated Chip-to-Chip Optical Link in Bulk CMOS

C. Sun<sup>1</sup>, M. Georgas<sup>1</sup>, J. S. Orcutt<sup>1</sup>, B. R. Moss<sup>1</sup>, Y-H. Chen<sup>1</sup>, J. Shainline<sup>2</sup>, M. Wade<sup>2</sup>, K. Mehta<sup>1</sup>, K. Nammari<sup>2</sup>, E. Timurdogan<sup>1</sup>, D. Miller<sup>3</sup>, O. Tehar-Zahav<sup>3</sup>, Z. Sternberg<sup>3</sup>, J. C. Leu<sup>1</sup>, J. Chong<sup>1</sup>, R. Bafrali<sup>3</sup>, G. Sandhu<sup>3</sup>, M. Watts<sup>1</sup>, R. Meade<sup>3</sup>, M. A. Popović<sup>2</sup>, R. J. Ram<sup>1</sup> and V. Stojanović<sup>1,4</sup>

<sup>1</sup>Massachusetts Institute of Technology, <sup>2</sup>University of Colorado Boulder, <sup>3</sup>Micron Technology, <sup>4</sup>UC Berkeley

Email: sunchen@mit.edu

**Abstract**: A silicon-photonic link is monolithically-integrated in a bulk CMOS process for the first time. Deep-trench isolation enables polySi waveguide integration. PolySi resonant detectors remove the need for Ge integration. A split-diode design enables half-rate receivers, mitigating transistor speed limitations. An on-chip feedback loop locks the resonant defect detector to the laser wavelength, combating thermal upset. The 5m optical link achieves 5Gb/s at 3pJ/b electrical and 13pJ/b optical energy in a 0.18µm (100ps FO4) bulk CMOS memory periphery process.

**Keywords**: Monolithic, optical, DRAM, transceiver, bulk.

Integrated photonic interconnects present a disruptive alternative to electrical I/O for many VLSI applications. To date, silicon photonics has largely been constrained to SOI processes [1,2] that use buried oxide to provide optical mode confinement below the waveguide. Photonic devices previously demonstrated in a bulk DRAM periphery process [3] use solid-phase epitaxy silicon deposition for the waveguides and Ge for the photodetector (PD). However, because making such devices requires design rule violations, successful integration with operational circuits has yet to be demonstrated.
In this paper, we present a 5Gb/s chip-to-chip optical link with monolithically-integrated circuits and photonics in bulk CMOS with no Ge. The 5.5M-transistor technology development platform is fabricated in a modified flash periphery process with 0.18µm transistors and three metal layers. Low-loss waveguides and optical devices are formed using a customized polySi layer on top of 1.2µm thick deep-trench isolation oxide, which provides optical mode confinement and optical isolation from the silicon substrate.

Each chip hosts an array of 35 electro-optic transceiver macros (Fig. 1). The synthesized digital backend interfaces with the custom transmitter (TX) and receiver (RX) through 8-to-2 mux/demux tree SerDes and runs at one-fourth the data clock. The custom TX and RX perform the final 2-to-1 serialization and deserialization, respectively, and interface with a variety of integrated photonic modulators and detectors.

To form a chip-to-chip link, an unmodulated wavelength $\lambda_1$ couples on-chip into a TX macro through a vertical grating coupler. The modulator driver drives a microring modulator that imprints data onto $\lambda_1$, which then couples off-chip into a single-mode fiber bound for an RX macro on the receive chip. A resonant PD and receiver tuned to $\lambda_1$ capture $\lambda_1$. The on-chip backend records the BER *in situ* and exports statistics off-chip.

Fig. 1. Chip overview and components of the chip-to-chip link.

Fig. 2. Modulator driver design, performance and energy cost.

Figure 2 shows a push-pull diode driver with an NMOS pullup on the device anode to limit the forward-bias voltage. The depletion-mode ridge microring modulator is created via a partial polySi etch and doped to form a pn-junction across the ridge. The circuit drives the junction to $-V_{DD}$ for a full-depletion logic 1 and to $V_{REF} - V_T$ for a weak forward-bias logic 0, modulating the depletion region width.
This creates a shift in resonance from the carrier-plasma effect and modulates the input wavelength.

To mitigate the slow speed of the process ($L_{\rm eff} = 220$nm and FO4 ~ 100ps), at the RX we adopt a split-diode technique [4] in which the ring PD is separated into two electrically isolated half-PDs, each connected to one half-rate receiver (Fig. 3). A dummy half-PD and TIA serve as a reference for each sense amplifier (SA), while current and capacitive DACs provide offset compensation and eye-measurement capability. Data rates above 4Gb/s squeeze the SA evaluation time, requiring much larger photocurrents to compensate.

The PIN ridge-waveguide microring PD, with a responsivity of 0.2A/W, utilizes free-carrier generation through sub-bandgap transitions involving defect states in the polySi [5]. The device exhibits 3dB bandwidths of 1.5GHz and 9.7GHz at -1V and -15V biases, respectively.

Fig. 3. Receiver schematic and performance, $V_{PD} = -10$V.

To receive bits, the detector microring's resonance must align with the laser wavelength. Since chip temperatures fluctuate (and laser wavelengths drift), active resonance tuning is essential. We demonstrate an on-chip wavelength-locking circuit that maximizes the receive eye opening. The synthesized receive-side tuning sub-system (Fig. 4), clocked at 1/64th of the data clock, is composed of two optical power meter circuits, a small data path, and a configurable controller, used together to wavelength-lock the ring and stabilize the photocurrent for the receiver. Similar to traditional data-conditioned level-trackers, the power meters use the receiver's offset compensation DACs and an up/down counter in a feedback loop to track photocurrent. During wavelength-locked operation, one power meter takes control of one half-rate receiver, and its counter is conditionally enabled using the data stream from the other half (which receives data normally). A DC wavelength vs.
photocurrent sweep indicates a lock range of ~0.5nm (or 9K in temperature), limited by the heater tuning range. We perform a 2.5Gb/s single-rate transient wavelength-lock experiment by clock-gating the backend of the adjacent wavelength-slices, creating a temperature aggressor that induces a ring temperature change of ~20K/W. The wavelength-locked receiver maintains BER = 0 until t = 65s, when we apply a deliberately large aggression to exceed the lock range. By contrast, an unlocked receiver fails immediately under the temperature perturbations caused by the aggressor. The tuning subsystem consumes 0.43mW at 2.5Gb/s (171fJ/bit) and occupies 0.024mm<sup>2</sup>, excluding the heater $\Delta\Sigma$ driver [6] and the receiver.

Fig. 4. Thermal tuning backend and wavelength-lock demo.

Fig. 5. Link laser power breakdown and BER eye diagrams for the 2Gb/s (no amplifier) and 5Gb/s (with amplifier) 5m link.

Figure 5 (left) shows the optical power breakdown and the link eye diagrams (BER below 10<sup>-10</sup>) for a full-rate 2Gb/s chip-to-chip link over 5m of single-mode fiber. The data rate is limited by the degradation of receiver sensitivity at higher rates and by the maximum output power of the off-chip laser (10mW). While optimized vertical couplers with 3dB loss/coupler and critically coupled PD rings exist elsewhere on the test platform, the sub-optimal vertical couplers (5dB loss/coupler) and over-coupled PD (3dB extinction) placed in the circuit test sites contribute 9dB of extra loss. To overcome this unnecessary loss, we add an optical amplifier between the transmit chip and the receive chip, providing ~8dB of optical gain and enabling a 5Gb/s 5m chip-to-chip link. This configuration is shown in Figure 5 (right). At 2Gb/s the 5m link consumes 4pJ/b electrical and 5pJ/b optical energy, while at 5Gb/s the link consumes 3pJ/b electrical and 13pJ/b optical energy. Placing the optimized couplers and rings into the link would lead to an order of magnitude lower optical energy cost.
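The optical energy and loss figures quoted above can be cross-checked with simple arithmetic. The sketch below is only an illustration under stated assumptions, not the authors' accounting: it assumes optical energy per bit is the optical source power divided by the data rate, that the amplifier can be modeled as multiplying the 10mW laser power by its ~8dB gain, and that three vertical couplers sit in the optical path (into the TX chip, out of the TX chip, into the RX chip). All function and variable names are hypothetical.

```python
def db_to_ratio(db: float) -> float:
    """Convert a power ratio in dB to a linear factor."""
    return 10.0 ** (db / 10.0)


def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """mW / (Gb/s) = 1e-3 W / 1e9 b/s = 1e-12 J/b, i.e. pJ/b."""
    return power_mw / rate_gbps


LASER_MW = 10.0      # maximum off-chip laser power quoted in the text
AMP_GAIN_DB = 8.0    # approximate optical amplifier gain quoted in the text
N_COUPLERS = 3       # ASSUMPTION: couplers traversed between laser and PD

# Optical energy per bit without and with the amplifier (assumed accounting).
optical_2g = energy_per_bit_pj(LASER_MW, 2.0)
optical_5g = energy_per_bit_pj(LASER_MW * db_to_ratio(AMP_GAIN_DB), 5.0)

# Extra loss of the test-site devices relative to the optimized ones:
# (5dB - 3dB) per coupler plus 3dB from the over-coupled PD.
extra_loss_db = N_COUPLERS * (5.0 - 3.0) + 3.0
saving = db_to_ratio(extra_loss_db)

# Tuning subsystem energy per bit, converted from pJ/b to fJ/b.
tuning_fj = energy_per_bit_pj(0.43, 2.5) * 1000.0

print(f"optical energy at 2Gb/s: {optical_2g:.1f} pJ/b")
print(f"optical energy at 5Gb/s: {optical_5g:.1f} pJ/b")
print(f"extra loss: {extra_loss_db:.0f} dB, about {saving:.1f}x in power")
print(f"tuning energy: {tuning_fj:.0f} fJ/b")
```

Under these assumptions the sketch gives 5pJ/b at 2Gb/s and about 12.6pJ/b at 5Gb/s, consistent with the quoted 5pJ/b and 13pJ/b. The 9dB of extra loss corresponds to roughly an 8x power factor, which is in line with the order-of-magnitude saving mentioned in the text. The tuning figure comes out at about 172fJ/b, matching the quoted 171fJ/bit to within rounding of the 0.43mW value.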
To our knowledge, this is the first chip-to-chip monolithically-integrated photonic link in a bulk CMOS process, demonstrating that photonic interconnects need not be confined to niche, high-cost processes. Dense and energy-efficient photonic interconnects are thus feasible even on cost-aware mainstream bulk CMOS platforms.

## Acknowledgments

This work was funded by DARPA POEM award HR0011-11-C-0100 and contract HR0011-11-9-0009, led by Dr. Jagdeep Shah. The views expressed are those of the authors and do not reflect the official policy or position of the DoD or the U.S. Government. We gratefully acknowledge the support of all POEM team members at Micron Technology, Inc., MIT, University of Colorado Boulder, and BWRC at UC Berkeley.

## References

- [1] J. F. Buckwalter, *et al.*, "A Monolithic 25-Gb/s Transceiver with Photonic Ring Modulators and Ge Detectors in a 130-nm CMOS SOI Process," *IEEE JSSC*, vol. 47, no. 6, pp. 1309-1322, June 2012.
- [2] S. Assefa, *et al.*, "Monolithic integration of silicon nanophotonics with CMOS," *IEEE Photonics Conference*, pp. 626-627, Sept. 2012.
- [3] D. J. Shin, *et al.*, "Integration of silicon photonics into DRAM process," *OFC*, pp. 1-3, 17-21 March 2013.
- [4] M. Georgas, *et al.*, "A Monolithically-Integrated Optical Receiver in Standard 45-nm SOI [Invited]," *IEEE JSSC*, vol. 47, no. 7, pp. 1693-1702, July 2012.
- [5] K. Preston, *et al.*, "Waveguide-integrated telecom-wavelength photodiode in deposited silicon," *Optics Letters*, vol. 36, no. 1, pp. 52-54, 2011.
- [6] C. Sun, *et al.*, "Integrated microring tuning in deep-trench bulk CMOS," *IEEE OI Conference*, pp. 54-55, May 2013.