## 28.7 A 20-Gb/s/pin 0.0024-mm² Single-Ended DECS TRX with CDR-less Self-Slicing/Auto-Deserialization to Improve Tolerance on Duty Cycle Error and RX Supply Noise for DCC/CDR-less Short-Reach Memory Interfaces

Jaeyoung Seo<sup>1</sup>, Sooeun Lee<sup>2</sup>, Myungguk Lee<sup>1</sup>, Changjae Moon<sup>1</sup>, Byungsub Kim<sup>1</sup>

<sup>1</sup>Pohang University of Science and Technology, Pohang, Korea <sup>2</sup>Samsung Electronics, Hwaseong, Korea

In massively parallel short-reach (SR) interfaces [2-5], thousands of I/Os communicate through many low-loss parallel interconnects (Fig. 28.7.1). Due to the large number of I/Os, each transceiver (TRX) must fit within a small area and be energy-efficient. One challenge in TRX design is the increasing clocking area and power. Distributing the clock to thousands of I/Os, while satisfying stringent duty-cycle constraints, requires many duty-cycle correction (DCC) and duty-cycle detection (DCD) circuits. For reliable data recovery with a reduced eye opening, RXs also require precise clock and data recovery (CDR) or clock and data alignment (CDA) circuits. The area and power of these circuits also need to be minimized, as they are employed in proportion to the I/O count.

To reduce the area and power for SR memory interfaces, a 20Gb/s/pin single-ended (SE) compact TRX that works without high-speed DCC/DCD or CDR/CDA is proposed (Fig. 28.7.1). The RX can directly recover and deserialize the data from the data-embedded clock signaling (DECS) input without using a CDR or a CDA; this recovery and deserialization will be referred to as *self-slicing* and *auto-deserializing* in this paper. These techniques also improve tolerance to duty-cycle error and RX supply noise (SN); thus, DCC/DCD can be omitted. Since neither DCC/DCD nor CDR/CDA is required in the proposed TRX, its area and power can be significantly reduced.

The DECS signaling method, illustrated in Fig. 28.7.2, embeds data into the voltage level of the clock. To transmit a 1, the TXP voltage is raised above the nominal reference clock voltage; to transmit a 0, it is lowered. Compared to phase-difference modulation (PDM), where data is embedded into the transition edge of the clock [1], DECS is more suitable for high-speed operation: PDM signaling relies on about a 4× higher frequency spectrum than DECS. The proposed DECS TRX architecture is shown in Fig. 28.7.2: inverter-based drivers are used to improve the voltage swing, while reflections are mitigated via proper RX termination. Data lanes transmit SE signals and one clock lane is used to forward the reference clock. Digitally-controlled delay lines (DCDLs) are used to deskew data lanes so that one clock lane can be shared across many data lanes (Fig. 28.7.1). The RX consists of N-type and P-type paths. Each path reacts depending on the input voltage level: the N-type path evaluates the data when the input is low and the P-type path when it is high. The RX does not use a clock input, from either a CDR or a CDA, for slicing or deserialization.
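The signaling idea can be illustrated with a small behavioral sketch. It is not the authors' implementation: the normalized voltage levels (`ck_amp`, `delta`), the function names, and the choice that even-indexed bits ride on the high clock phase are all assumptions made for illustration.

```python
# Behavioral sketch of data-embedded clock signaling (DECS).
# All voltages are hypothetical normalized values, not taken from the paper.

def decs_encode(bits, ck_amp=0.5, delta=0.2):
    """Return (txp, ck): one sample per half clock cycle.

    ck  : forwarded reference clock, alternating +ck_amp / -ck_amp.
    txp : data lane; the clock level is raised by delta for a 1 and
          lowered by delta for a 0 (assumed mapping).
    """
    txp, ck = [], []
    for i, bit in enumerate(bits):
        # Assumption: even-indexed bits use the high clock phase.
        ref = ck_amp if i % 2 == 0 else -ck_amp
        ck.append(ref)
        txp.append(ref + delta if bit else ref - delta)
    return txp, ck


def decs_decode(txp, ck):
    """Recover each bit by comparing the data lane with the reference clock."""
    return [1 if p > c else 0 for p, c in zip(txp, ck)]


bits = [1, 0, 0, 1, 1, 1, 0, 0]
txp, ck = decs_encode(bits)
recovered = decs_decode(txp, ck)
print(recovered)
```

As long as `delta` stays below `ck_amp`, the data lane keeps the polarity of the clock, so it still carries the timing information while its level carries the data.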

Figure 28.7.3 shows the TX schematic and its output waveforms. Disableable inverter banks are used as drivers, enabling driver strength control via static enable bits. The data driver consists of a data modulation (DM) driver and a weak driver. While the weak driver always produces a reduced clock waveform, the DM driver only turns on and increases the amplitude when necessary: when **Deven** and **clock** are high, or when **Dodd** and **clock** are low. Simulation results confirm that increasing the DM driver strength increases the amount of TXP amplitude modulation, while also enlarging the timing margin for self-slicing at the RX. DCDLs are used to correct the timing skew between TXP and CK<sub>TX</sub>. The clock (CK) driver is similar to the weak driver except that it generates a different amplitude for CK<sub>TX</sub>. DCC circuits are only used for testing tolerance to duty-cycle error.
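The DM driver enable condition can be written as a small truth-table sketch. The enable logic follows the sentence above; the amplitude values and the assumption that the CK<sub>TX</sub> amplitude lies between the weak-only and the weak-plus-DM amplitudes are hypothetical, chosen only so that the comparison against the clock yields the transmitted bit.

```python
# Behavioral sketch of the TX data-modulation (DM) driver enable logic.
# Amplitudes are hypothetical normalized values (assumption):
# weak_amp < ck_amp < weak_amp + dm_amp.

def dm_driver_on(clock, d_even, d_odd):
    """DM driver turns on when Deven and clock are high, or Dodd and clock are low."""
    return (clock == 1 and d_even == 1) or (clock == 0 and d_odd == 0)


def txp_level(clock, d_even, d_odd, weak_amp=0.3, dm_amp=0.4):
    """Data-lane level: weak driver always on, DM driver adds amplitude."""
    amp = weak_amp + (dm_amp if dm_driver_on(clock, d_even, d_odd) else 0.0)
    return amp if clock else -amp


def ck_level(clock, ck_amp=0.5):
    """Forwarded clock level with an intermediate amplitude (assumed)."""
    return ck_amp if clock else -ck_amp


for clock in (1, 0):
    for d_even in (0, 1):
        for d_odd in (0, 1):
            sent = d_even if clock else d_odd  # bit carried in this phase
            above = txp_level(clock, d_even, d_odd) > ck_level(clock)
            print(clock, d_even, d_odd, dm_driver_on(clock, d_even, d_odd), sent, above)
```

Under these assumptions, TXP sits above the clock for a 1 and below it for a 0 in both clock phases, and a larger `dm_amp` widens the gap between TXP and the clock when the DM driver is on.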

A detailed RX schematic diagram is shown in Fig. 28.7.4; the RX is very area-efficient, as it consists only of on-die terminations (ODTs), non-clocked self-slicing comparators (SSCs), and dynamic latches (DLs). The N-type SSC evaluates the data when the input signal is low, while the P-type SSC evaluates the data when the input signal is high. Therefore, deserialization of the DECS input is triggered by the DECS input itself, as shown by the simulated waveforms in Fig. 28.7.4. When the input common-mode voltage is high, the output voltages (comp and comp\_b) are pulled low. In this *preparation* phase, the voltage difference between comp and comp\_b is proportional to the differential DECS input voltage. This small voltage difference helps the SSC to quickly split the outputs during the *evaluation* phase. When the input common-mode voltage becomes low, the pull-up devices start pulling comp and comp\_b up. Due to regeneration by the cross-coupled inverters, only one of comp or comp\_b becomes high, depending on the DECS input. Through the connected pull-down NMOS input of the following DL, the results are updated in the DL. Similarly, the P-type SSC and the P-type DL make a decision when the DECS input is high.

The proposed self-slicing RX is less sensitive to RX SN than a typical RX using conventional comparators. Since the clock and the data are both embedded in the SSC DECS input, the resulting clock and data paths are equally matched and the impact of the RX SN on the RX performance is minimized. In a conventional RX, the sampling clock is generated by a CDR; hence, the clock signal path is not well matched to the data path, and clock jitter caused by SN significantly degrades RX performance. In addition, the SSC has good power supply rejection, since a reference clock is used as an input instead of a reference voltage. Similar to differential signaling, DECS helps subtract the common-mode noise. In a conventional SE comparator, by contrast, supply-induced reference-voltage noise further degrades performance.
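A behavioral sketch of self-slicing and auto-deserialization, under stated assumptions, is given below. It abstracts the SSCs and DLs into a comparison and a routing rule; the voltage levels, the common-mode threshold, the noise waveform, and the mapping of the P-type path to the even stream are hypothetical and are not taken from the schematic.

```python
import math

# Behavioral sketch of self-slicing / auto-deserialization (not a circuit model).
# All levels are hypothetical normalized values.

def make_decs(d_even, d_odd, ck_amp=0.5, delta=0.2):
    """Build (txp, ck) samples: even bits on the high phase, odd bits on the low phase."""
    txp, ck = [], []
    for e, o in zip(d_even, d_odd):
        for clk_high, bit in ((True, e), (False, o)):
            ref = ck_amp if clk_high else -ck_amp
            ck.append(ref)
            txp.append(ref + delta if bit else ref - delta)
    return txp, ck


def self_slice(txp, ck, threshold=0.0):
    """Slice each sample against the clock and route it by input common mode.

    High common mode -> P-type path (assumed to give the even stream);
    low common mode  -> N-type path (assumed to give the odd stream).
    No separate sampling clock is used.
    """
    even, odd = [], []
    for p, c in zip(txp, ck):
        bit = 1 if p > c else 0
        if (p + c) / 2 > threshold:
            even.append(bit)
        else:
            odd.append(bit)
    return even, odd


d_even = [1, 0, 1, 1, 0]
d_odd = [0, 0, 1, 0, 1]
txp, ck = make_decs(d_even, d_odd)

# Noise common to both inputs (hypothetical waveform) cancels in the comparison.
noise = [0.15 * math.sin(0.7 * k) for k in range(len(txp))]
txp_n = [p + n for p, n in zip(txp, noise)]
ck_n = [c + n for c, n in zip(ck, noise)]

print(self_slice(txp, ck))
print(self_slice(txp_n, ck_n))
```

In this sketch, the same decisions are obtained with and without the common noise term, because the decision depends only on the difference between the two inputs; this mirrors the common-mode subtraction argument in the text, without modeling jitter or comparator offset.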

The DECS TRX is fabricated, with test-support blocks, in a 28nm CMOS LPP technology. An external RX clock is used to measure bathtub curves after auto-deserialization by the RX. A 1-mm 50-Ω on-chip transmission line is used to emulate a low-loss SR channel (Fig. 28.7.5): the transmission lines are 4µm wide and spaced 12µm apart. Ground shields are placed between the data and clock lanes to reduce crosstalk. The measured channel loss is -2.5dB at Nyquist. The measured maximum data rate is 20Gb/s/pin, with a 0.99UI eye (the widest compared to prior state-of-the-art [2-5]) at a 10<sup>-10</sup> BER (Figs. 28.7.5 and 28.7.6). Owing to the wide data eye and rail-to-rail signal swing after self-slicing/auto-deserialization, the timing requirements for an RX clock used to fetch the deserialized data are much more relaxed than for a usual RX. The measured energy efficiency is 1.24pJ/b at 20Gb/s/pin. The proposed TRX can tolerate a 40-60% clock duty-cycle distortion while maintaining a 0.88UI eye width at a 10<sup>-9</sup> BER (Fig. 28.7.5), illustrating that the TRX can work without DCC/DCD. SN is injected on the PCB to test RX SN sensitivity. A 0.88UI eye width is measured with a 50MHz 300mV<sub>p-p</sub> SN, as measured on the PCB (Fig. 28.7.5); in comparison, a 0.63UI-wide eye was achieved by noise-immunity coding in [5].
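For orientation, the measured figures can be converted into absolute units with a short calculation. The inputs are the numbers quoted above; the helper names are illustrative, and the power figure simply multiplies energy per bit by data rate, assuming the 1.24pJ/b value covers the whole TRX at 20Gb/s/pin.

```python
# Unit conversions for the measured results (inputs quoted from the text).

DATA_RATE_BPS = 20e9          # 20 Gb/s/pin
ENERGY_J_PER_BIT = 1.24e-12   # 1.24 pJ/b


def ui_ps(rate_bps):
    """Unit interval in picoseconds for a given bit rate."""
    return 1e12 / rate_bps


def eye_width_ps(eye_ui, rate_bps):
    """Eye width in picoseconds for an eye opening given in UI."""
    return eye_ui * ui_ps(rate_bps)


def power_mw(energy_j_per_bit, rate_bps):
    """Average power per pin in milliwatts (energy per bit times bit rate)."""
    return energy_j_per_bit * rate_bps * 1e3


print(ui_ps(DATA_RATE_BPS))                              # unit interval, about 50 ps
print(eye_width_ps(0.99, DATA_RATE_BPS))                 # nominal eye, about 49.5 ps
print(eye_width_ps(0.88, DATA_RATE_BPS))                 # with duty-cycle error or SN, about 44 ps
print(power_mw(ENERGY_J_PER_BIT, DATA_RATE_BPS))         # about 24.8 mW per pin
```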

The TRX occupies 0.0024mm<sup>2</sup>, excluding test-support blocks, which is the smallest area among TRXs for SR interfaces [2-5]. The RX core, excluding ODTs, occupies only 58µm<sup>2</sup>, which is only 5% of the total RX area. Compared to prior state-of-the-art [2], [3], and [4], the proposed DCC/CDR-less TRX improves the normalized area cost by 4.2×, 619×, and 1502×, respectively. Although the data rate is lower than in [3, 4], the proposed TRX improves energy efficiency by 1.37× compared to [3, 4]. To achieve 40Gb/s/pin, [3] used an injection-locked phase-interpolator (IJL-PI), which increases the power and area cost. Likewise, the PAM-4 TRX [4] consumes more power than the proposed TRX to achieve a 112Gb/s/pin data rate with enormous hardware costs (CDR, CTLE, and FFE). Whereas the prior art [2-5] required expensive high-speed clocking circuits (DCC [2-5], delay line [2], IJL-PI [3], digital CDR [4], and de-skew [5] circuits), these circuits are not required by the proposed DECS TRX; thus, the smallest area and decent energy efficiency can be achieved.
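The comparison ratios above can be checked with a short calculation. The paper only states that area is "normalized with technology"; the sketch below assumes this means dividing the area by the square of the technology node, an assumption that happens to reproduce the reported ratios. The input values are taken from the comparison table in Fig. 28.7.6.

```python
# Sanity check of the area and energy comparison (values from Fig. 28.7.6).
# Assumption: "normalized with technology" = area / (technology node)^2.

designs = {
    "[2]": {"node_nm": 16, "area_mm2": 0.003338, "pj_per_b": 1.17},
    "[3]": {"node_nm": 7, "area_mm2": 0.094, "pj_per_b": 1.7},
    "[4]": {"node_nm": 7, "area_mm2": 0.228, "pj_per_b": 1.7},
    "this work": {"node_nm": 28, "area_mm2": 0.002428, "pj_per_b": 1.24},
}


def normalized_area(design):
    """Area divided by the squared feature size (assumed normalization)."""
    return design["area_mm2"] / design["node_nm"] ** 2


def area_ratio(name, ref="this work"):
    """How many times larger the normalized area is than the reference design."""
    return normalized_area(designs[name]) / normalized_area(designs[ref])


for name in ("[2]", "[3]", "[4]"):
    print(name, round(area_ratio(name), 1))

# Energy-efficiency gain over [3] and [4], and RX core share of the RX area.
energy_gain = designs["[3]"]["pj_per_b"] / designs["this work"]["pj_per_b"]
rx_core_share = 0.000058 / (0.001134 + 0.000058)
print(round(energy_gain, 2), round(100 * rx_core_share, 1))
```

With these inputs the ratios come out near 4.2, 619, and 1502, the energy gain near 1.37, and the RX core share just under 5%, in line with the figures quoted in the text.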

## Acknowledgement:

This work was supported by the Commercializations Promotion Agency for R&D Outcomes (COMPA) grant funded by the Korea government (MSIT) (No. 20211100), by Samsung Electronics Co., Ltd (IO201211-08055-01), by BK21 FOUR project of NRF for the Department of Electrical Engineering, POSTECH, and by IDEC.

## References:

[1] S. Lee et al., "A 7.8Gb/s/pin 1.96pJ/b Compact Single-Ended TRX and CDR with Phase-Difference Modulation for Highly Reflective Memory Interfaces," *ISSCC*, pp. 272-273, 2018.

[2] J. M. Wilson et al., "A 1.17pJ/b 25Gb/s/pin Ground-Referenced Single-Ended Serial Link for Off- and On-Package Communication in 16nm CMOS Using a Process- and Temperature-Adaptive Voltage Regulator," *ISSCC*, pp. 276-277, 2018.

[3] K. McCollough et al., "A 480Gb/s/mm 1.7pJ/b Short-Reach Wireline Transceiver Using Single-Ended NRZ for Die-to-Die Applications," *ISSCC*, pp. 184-185, 2021.

[4] R. Yousry et al., "A 1.7pJ/b 112Gb/s XSR Transceiver for Intra-Package Communication in 7nm FinFET Technology," *ISSCC*, pp. 180-181, 2021.

[5] Y.-Y. Hsu et al., "A 7nm 0.46pJ/bit 20Gbps with BER 1E-25 Die-to-Die Link Using Minimum Intrinsic Auto Alignment and Noise-Immunity Encode," *IEEE Symp. VLSI Circuits*, June 2021.




Figure 28.7.1: Application scenario for the proposed transceiver.



Figure 28.7.2: Transceiver architecture and signaling method.



Figure 28.7.3: Transmitter schematic diagram (left) and transmitter output eye diagrams (right).



Figure 28.7.4: Receiver schematic (top) and its timing diagrams (bottom).

Panels of Figure 28.7.5: channel cross-section (GND / DATA / GND / CK / GND; width W = 4µm, space S = 12µm, length L = 1mm); BER bathtub curve at 20Gb/s with PRBS31, V<sub>DD_TX</sub> = 1.1V, V<sub>DD_RX</sub> = 1V; BER bathtub curves at 20Gb/s with various duty cycles; BER bathtub curve with an injected 50-MHz RX supply noise (0.88UI at BER < 10<sup>-9</sup>). Horizontal axis of each bathtub curve: clock phase (UI), from -1 to 1.

|                                                      | ISSCC 2018 [2]          | ISSCC 2021 [3]       | ISSCC 2021 [4]       | VLSI 2021 [5]         | This work                               |
|------------------------------------------------------|-------------------------|----------------------|----------------------|-----------------------|-----------------------------------------|
| Technology (nm)                                      | 16 FinFET               | 7 FinFET             | 7 FinFET             | 7 FinFET              | 28 LPP                                  |
| Data rate/pin (Gb/s/pin)                             | 25                      | 40                   | 112                  | 20                    | 20                                      |
| Signaling method                                     | GRS                     | NRZ                  | PAM-4                | NRZ                   | DECS                                    |
| CDR                                                  | Delay line              | IJL-PI               | Digital CDR          | Deskew loop           | Not required (auto-deserialized)        |
| Duty cycle correction                                | Required                | Required             | Required             | Required              | Not required (for 40% - 60%)            |
| Supply noise immunity                                | N/A                     | N/A                  | N/A                  | Noise immunity coding | 50-MHz 300mV<sub>p-p</sub> on PCB       |
| Channel loss (dB)                                    | -4                      | -8                   | -3.7                 | -3                    | -2.5                                    |
| Horizontal eye size (UI)                             | 0.77                    | 0.55                 | 0.14                 | 0.63                  | 0.99                                    |
| Energy efficiency, TX (pJ/b)                         | 0.449                   | N/A                  | N/A                  | N/A                   | 1.09                                    |
| Energy efficiency, RX (pJ/b)                         | 0.108                   | N/A                  | N/A                  | N/A                   | 0.15                                    |
| Energy efficiency, others (pJ/b)                     | 0.617                   | N/A                  | N/A                  | N/A                   | N/A                                     |
| Energy efficiency, total (pJ/b)                      | 1.17                    | 1.7                  | 1.7                  | 0.46                  | 1.24                                    |
| Area, TX (mm²)                                       | 0.0014<sup>(2)</sup>    | 0.047<sup>(3)</sup>  | N/A                  | N/A                   | 0.001236                                |
| Area, TX, normalized with technology<sup>(1)</sup>   | 3.5                     | 680                  | N/A                  | N/A                   | 1                                       |
| Area, RX (mm²)                                       | 0.001938<sup>(2)</sup>  | 0.047<sup>(3)</sup>  | N/A                  | N/A                   | ODT: 0.001134<br>Others: 0.000058       |
| Area, RX, normalized with technology<sup>(1)</sup>   | 5                       | 631                  | N/A                  | N/A                   | 1                                       |
| Area, total (mm²)                                    | 0.003338<sup>(2)</sup>  | 0.094<sup>(3)</sup>  | 0.228<sup>(4)</sup>  | N/A                   | 0.002428                                |
| Area, total, normalized with technology<sup>(1)</sup>| 4.2                     | 619                  | 1502                 | N/A                   | 1                                       |

<sup>(1)</sup> Area is normalized with technology.

<sup>(2)</sup> Area of the I/O brick is divided by the number of the lanes.

<sup>(3)</sup> Area is divided by the number of the lanes.

Figure 28.7.5: Channel structure (top left) and on-chip measured BER bathtub curves.

Figure 28.7.6: Performance summary and comparison table.

## **ISSCC 2022 PAPER CONTINUATIONS**



Figure 28.7.7: Chip micrograph.