# 3 × 16 Gb/s Compact Single-Ended PAM4 Transmitters With Inverter-Based Crosstalk Compensation for Memory Interfaces

Changjae Moon, Graduate Student Member, IEEE, Iksu Jang, Graduate Student Member, IEEE, Sungmin Lim, Graduate Student Member, IEEE, Yaejoon Huh, Graduate Student Member, IEEE, and Byungsub Kim, Senior Member, IEEE

Abstract—A four-level pulse-amplitude modulation (PAM4) transmitter (TX) with crosstalk compensation (XTC) is proposed for short-reach memory interfaces. Simple encoders and transition detectors detect the data pattern causing crosstalk and appropriately activate inverter-based XTC taps. With gain and delay control of XTC, compensation error due to the mismatch between the victim and aggressor channels was minimized. The TX was fabricated in 28 nm LP CMOS and tested at 16 Gb/s. With XTC, the eye height and width were improved by 203% and 396%, respectively. Because it uses area-efficient inverter-based XTC taps, the TX occupies only 0.0067 mm², achieving an area per data rate of 0.00042 mm²/Gbps.

Index Terms—Single-ended, PAM4 transmitter, crosstalk, far-end crosstalk (FEXT) cancellation, memory interface.

## I. INTRODUCTION

WITH the increasing demand for high-speed, low-latency, and low-power memory access, single-ended four-level pulse-amplitude modulation (PAM4) memory interfaces utilizing massively parallel short-reach interconnects such as interposers, high bandwidth memory (HBM), etc. are

Manuscript received 1 May 2024; revised 8 July 2024; accepted 2 August 2024. Date of publication 8 August 2024; date of current version 26 November 2024. This work was supported in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) Grant funded by the Korea Government (MSIT) under Grant 2022-0-01171; in part by the National Research and Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT under Grant 2020M3H2A107804514; in part by the Next-Generation Intelligence Semiconductor Research and Development Program through the NRF funded by MSIT under Grant RS-2023-00258227; in part by the BK21 FOUR Project of NRF for the Department of EE, POSTECH; and in part by Samsung Electronics Company Ltd. under Grant IO201211-08055-01. This brief was recommended by Associate Editor L. Yang. (Corresponding author: Byungsub Kim.)

Changjae Moon, Iksu Jang, Sungmin Lim, and Yaejoon Huh are with the Department of Electrical Engineering, Pohang University of Science and Technology, Pohang 37673, South Korea.

Byungsub Kim is with the Department of Electrical Engineering, the Department of Convergence IT Engineering, the Department of Semiconductor Engineering, and the Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang 37673, South Korea, and also with the Institute for Convergence Research and Education in Advanced Technology, Yonsei University, Seoul 03722, South Korea (e-mail: byungsub@postech.ac.kr).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSII.2024.3440999.

Digital Object Identifier 10.1109/TCSII.2024.3440999



Fig. 1. PAM4 crosstalk compensation.

becoming attractive [1]. Because many short interconnects are densely placed in parallel in such applications, the channel loss is typically small, while the far-end crosstalk (FEXT) is large. By taking advantage of the small channel loss, PAM4 signaling can double data rate per pin at the cost of reduced eye opening compared with non-return-to-zero (NRZ) signaling.

However, the large FEXT is more problematic in PAM4 than in NRZ because the eye height is less than 1/3 of NRZ's while the peak crosstalk amplitude is the same as NRZ's (Fig. 1). In addition, active cancellation of PAM4 FEXT is more difficult than that of NRZ FEXT because the PAM4 FEXT amplitude changes dynamically with the data pattern (Fig. 1). Furthermore, PAM4 FEXT compensation requires more accurate matching of driver strengths and interconnect skews between the victim and the aggressors than NRZ FEXT compensation [2], [3] (Fig. 1). Because variations in driver strengths and interconnect delays cause larger mismatches between the crosstalk compensation (XTC) signal and the FEXT in PAM4 signaling than in NRZ signaling, the residual FEXT can be large enough to close the eye even after crosstalk compensation (Fig. 1). In addition, the PAM4 FEXT compensation circuit must fit in a tight area because numerous inputs/outputs (I/Os) must be integrated in a limited area in such massively parallel short-reach memory interface applications. Together, these challenges and requirements make it difficult to design a single-ended PAM4 interface circuit for massively parallel short-reach memory interfaces.

At the receiver (RX), the XTC [2], [3] uses high-pass RC (resistor-capacitor) filters to cancel NRZ FEXT. However, due to the connection of the termination resistor and the RC high-pass filter, the RX input impedance does not match $50~\Omega$, causing signal integrity problems. Moreover, the RC filter resistor introduces additional parasitic capacitance, which degrades both the bandwidth of the high-pass filter and the area efficiency. Reference [2] demonstrated delay adjustment for NRZ FEXT cancellation by RC-delay control to compensate for delay mismatches between the victim and aggressors. Reference [3] demonstrated by simulation that a decision-feedback crosstalk canceller (DFXC) reduces the sensitivity to the delay mismatches. However, these techniques were introduced only for NRZ FEXT cancellation [2], [3].

At the transmitter (TX), an XTC signal is added to the victim signal during the transition time of the aggressor signal to reduce the FEXT [5], [6], [7], [8]. Also, [4] employed capacitive coupling to eliminate crosstalk-induced jitter (CIJ) because its effect opposes that of inductive coupling. However, [4], [6], [7] used current-mode logic (CML) drivers, which are power-hungry. In [5], the XTC utilized de-emphasis source-series termination (SST) drivers, which decreases the output swing. Reference [7] utilized capacitive-peaking XTC. However, the capacitors occupy additional TX area [4], [7]. In [8], a full-rate clock is used to generate a return-to-zero (RZ) transition at the pulse generator input to produce narrow XTC pulses; thus, [8] is not efficient for reaching high data rates. In addition, none of these techniques [4], [5], [6], [7], [8] demonstrated delay adjustment circuits for FEXT cancellation, although delay matching is critical in compensating for a PAM4 FEXT.

We present an area-efficient single-ended PAM4 TX with inverter-based XTC for short-reach memory interfaces. Each TX consists of one main tap and two XTC taps to cancel FEXT signals from two aggressors. For the precise crosstalk cancellation required in PAM4 signaling, each XTC tap employs accurate delay and gain control circuits. Because compact inverter-based XTC taps are utilized, our TX occupies only 0.0067 mm<sup>2</sup>, the smallest area among the compared prior designs.

The rest of this brief is organized as follows. Section II describes the circuit design of the proposed PAM4 XTC TX. Section III shows the experimental results and comparison with the prior arts. Section IV provides the conclusion.

## II. TRANSMITTER DESIGN

To verify the proposed concept, three TXs were designed for three single-ended interconnects (Fig. 2). Each TX consists of one main tap and two XTC taps as well as clock circuits. The main tap is composed of 11 LSB segments and 20 MSB segments, which take inputs of 4 LSB bits and 4 MSB bits, respectively, at every 1/4x clock cycle (Fig. 2). Because all segment designs are identical, the main tap produces the PAM4 signal through the 1:2 strength ratio between the LSB and MSB segments. Each segment consists of a 4:2 MUX, a 2:1 MUX, a pre-driver, and an SST driver. The quarter-rate input 4 bits are serialized to a full-rate bit stream by 4:1 serialization of the two MUXs, and then fed to the pre-driver followed by the



Fig. 2. The schematic diagram of the transmitter and the S-parameters of the interconnects.

SST driver. The strength of the SST driver can be statically controlled by 4-bit transistor banks. One additional segment is assigned for LSB to improve the ratio of level mismatches (RLM). Because the loss of the short-reach interconnect is small (Fig. 2), equalization was not employed.
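The two-stage serialization of a segment can be illustrated with a short behavioral sketch. The bit ordering inside the MUXs is not specified in the text, so the interleaving below is an assumption, the function names are hypothetical, and the symbol mapping `2 * MSB + LSB` is the ideal 1:2 weighting that ignores the additional LSB segment and any driver nonlinearity.

```python
def mux_4to2(word):
    """Split one quarter-rate 4-bit word into two half-rate 2-bit streams.

    Assumed ordering: bits 0 and 2 form the even stream, bits 1 and 3 the odd stream.
    """
    b0, b1, b2, b3 = word
    return [b0, b2], [b1, b3]


def mux_2to1(even, odd):
    """Interleave two half-rate streams into one full-rate stream."""
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


def serialize_4to1(words):
    """Serialize a list of quarter-rate 4-bit words into a full-rate bit list."""
    stream = []
    for word in words:
        even, odd = mux_4to2(word)
        stream.extend(mux_2to1(even, odd))
    return stream


def pam4_symbols(msb_words, lsb_words):
    """Combine the serialized MSB and LSB streams into ideal PAM4 symbols 0..3."""
    msb = serialize_4to1(msb_words)
    lsb = serialize_4to1(lsb_words)
    return [2 * m + l for m, l in zip(msb, lsb)]


if __name__ == "__main__":
    print(pam4_symbols([[1, 0, 1, 0]], [[1, 1, 0, 0]]))
```

With this assumed ordering, each 4-bit word leaves the serializer in its original bit order, one bit per unit interval.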

An inverter-based XTC design is proposed to cancel a PAM4 FEXT (Fig. 2). Because a PAM4 FEXT is a narrow pulse whose amplitude differs from that of an NRZ FEXT (Fig. 1), we adopted the narrow pulse generation method from the prior XTC for an NRZ FEXT [8] and modified it for faster PAM4 FEXT cancellation. In [8], high-speed RZ control signals were generated for XTC operation at 4 Gb/s. Because the pulse widths of these RZ signals must be narrower than 1 UI, producing such narrow pulses is not energy-efficient at our target speed of 16 Gb/s. Therefore, our design produces high-speed NRZ control signals instead of RZ control signals. The proposed XTC tap consists of an encoder and four identical XTC segments (Fig. 2). Each segment is composed of two 4:2 MUXs, two 2:1 MUXs, one rising transition detector, one falling transition detector, and a bank of inverter-based XTC drivers. Because a PAM4 FEXT can take seven possible voltage values (-3XT, -2XT, -XT, 0, XT, 2XT, 3XT) depending on the aggressor data pattern, the four XTC segments are controlled differently to produce the opposite pulse according to the encoding table (Fig. 3). The input quarter-rate 8 bits are encoded into 32 quarter-rate control bits for the XTC segments. Each XTC segment has an input of quarter-rate 8 bits (4 bits for pull-up and 4 bits for pull-down). The four MUXs serialize them to feed two full-rate bits (INR and INF) to the rising and falling transition detectors, respectively. The Boolean expressions of these inputs are summarized for each XTC segment in Fig. 3. For faster speed, energy efficiency, and design simplicity, a narrow XTC pulse is generated by the NRZ-based transition detectors instead of the RZ-based pulse generators [8] while four XTC segments



Fig. 3. Encoding table for XTC segments, and the Boolean expressions of the encoded input (INR, INF) to the four XTC segments (SEG1, SEG2, SEG3, and SEG4).



Fig. 4. (a) A schematic diagram of the DCDL and (b) delay increments of the DCDL versus fine and coarse digital codes under different corners and temperatures.

are utilized. To achieve energy-efficient operation at a higher speed, we employ a half-rate clock instead of a full-rate clock; the encoder guarantees the necessary NRZ transitions at the inputs of the transition detectors for narrow pulse generation (Fig. 2). The pulse width is also statically controlled for precise FEXT cancellation at higher speed (Fig. 2). The XTC driver is a bank of four binary-weighted inverters with foot switches that allow driver-strength control. To minimize the timing error, no buffer is inserted between a transition detector and an XTC driver. For the delay control of the XTC signal, the last re-timing clock of the XTC tap can be adjusted by the digitally controlled delay line (DCDL) (Fig. 2). Fig. 4 presents a schematic diagram of the DCDL and delay increments of the DCDL under different corners, supply voltages, and temperatures. The DCDL consists of three inverters with MOS capacitor banks inserted between the inverter stages. The delay is coarsely controlled by 3 binary bits and finely controlled by 4 binary bits.
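The seven possible FEXT values follow directly from the level change of a PAM4 aggressor between consecutive symbols. The sketch below is a behavioral illustration only: it does not reproduce the encoding table or Boolean expressions of Fig. 3, the function names are hypothetical, and the sign convention (FEXT polarity equal to the sign of the aggressor level change) is an assumption, since the actual polarity depends on the coupling of the channel.

```python
def fext_units(prev_symbol, next_symbol):
    """Aggressor level change in units of XT (one LSB step); result is in -3..3."""
    for s in (prev_symbol, next_symbol):
        if s not in (0, 1, 2, 3):
            raise ValueError("PAM4 symbols must be 0, 1, 2, or 3")
    return next_symbol - prev_symbol


def xtc_units(prev_symbol, next_symbol):
    """Compensation pulse for the victim: the opposite of the assumed FEXT pulse."""
    return -fext_units(prev_symbol, next_symbol)


def xtc_sequence(aggressor_symbols):
    """Compensation value for every symbol-to-symbol transition of one aggressor."""
    return [xtc_units(a, b) for a, b in zip(aggressor_symbols, aggressor_symbols[1:])]


# Enumerate all 16 symbol transitions and collect the distinct FEXT values.
all_levels = sorted({fext_units(p, n) for p in range(4) for n in range(4)})

if __name__ == "__main__":
    print(all_levels)
    print(xtc_sequence([0, 3, 1, 1, 2]))
```

Enumerating the 16 transitions yields exactly seven distinct values, which is why a single fixed-amplitude NRZ-style compensation pulse is not sufficient for PAM4.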

## III. MEASUREMENT RESULTS

The TXs were fabricated in 28 nm LP CMOS technology. Fig. 5 shows the TX's die micrograph and the power breakdown. The TX consumes 1.6 pJ/bit/lane at the maximum speed of 16 Gb/s. The XTC circuit accounts for 10.8% of the TX power consumption. Each TX occupies only 0.0067 mm<sup>2</sup>, the smallest area among similar prior designs, and was tested via 4-cm PCB traces with a PRBS31 pattern. The channel spacing and width are 602 µm and 430 µm, respectively. The measured S-parameters of the signal (S52) and FEXT (S51)





Fig. 5. (a) A chip microphotograph and (b) the power breakdown of the proposed PAM4 TX with XTC taps.



Fig. 6. A measurement setup.



Fig. 7. A measured FEXT eye diagram when two TXs at the aggressor channels generate 16 Gb/s PAM4 signals.

paths of the PCB traces are -1.1 dB and -18 dB, respectively, at 4 GHz (Fig. 2). Because the parasitic capacitances from PADs, ESDs, solder bumps, etc. are included in this measurement, the overall channel characteristics would be worse than the S-parameters in Fig. 2. A test environment of the TX is shown in Fig. 6. The two aggressors are located on either side of the victim channel (Fig. 6). Fig. 7 illustrates the measured FEXT eye diagram when two TXs at the adjacent aggressor channels generate 16 Gb/s PAM4 signals. Although the FEXT signal attenuation of the channel is -18 dB, the measured peak-to-peak FEXT of 234 mV indicates that two aggressors transmitting PRBS31 patterns can cause significant crosstalk interference when the crosstalk signals accumulate. This peak-to-peak FEXT of 234 mV is the largest among the prior works that clearly reported FEXT voltage amplitudes. Because the theoretical eye opening without inter-symbol interference (ISI) is only about 166.7 mV in PAM4 signaling, this level of FEXT can completely close the eye diagram. Fig. 8(a) and (b) present the measured FEXT pulses without and with the XTC, respectively, when the TX at the aggressor channel generates a single-bit pulse at a symbol



Fig. 8. Measured single-bit responses at the aggressor and FEXT pulses at the victim (a) without and (b) with XTC.



Fig. 9. Measured PAM4 TX eye diagrams with and without XTC taps, delay control, and aggressors.

rate of 8 Gsymbols/s. Without the XTC, the peak-to-peak voltage of the FEXT pulse caused by a single-bit aggressor pulse was measured to be 108.9 mV. However, when the XTC gain is precisely adjusted, the peak-to-peak voltage of the FEXT is reduced to 19.4 mV, an 82% reduction.
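The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. In the sketch below, the 500 mV full swing is an assumption inferred from the stated 166.7 mV ideal eye (three equal PAM4 eyes); it is not given explicitly in the text. All other numbers are the measured values reported above.

```python
def db_to_amplitude_ratio(db):
    """Convert a voltage gain in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


ASSUMED_SWING_MV = 500.0  # assumption: 3 x 166.7 mV, not stated in the text

# Ideal PAM4 eye height without ISI: the swing is divided into three equal eyes.
ideal_eye_mv = ASSUMED_SWING_MV / 3

# Linear coupling that corresponds to the -18 dB FEXT path at 4 GHz.
fext_ratio = db_to_amplitude_ratio(-18.0)

# Reduction of the single-pulse FEXT from 108.9 mV to 19.4 mV with XTC.
fext_reduction = 1 - 19.4 / 108.9

# Accumulated peak-to-peak FEXT (234 mV) relative to the ideal eye height.
fext_to_eye = 234.0 / ideal_eye_mv

if __name__ == "__main__":
    print(f"ideal eye      : {ideal_eye_mv:.1f} mV")
    print(f"-18 dB coupling: {fext_ratio:.4f}")
    print(f"FEXT reduction : {fext_reduction:.1%}")
    print(f"FEXT / eye     : {fext_to_eye:.2f}")
```

A FEXT-to-eye ratio above 1 is consistent with the statement that the accumulated crosstalk can close the eye completely.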

The far-end eye diagram of the victim was measured with a high-speed oscilloscope in various conditions, and we report the smallest eye opening among the three PAM4 eyes in this brief. In our proof-of-concept design, the XTC driver strengths for gain control and the DCDLs for timing control are manually controlled without any adaptive algorithm. The TX achieved a maximum data rate of 16 Gb/s, an eye height of 40.3 mV, and an eye width of 0.352 UI while canceling FEXTs from the two neighboring aggressors (Fig. 9). However, the upper and middle eye heights are smaller than the bottom eye height due to ISI caused by parasitic capacitance.

When the XTC taps were turned off, the eye height and width were reduced to 13.3 mV and 0.071 UI, respectively. This result shows that the proposed crosstalk compensation improves the eye height and width by 203% and 396%, respectively. Without the aggressors, the eye height and width were measured to be 60.2 mV and 0.376 UI, respectively. We emulated the delay-mismatch scenario to demonstrate the importance of delay matching for FEXT cancellation in practical applications where the lengths of many interconnects can hardly be identical due to differences in routing paths caused by various practical constraints such as different bump



Fig. 10. Measured PAM4 TX eye diagrams of the victim and two aggressors without XTC taps and delay control.

locations and different signaling layers. In the worst delay-mismatch scenario, the skew mismatches between the victim and the two aggressors were adjusted for the smallest eye. When the XTC taps were turned off in the worst delay-mismatch scenario, the eye diagram was completely closed (Fig. 9). Even though the gains of the XTC taps were adjusted for the best eye opening in the worst delay scenario, the eye diagram was almost closed (Fig. 9). This result clearly shows that the eye can be almost closed no matter how precisely we control the XTC gains unless the delay mismatches between the victim and aggressors are appropriately adjusted. However, the eye diagram of the aggressors was open because the FEXT magnitude between the two outermost channels is -30.9 dB, which is smaller than the FEXT magnitude of -18 dB between the closest channels (Fig. 10).
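The observation that gain control alone cannot recover the eye under delay mismatch can be illustrated with a simple numerical model. The sketch below is not the measurement procedure of this brief: the Gaussian pulse shape, the 25 ps width, and the use of 108.9 mV as the pulse amplitude are all assumptions chosen only to give an illustrative scale, and the function names are hypothetical. The unit interval of 125 ps follows from the 8 Gsymbols/s symbol rate.

```python
import math

UI_PS = 125.0  # one unit interval at 8 Gsymbols/s


def pulse_mv(t_ps, amplitude_mv, sigma_ps):
    """Gaussian stand-in for a FEXT pulse (an assumed shape, not measured data)."""
    return amplitude_mv * math.exp(-0.5 * (t_ps / sigma_ps) ** 2)


def residual_pp_mv(skew_ps, gain, amplitude_mv=108.9, sigma_ps=25.0):
    """Peak-to-peak residual after subtracting a scaled, skewed copy of the pulse."""
    residual = [
        pulse_mv(t, amplitude_mv, sigma_ps)
        - gain * pulse_mv(t - skew_ps, amplitude_mv, sigma_ps)
        for t in range(-300, 301)  # 1 ps time grid
    ]
    return max(residual) - min(residual)


def best_gain_residual_mv(skew_ps, steps=101):
    """Smallest residual reachable by sweeping the gain from 0 to 1 at a fixed skew."""
    return min(residual_pp_mv(skew_ps, g / (steps - 1)) for g in range(steps))


if __name__ == "__main__":
    for skew in (0.0, 10.0, UI_PS / 2):
        best = best_gain_residual_mv(skew)
        print(f"skew = {skew:5.1f} ps -> best residual = {best:6.1f} mV")
```

In this model the residual vanishes only when the skew is zero; once the compensation pulse is misaligned, sweeping the gain leaves a substantial residual, which is the qualitative behavior the delay control is meant to remove.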

The performance of the proposed TX and the prior art is summarized and compared in Table I. Among the similar prior designs, only the proposed TX clearly reports the FEXT amplitude that it compensates: a peak-to-peak FEXT of 234 mV from two aggressors. Apart from [4] and [8], our TX is the only design that compensates for crosstalk from more than one aggressor; the others [5], [6], [7] compensate for only one aggressor. Although the prior art [8] compensates for crosstalk from 4 aggressors, it can only compensate for NRZ crosstalk at the maximum speed of 4 Gb/s. In comparison with the similar prior art [4], [5], [6], [7], [8], only our TX employs an XTC timing control circuit.

Although the TX designs in [5], [6], [7] achieve faster speeds than the proposed TX, they occupy approximately 3.55 times, 35.4 times, and 12.7 times the chip area of the proposed design, respectively, because the XTCs [5], [6], [7] employ large SST drivers, large CML drivers, and large capacitors, respectively. In contrast, the proposed TX uses area-efficient inverter banks to generate XTC signals. As a result, the proposed TX occupies the smallest area of 0.0067 mm<sup>2</sup> and achieves the best area efficiency (area per data rate) of 0.00042 mm<sup>2</sup>/Gbps among the prior art [4], [5], [6], [7], [8]. In future advanced memory packaging applications, where silicon area for I/O circuits is a critically limited resource, the proposed TX design can therefore deliver the highest data rate per unit of I/O area among the compared designs.
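The area comparison can be reproduced from the per-lane area and data-rate entries of Table I. The sketch below simply recomputes the area ratios and the area per data rate; the dictionary layout and variable names are illustrative, and the highest reported data rate of each design is used where two modulation modes are listed.

```python
# (TX area per lane in mm^2, data rate in Gb/s/lane), taken from Table I.
designs = {
    "This work": (0.0067, 16),
    "[4]": (0.1505, 5),
    "[5]": (0.0238, 25),
    "[6]": (0.237, 112),
    "[7]": (0.085, 64),
    "[8]": (0.0077, 4),
}

reference_area = designs["This work"][0]

# Area per data rate (mm^2/Gbps): smaller is better.
area_per_gbps = {name: area / rate for name, (area, rate) in designs.items()}

# Chip area of each design relative to the proposed TX.
area_ratio = {name: area / reference_area for name, (area, _) in designs.items()}

if __name__ == "__main__":
    for name in designs:
        print(f"{name:10s} {area_per_gbps[name]:.5f} mm^2/Gbps, "
              f"{area_ratio[name]:.2f}x area")
```

The recomputed values agree with the rounded entries of Table I, and the proposed TX has the smallest area per data rate of the six designs.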

## IV. CONCLUSION

In this brief, we proposed an area-efficient PAM4 TX with compact inverter-based XTCs for short-reach memory interfaces. To address the delay mismatches between the victim and the aggressors, we also introduced delay adjustment circuits for PAM4 FEXT compensation for the first time.

TABLE I
PERFORMANCE SUMMARY AND COMPARISON WITH OTHER REPORTED TRANSMITTER-SIDE XTC DESIGNS

| | This Work | TCASI'14 [4] | TCASI'16 [5] | ISSCC'24 [6] | ISSCC'24 [7] | ISSCC'20 [8] |
|---|---|---|---|---|---|---|
| Technology (nm) | 28 LP | 130 | 65 | 28 | 28 | 65 |
| Modulation | PAM4 | NRZ | NRZ | PAM4 / NRZ | PAM4 / NRZ | NRZ |
| Data Rate (Gb/s/lane) | 16 | 5 | 25 | 112 (PAM4), 56 (NRZ) | 64 (PAM4), 32 (NRZ) | 4 |
| Supply (V) | 1 | 1.2 | 1.2 | N/A | N/A | 1.2 |
| Single/Differential | Single | Single | Single | Single | Single | Single |
| XTC Type | FIR-XTC | Capacitive Coupling | FIR-XTC | FIR-XTC | Merged C-peaking XTC | FIR-XTC |
| Number of aggressor channels | 2 | 2 | 1 | 1 | 1 | 4 |
| Pin Efficiency | 200% | 100% | 100% | 200% | 200% | 100% |
| Architecture | XTC | XTC | XTC + FFE | XTC + FFE | XTC + FFE | XTC + FFE |
| The worst peak-to-peak FEXT amplitude (mV) | 234 | N/A | N/A | N/A | N/A | N/A |
| Channel Loss (dB) | -1.1 | N/A | -8.9 | -4 | -11 | -25.6 |
| FEXT (dB) | -18 | N/A | N/A | -9 | -15.8 | -31.8 |
| Channel Loss-to-FEXT Ratio (dB) | 16.9 | N/A | N/A | 5 | 4.8 | 6.2 |
| XTC Timing Control | Yes (DCDL Control) | N/A | N/A | N/A | N/A | N/A |
| XTC Gain Control | Yes (Inverter Bank Control) | Yes (Level Switching Buffer Control) | Yes (SST Driver Control) | Yes (Bias Current Control) | Yes (Capacitance Control) | Yes (Inverter Bank Control) |
| PRBS | 31 | 7 | 7 | N/A | N/A | 7 |
| Horizontal Eye Opening (UI), w/o XTC | 0 (w/o XTC & Delay Control) | 0.479 | 0.28<sup>d</sup> | 0 (PAM4), 0 (NRZ) | 0 (PAM4), 0.32 (NRZ) | 0 |
| Horizontal Eye Opening (UI), w/ XTC | 0.352 (w/ XTC & Delay Control) | 0.569 | 0.444<sup>d</sup> | 0.31 (PAM4), 0.42 (NRZ) | 0.36 (PAM4), 0.6 (NRZ) | 0.4<sup>a</sup> (w/ XTC+FFE) |
| Vertical Eye Opening (mV), w/o XTC | 0 (w/o XTC & Delay Control) | N/A | 85<sup>d</sup> | 0 (PAM4), 0 (NRZ) | 0 (PAM4), 100 (NRZ) | 0 |
| Vertical Eye Opening (mV), w/ XTC | 40.3 (w/ XTC & Delay Control) | N/A | 180<sup>d</sup> | 18<sup>a</sup> (PAM4), 20.4<sup>a</sup> (NRZ) | 36 (PAM4), 180 (NRZ) | 26<sup>a</sup> (w/ XTC+FFE) |
| Inductor-less | YES | YES | YES | NO | YES | YES |
| Energy Efficiency (pJ/bit/lane) | 1.6 | 1.6<sup>b</sup> | 0.87 | 1.55 | 1.27 | 1.4 |
| TX area / Lane (mm²/lane) | 0.0067 | 0.1505<sup>c</sup> | 0.0238 | 0.237 | 0.085 | 0.0077 |
| TX area / Data rate (mm²/Gbps) | 0.00042 | 0.0301 | 0.00095 | 0.00212 | 0.00133 | 0.00193 |

To verify the proposed XTC scheme, three transmitters were designed and fabricated in 28 nm LP CMOS technology. The TXs successfully transmitted a PRBS31 pattern at 16 Gb/s while compensating for PAM4 FEXT signals produced by two aggressors. Due to the compact inverter-based XTC design, the proposed TX occupies the smallest area of only 0.0067 mm<sup>2</sup> and achieves the best area efficiency per data rate.

## ACKNOWLEDGMENT

The authors thank IDEC and Ansys Inc. for tool support.

## REFERENCES

- [1] J. Jin et al., "A 4nm 16Gb/s/pin single-ended PAM4 parallel transceiver with switching-jitter compensation and transmitter optimization," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2023, pp. 404–405.
- [2] Y.-U. Jeong, S. Choi, S. Kim, and J.-H. Chae, "Single-ended receiver-side crosstalk cancellation with independent gain and timing control for minimum residual FEXT," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 70, no. 12, pp. 4793–4803, Dec. 2023.

- [3] C. Aprile et al., "An eight-lane 7-Gb/s/pin source synchronous single-ended RX with equalization and far-end crosstalk cancellation for backplane channels," *IEEE J. Solid-State Circuits*, vol. 53, no. 3, pp. 861–872, Mar. 2018.
- [4] K.-D. Hwang and L.-S. Kim, "A 5Gbps 1.6 mW/Gbps/CH adaptive crosstalk cancellation scheme with reference-less digital calibration and switched termination resistors for single-ended parallel interface," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, no. 10, pp. 3016–3024, Oct. 2014.
- [5] S. Yuan, L. Wu, Z. Wang, X. Zheng, C. Zhang, and Z. Wang, "A 70 mW 25 Gb/s quarter-rate serdes transmitter and receiver chipset with 40 dB of equalization in 65 nm CMOS technology," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 63, no. 7, pp. 939–949, Jul. 2016.
- [6] L. Zhong et al., "A 112Gb/s/pin single-ended crosstalk-cancellation transceiver with 31dB loss compensation in 28nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2024, pp. 134–135.
- [7] W. Wu et al., "A 64Gb/s/pin PAM4 single-ended transmitter with a merged pre-emphasis capacitive-peaking crosstalk-cancellation scheme for memory interfaces in 28nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2024, pp. 240–241.
- [8] H.-G. Ko, S. Shin, J. Oh, K. Park, and D.-K. Jeong, "An 8Gb/s/um FFE-combined crosstalk-cancellation scheme for HBM on silicon interposer with 3D-staggered channels," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2020, pp. 128–129.

<sup>a</sup> Estimated from measured eye diagrams

<sup>b</sup> Including the RX power

<sup>c</sup> Estimated from the chip microphotograph

<sup>d</sup> Data rate is 20 Gb/s