### **Clock and Data Recovery in High-Speed Wireline Communications**

Nikola Nedovic, Fujitsu Laboratories of America, Sunnyvale, CA. May 21, 2009

N. Nedovic, "Clock and Data Recovery in High-Speed Wireline Communications"

# Outline

- Introduction
- Design Strategy: system and architecture
- Hybrid Oversampling CDR
- Design Example
- Measurements
- Conclusion



#### • Input at the receiver:

- Jitter: timing deviation from ideal phase
- Wander: low-frequency timing variations
- Noise: voltage-domain fluctuations
- Asynchronous to any clock in the system

#### • Clock and Data Recovery (CDR) is widespread in communication systems

- Extracts the clock from the incoming data stream and synchronizes the data with that clock
- Often performs demultiplexing to a lower data rate

# **CDR Design Requirements**

#### • Specifications

- Bit Error Rate (BER)
- Electrical specifications
- Jitter Tolerance
- Jitter Transfer
- Alarms specifications (Loss of Lock, Loss of Signal)
- Power limit
- Consecutive Identical Digits (CID) limit
- Specific requirement for particular system
  - Optical system may require phase adjust
  - Multi-bit-per-symbol signaling (e.g. PAM-4, duobinary) may require locking to specific edges

#### SONET OC-768 Jitter Tolerance Mask



## **CDR Architecture**

### **Phase-tracking**





- Feedback architecture
- Center samples hard-wired to output
- CDR loop tracks low-frequency jitter
- Low-jitter VCO for high-freq. jitter tolerance

### **Blind oversampling**





- Feedforward architecture
- Transitions detected from several samples per UI
- Inserting or removing bits from FIFO tracks low frequency jitter

### **Linear Model**



- Linear analysis from control systems possible and extremely useful
  - Stability, loop bandwidth, jitter transfer, steady-state error, noise transfer function...
- Linear model is only an approximation (sometimes crude)
  - Nonlinearities limit usability of the model, e.g. bang-bang PD


## Linear Model (cont'd)



- Low-pass input jitter transfer
- High-pass / band-pass jitter generation
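A minimal numerical sketch of these two responses, assuming a generic type-II loop gain $L(s) = K(1+s/z)/(s^2(1+s/p))$. The values of `K`, `Z` and `P` are illustrative assumptions, not taken from the slides. Input jitter reaches the recovered clock through $L/(1+L)$, while VCO jitter reaches it through $1/(1+L)$.

```python
import math

def loop_gain(f_hz, k, z, p):
    """Open-loop gain L(s) = K(1 + s/z) / (s^2 (1 + s/p)) at s = j*2*pi*f."""
    s = 1j * 2 * math.pi * f_hz
    return k * (1 + s / z) / (s ** 2 * (1 + s / p))

def jitter_transfer(f_hz, k, z, p):
    """|L / (1 + L)|: how much input jitter appears on the recovered clock."""
    gain = loop_gain(f_hz, k, z, p)
    return abs(gain / (1 + gain))

def vco_noise_transfer(f_hz, k, z, p):
    """|1 / (1 + L)|: how much VCO jitter appears on the recovered clock."""
    gain = loop_gain(f_hz, k, z, p)
    return abs(1 / (1 + gain))

# Illustrative (hypothetical) loop: zero at 1 MHz, pole at 16 MHz,
# K chosen so that |L| = 1 at omega = sqrt(z * p).
Z = 2 * math.pi * 1e6
P = 2 * math.pi * 16e6
K = Z * P * math.sqrt(Z / P)

for f in (1e3, 1e5, 1e6, 4e6, 1e8, 1e9):
    print(f"{f:10.0f} Hz  input->clock {jitter_transfer(f, K, Z, P):.4f}"
          f"  VCO->clock {vco_noise_transfer(f, K, Z, P):.4f}")
```

The printout shows the input-jitter path flat at low frequency and rolling off above the loop bandwidth, with the VCO-jitter path doing the opposite.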

### Design Strategy: System and Architecture


# **CDR Design Strategy: System**

#### • 1) Choose technology if possible

- CMOS: slow but cheap, widely available, easy to integrate, and low-power
- III-V compounds (GaAs, InP): much faster but expensive and high-power
- BiCMOS (e.g. SiGe)

### • 2) Decide architecture

- Feedback phase tracking CDR dominant
  - Simple, well understood, capable of supporting high data rates, relatively low power
- Blind oversampling only if wander is small
- Special architecture for given application (e.g. burst, MIMO)

#### • 3) Based on application, data rate, and available technology, choose loop linearity type and rate

- Bang-bang loop is faster, thus preferable in high-speed applications
- Linear loop is easier to control and model, thus preferable at lower speeds
- Choose the highest rate that the technology can accommodate
  - VCO frequency, sampler frequency, and aperture are usually critical

# **Design Strategy: Loop BW and Type**

#### • 4) Choose loop bandwidth

- Loop BW $\uparrow$ $\Rightarrow$ input noise transfer $\uparrow$, VCO noise generation $\downarrow$
- BW lower bound: the knee of jitter tolerance characteristics
- Upper bound: defined by bandwidth of jitter transfer if specified, ultimately by loop stability

### • 5) Choose loop type and numbers of poles and zeroes

- Normally defined by loop filter
- Desired type-II (two poles at f=0) for zero steady state error for a frequency step
  - One 1/s term from phase integrating function of the VCO
  - Additional 1/s term from loop filter typically done by an active circuit (e.g. charge pump)
- Simplest loop filter:

$$H_{LPF}(s) = \frac{K_{CP}}{s}$$



- Unfortunately, with above loop filter, the system becomes unstable
  - Closed loop poles on imaginary axis
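
A short derivation of why this happens, assuming a linear phase detector gain $K_{PD}$ and a VCO gain $K_{VCO}$ (symbols introduced here for illustration, with $K$ lumping all constant gains). With a pure integrator as loop filter, the loop gain and the closed-loop characteristic equation are

$$\beta A(s) = K_{PD} \cdot H_{LPF}(s) \cdot \frac{K_{VCO}}{s} = \frac{K}{s^2}, \qquad 1 + \beta A(s) = 0 \;\Rightarrow\; s^2 + K = 0 \;\Rightarrow\; s = \pm j\sqrt{K}$$

Both poles sit on the imaginary axis with zero damping: the phase of $\beta A(j\omega)$ is $-\pi$ at every frequency, so the phase margin is zero and the loop rings at $\omega = \sqrt{K}$ instead of settling.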


### Loop BW and Type (cont'd)

• Need a stabilizing zero:

$$H_{LPF}(s) = \frac{K_{CP} \cdot (1 + sC_1R)}{s(C_1 + C_2) \cdot \left(1 + s\left(\frac{C_1C_2}{C_1 + C_2}\right)R\right)} = \frac{K_{CP} \cdot (1 + s/z)}{s(C_1 + C_2) \cdot \left(1 + \frac{s}{p}\right)}$$

High-frequency pole in the filter is due to parasitic capacitance at V<sub>CTRL</sub>, also useful to suppress spurs
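
Matching the two forms of $H_{LPF}(s)$ term by term gives $z = 1/(RC_1)$ and $p = (C_1 + C_2)/(R C_1 C_2)$, so $p/z = 1 + C_1/C_2$. A small sketch of that mapping follows; the component values are hypothetical and chosen only to illustrate the arithmetic.

```python
import math

def filter_zero_pole(r_ohm, c1_f, c2_f):
    """Return (z, p) in rad/s for a series R-C1 branch in parallel with C2."""
    z = 1.0 / (r_ohm * c1_f)
    p = (c1_f + c2_f) / (r_ohm * c1_f * c2_f)
    return z, p

# Hypothetical values: C2 stands in for the parasitic capacitance at V_CTRL.
z, p = filter_zero_pole(r_ohm=1e3, c1_f=100e-12, c2_f=5e-12)

print(f"zero  = {z / (2 * math.pi) / 1e6:.2f} MHz")
print(f"pole  = {p / (2 * math.pi) / 1e6:.2f} MHz")
print(f"p / z = {p / z:.1f}")  # equals 1 + C1/C2
```

Because the ratio depends only on $C_1/C_2$, the parasitic capacitance directly limits how far apart the pole and zero can be placed.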




### **Design Strategy: Pole / Zero Positioning**

#### • 6) Determine pole/zero position



For maximum phase margin:

$$\omega_C = \sqrt{zp}$$

$$PM = 2\arctan\left(\sqrt{\frac{p}{z}}\right) - \frac{\pi}{2}$$
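
A quick numerical check of this result, assuming the loop gain has two poles at the origin, one zero at $z$ and one pole at $p$, so that the phase margin at crossover $\omega_C$ is $\arctan(\omega_C/z) - \arctan(\omega_C/p)$. The normalized values `z = 1` and `p = 16` are illustrative only.

```python
import math

def phase_margin(omega_c, z, p):
    """Phase margin (rad) of K(1 + s/z) / (s^2 (1 + s/p)) for crossover omega_c."""
    return math.atan(omega_c / z) - math.atan(omega_c / p)

def max_phase_margin(p_over_z):
    """Closed-form maximum, reached at omega_c = sqrt(z * p)."""
    return 2 * math.atan(math.sqrt(p_over_z)) - math.pi / 2

z, p = 1.0, 16.0  # normalized, hypothetical spacing

# Sweep the crossover frequency logarithmically from z to 100*z.
candidates = [z * 10 ** (k / 1000) for k in range(2001)]
best_omega = max(candidates, key=lambda w: phase_margin(w, z, p))

print(f"best crossover  = {best_omega:.3f}  (sqrt(z*p) = {math.sqrt(z * p):.3f})")
print(f"swept maximum   = {math.degrees(phase_margin(best_omega, z, p)):.2f} deg")
print(f"closed form     = {math.degrees(max_phase_margin(p / z)):.2f} deg")
```

The sweep lands on $\sqrt{zp}$ and reproduces the closed-form margin, which grows with the pole/zero spacing $p/z$.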


### Design Strategy: Pole / Zero Positioning (cont'd)

- In a bang-bang system, the concepts of transfer function and bandwidth do not strictly exist
- Design is still possible, keeping in mind that the effective bandwidth depends on the amplitude of the disturbance
- Statistically, the PD gain can be linearized through VCO jitter and/or input jitter
  - Implies that system parameters such as bandwidth, jitter transfer, and peaking depend on noise
  - A wide design margin should be allocated
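
One common way to make this linearization concrete, assumed here rather than taken from the slides, is to model the phase detector as the sign of (phase error + jitter) with Gaussian jitter of RMS value `sigma`. The average PD output is then $\mathrm{erf}(\phi/(\sigma\sqrt{2}))$, whose slope at zero is $\sqrt{2/\pi}/\sigma$, so the effective gain falls as jitter grows.

```python
import math

def bb_pd_average(phi, sigma):
    """Mean output of a +/-1 bang-bang PD when Gaussian jitter (RMS sigma) is present."""
    return math.erf(phi / (sigma * math.sqrt(2)))

def bb_pd_gain(sigma):
    """Small-signal gain d<PD>/dphi at phi = 0 under the Gaussian-jitter assumption."""
    return math.sqrt(2 / math.pi) / sigma

# Hypothetical jitter levels, expressed in UI RMS.
for sigma_ui in (0.01, 0.02, 0.05):
    print(f"sigma = {sigma_ui:.2f} UI rms -> linearized gain = {bb_pd_gain(sigma_ui):.1f} per UI")
```

Doubling the jitter halves the linearized gain, and with it the loop gain and bandwidth, which is why the noise level has to be part of the design margin.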






#### • K varies widely in a non-linear CDR

- Design for wide margin
- About 4x span for K available for damping factor $\zeta > 1/\sqrt{2}$


# **Design Strategy: Architecture**

#### • 7) Examine CDR transient locking behavior

- Lock range: the maximum difference between VCO clock and data rate for which lock is achieved without a cycle slip
  - All recovered data are valid
  - Useful for burst CDRs, less so for continuous-mode CDRs
- Pull-in range: the maximum difference between VCO clock and data rate for which lock is eventually achieved
  - Cycle slips allowed, some data may be invalid
  - Most important parameter for conventional continuous-mode CDRs
- Pull-out range: the maximum difference between VCO clock and data rate for which the CDR stays in lock
- Plenty of literature exists on lock/pull-in properties of PLLs, mostly for first- or second-order loops and/or oversimplified



- If VCO and data frequencies differ, the VCO clock "sweeps" the data eye
- PD output has a beat frequency component equal to f<sub>D</sub>-f<sub>VCO</sub>
- The area between the f<sub>D</sub> and f<sub>VCO</sub> curves over one beat period equals one unit interval (UI)
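
A small worked example of the beat arithmetic, assuming a constant frequency offset and a full-rate clock (one clock cycle per UI). The 40 Gb/s rate and 100 ppm offset are illustrative inputs.

```python
def beat_frequency_hz(f_data_hz, f_vco_hz):
    """Beat frequency seen at the PD output for a constant frequency offset."""
    return abs(f_data_hz - f_vco_hz)

def phase_slip_ui(f_data_hz, f_vco_hz, duration_s):
    """Area between the two frequency curves: accumulated phase difference in UI.

    Assumes a full-rate clock, so one cycle of slip equals one UI.
    """
    return (f_data_hz - f_vco_hz) * duration_s

f_data = 40e9                       # hypothetical full-rate clock, 40 Gb/s data
f_vco = f_data * (1 - 100e-6)       # VCO running 100 ppm low
f_beat = beat_frequency_hz(f_data, f_vco)
t_beat = 1.0 / f_beat

print(f"beat frequency = {f_beat / 1e6:.3f} MHz")
print(f"beat period    = {t_beat * 1e9:.1f} ns")
print(f"slip per beat  = {phase_slip_ui(f_data, f_vco, t_beat):.6f} UI")
```

Whatever the offset, the phase slips by exactly one UI per beat period; a larger offset only makes the beat period shorter.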

## **CDR Lock and Pull-in**

- Pull-in happens when f<sub>VCO</sub> converges towards f<sub>D</sub> in one DN-UP cycle
- Change of $f_{VCO}$ in one UP/DN cycle is proportional to the difference of the inverses of the average $\Delta f$ in the UP and DN cycles
  - Ideally, $\phi(PD \rightarrow V_{ctl}) < \pi/2 \implies$ pull-in range is equal to tuning range


# **CDR Pull-in with Latency in the Loop**

- Pull-in in 3rd-order type-II CDR with latency in the loop
  - Figure: $f_{VCO}$ versus time around the data rate $f_D$, with the UP and DN intervals shifted by the loop latency; each half-cycle encloses area = UI/2, $t_{UP} = t_{DN}$ and $\Delta f_{avg}(UP) = \Delta f_{avg}(DN)$, so $\Delta f = \Delta f_{init}$
- Pull-in limited by latency of PD/DEMUX
- Pull-in depends on the phase of the loop gain
  - If  $\varphi(\beta A(s=j|\omega_{VCO}-\omega_D|))=-\pi$ , there is no pull-in (false lock)
  - A safe pull-in range is about the loop bandwidth, since a defined phase margin for stability exists there (latency accounted for): on the order of hundreds to thousands of ppm
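
A sketch of how latency sets the false-lock offset, assuming the loop gain $K(1+s/z)e^{-s\tau}/(s^2(1+s/p))$, where the pure delay $\tau$ models PD/DEMUX latency. The delay model and all numeric values are assumptions for illustration. The function searches for the offset at which the loop phase returns to $-\pi$.

```python
import math

def loop_phase(omega, z, p, latency_s):
    """Phase (rad) of K(1 + s/z) e^{-s*latency} / (s^2 (1 + s/p)) at s = j*omega."""
    return -math.pi + math.atan(omega / z) - math.atan(omega / p) - omega * latency_s

def false_lock_offset(z, p, latency_s, omega_lo, omega_hi, iters=200):
    """Bisect for the offset (rad/s) inside the bracket where the phase equals -pi."""
    def excess(w):
        return loop_phase(w, z, p, latency_s) + math.pi
    if not (excess(omega_lo) > 0 > excess(omega_hi)):
        raise ValueError("bracket must straddle the -pi crossing")
    for _ in range(iters):
        mid = 0.5 * (omega_lo + omega_hi)
        if excess(mid) > 0:
            omega_lo = mid
        else:
            omega_hi = mid
    return 0.5 * (omega_lo + omega_hi)

# Hypothetical loop: zero at 1 MHz, pole at 16 MHz, 5 ns of loop latency.
Z = 2 * math.pi * 1e6
P = 2 * math.pi * 16e6
w_false = false_lock_offset(Z, P, 5e-9, omega_lo=Z, omega_hi=2 * math.pi * 1e9)
print(f"no pull-in at an offset of about {w_false / (2 * math.pi) / 1e6:.1f} MHz")
```

Under these assumptions, more latency moves the $-\pi$ crossing to a smaller frequency offset, which shrinks the range over which pull-in can be relied on.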


# **Aided Frequency Acquisition**

- As pull-in can be guaranteed only for a few hundred ppm around the data rate, phase-tracking CDRs use aided frequency acquisition
- Most common: dual loop



- Frequency detector monitors the frequency difference between the recovered clock (rclk) and the reference clock (Ref. clk)
  - Switch to the frequency acquisition loop if the difference is greater than a threshold
  - Switch to the phase loop if the difference is less than a threshold
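
The switching rule can be sketched as a two-state controller. Using two different thresholds (hysteresis) is an assumption made here to avoid chattering between loops; the threshold values and the function name are hypothetical.

```python
def select_loop(current_loop, freq_error_ppm, unlock_ppm=1000.0, lock_ppm=200.0):
    """Pick 'frequency' or 'phase' loop from the measured rclk-to-reference offset.

    unlock_ppm and lock_ppm are illustrative thresholds, not values from the slides.
    """
    err = abs(freq_error_ppm)
    if current_loop == "phase" and err > unlock_ppm:
        return "frequency"   # too far off: hand control to the frequency loop
    if current_loop == "frequency" and err < lock_ppm:
        return "phase"       # close enough: let the phase loop pull in
    return current_loop      # inside the hysteresis band: no change

loop = "frequency"
for error_ppm in (5000, 900, 150, 600, 1200, 100):
    loop = select_loop(loop, error_ppm)
    print(f"{error_ppm:5d} ppm -> {loop} loop")
```

The lower threshold should sit inside the guaranteed pull-in range of the phase loop so that the handover always ends in lock.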

# Summary - Part I

- CDR linear model is useful, but one must be aware of its limitations
- High-speed design involves severe nonlinearities
- Concurrent system, architecture, and circuit design needed
  - Circuit specifications depend on architecture and system specifications
  - Loop dynamics depend on circuit non-idealities
- Behavioral simulation necessary

# **Hybrid Oversampling CDR**


### **Hybrid CDR Architecture**



- Detect edge positions as in blind oversampling and lock to these edges as in phase tracking CDRs
- Output data are the samples furthest from the transition
- CDR loop only needs to loosely track low-frequency jitter
  - FIFO tracks high-frequency jitter
  - Maximum tolerable phase error > 1UI, depends on FIFO depth
- Oversampling allows seamless lock directly from data and suppresses low-frequency VCO jitter

# Hybrid CDR: Principle of Operation

- Sample more than two times per UI, detect phase by comparing adjacent samples
  - Detect the *direction* of the phase error
  - Operating range of the phase detector >  $2\pi$
- Keep track of the detected phase in Finite State Machine (FSM)
  - Each position in FSM represents a discrete amount of phase error
- Send the phase information to VCO for clock recovery
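
A behavioral sketch of this principle for 3x oversampling. It is not the circuit from the talk: the window layout (three samples of one UI plus the first sample of the next), the function names, and the FSM depth are all assumptions made for illustration. Each FSM step stands for 1/3 UI of phase error, so the state can exceed one full UI.

```python
def edge_position(samples):
    """Index i of the first transition between samples[i] and samples[i + 1], or None."""
    for i in range(len(samples) - 1):
        if samples[i] != samples[i + 1]:
            return i
    return None

def step_direction(prev_pos, pos, n=3):
    """Shortest signed move (-1, 0, +1 for n = 3) from prev_pos to pos, modulo n."""
    half = n // 2
    return ((pos - prev_pos + half) % n) - half

def track_phase(windows, n=3, depth=6):
    """Accumulate edge movement in an FSM state counted in units of 1/n UI."""
    state, prev = 0, None
    for window in windows:
        pos = edge_position(window)
        if pos is None:          # no transition in this UI (CID): hold the state
            continue
        if prev is not None:
            state += step_direction(prev, pos, n)
            state = max(-depth, min(depth, state))  # finite FSM / FIFO depth
        prev = pos
    return state

# Edge drifting steadily later: positions 0, 1, 2, then wrapping to 0, 1.
drift = [[0, 1, 1, 1], [1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]
print(track_phase(drift), "steps of 1/3 UI")
```

Because the wrap from position 2 back to 0 is counted as one more step in the same direction, the detected phase keeps growing past one UI instead of folding back, which is what extends the detector range beyond $2\pi$.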




### Hybrid Oversampling CDR: Strategy Revision



- Loop bandwidth can stay the same for more design margin, or it can be reduced for better input jitter tolerance

### Hybrid Oversampling CDR: Frequency Acquisition




### Hybrid Oversampling CDR: Effect of Circuit Parameters on Performance

- A bit error occurs when the VCO wanders during CID, so that the phase tracking direction is erroneous (bit skip)



- The expression for VCO jitter tolerance is complex and not enlightening; however, some insight is possible
  - Gradual changes in phase due to accumulating VCO jitter are tracked by the FSM
  - Breaks the dependency of BER on loop bandwidth
    - Hybrid scheme has much higher tolerance to VCO jitter than a phase-tracking CDR
    - Potential direction for deep-submicron technologies that suffer from a relative increase of noise and jitter
- Tighter specification for sampler aperture

# **Design Example**

# **Circuits: Design Example**

• 40Gb/s 3x oversampling hybrid CDR [Nedovic et al., ISSCC'07]




### **Circuits: Input Data Line**



- 208 µm long T-line segments between DEMUX taps
- Digitally controlled varactors in the middle of the segments
  - Avoid large discontinuities, reduce distortion
- Parallel loss R<sub>SHUNT</sub> to reduce ISI
  - Distortionless line:

$$\frac{R}{L} = \frac{G}{C} \qquad \text{loss} = \sqrt{RG} \neq f(\omega) \qquad \text{delay} = \sqrt{LC} \neq f(\omega)$$

- $R_{SHUNT} = 670\,\Omega$ as a compromise between ISI and loss
- Analogous to Pupin coils from early days of telephone industry
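
A numerical check of the distortionless condition using the standard RLGC propagation constant $\gamma = \sqrt{(R + j\omega L)(G + j\omega C)}$. The per-unit-length values are hypothetical and not the parameters of the line in the talk.

```python
import cmath
import math

def propagation(r, l, g, c, f_hz):
    """Return (attenuation in Np/m, delay in s/m) of an RLGC line at f_hz."""
    w = 2 * math.pi * f_hz
    gamma = cmath.sqrt((r + 1j * w * l) * (g + 1j * w * c))
    return gamma.real, gamma.imag / w

# Hypothetical per-unit-length parameters.
R_PER_M = 2000.0      # ohm/m, series loss
L_PER_M = 400e-9      # H/m
C_PER_M = 160e-12     # F/m
G_MATCHED = R_PER_M * C_PER_M / L_PER_M   # S/m, enforces R/L = G/C

for f in (1e9, 10e9, 20e9):
    a_m, d_m = propagation(R_PER_M, L_PER_M, G_MATCHED, C_PER_M, f)
    a_0, d_0 = propagation(R_PER_M, L_PER_M, 0.0, C_PER_M, f)
    print(f"{f / 1e9:4.0f} GHz  matched: {a_m:6.2f} Np/m, {d_m * 1e9:.3f} ns/m"
          f"   no shunt: {a_0:6.2f} Np/m, {d_0 * 1e9:.3f} ns/m")
```

With the shunt conductance matched, loss and delay come out the same at every frequency, at the price of higher loss than the unmatched line; this is the ISI-versus-loss compromise the shunt resistor value has to strike.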

# **Circuits: Sampler**



#### • CML latch

- Transistors p1 and p2 improve bandwidth and shut off the input differential pair

#### • Sense Amplifier

- Reset in the off-phase through feedback n5-n8
- Compare input with output using quad n1-n4 to reduce aperture

#### • Two-stage buffer




- Distributed closed-loop VCO with a built-in mechanism to set the clock direction counter-clockwise [Tzartzanis et al., ISSCC'06]

### **Circuits: Charge Pump / Loop Filter**




### **Charge Pump Characteristic**




# **Summary - Part II**

- Hybrid oversampling CDR takes more than two samples per unit interval in a PLL-like loop
  - Detects the direction of the phase error
  - Improves the jitter tolerance / jitter generation tradeoff
  - Improves pull-in range
  - Improves immunity to VCO noise

# **Measurement Setup**

#### • Building measurement setup

- Goal: generate setup that enables us to verify that a design meets specifications
- At high data rates, integrated test systems either don't exist or cost too much
  - Justified only for production test
- DIY measurement setup is a project on its own!

#### • Most critical test is jitter tolerance

- Requires generation of input signal with sinusoidal jitter with varying frequency and amplitude
- For low jitter frequencies (<~1MHz), we can use internal phase modulation of CW signal sources
- For high jitter frequencies, external modulation must be used
  - External IQ modulation up to 800MHz is available as an option in Agilent's E8267C

### Generation of Sinusoidal Jitter with External IQ Modulation




### **Phase Modulation Approximations**

IQ plane trace for a maximum phase deviation of $\pi/4$
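
The slides do not spell out which approximations are compared, so the sketch below is only one plausible reading. It contrasts exact phase modulation, $I = \cos\phi$, $Q = \sin\phi$, with a first-order small-angle drive, $I = 1$, $Q = \phi$, for sinusoidal $\phi(t)$. Function names and the jitter frequency are hypothetical, and the UI conversion assumes the modulated carrier is a full-rate clock.

```python
import math

def phase(t, f_jitter_hz, dev_rad):
    """Sinusoidal phase deviation phi(t) = dev_rad * sin(2*pi*f_jitter*t)."""
    return dev_rad * math.sin(2 * math.pi * f_jitter_hz * t)

def iq_exact(t, f_jitter_hz, dev_rad):
    """I/Q drive for ideal phase modulation: constant envelope, trace on the unit circle."""
    phi = phase(t, f_jitter_hz, dev_rad)
    return math.cos(phi), math.sin(phi)

def iq_small_angle(t, f_jitter_hz, dev_rad):
    """First-order approximation I = 1, Q = phi: trace is a vertical line segment."""
    return 1.0, phase(t, f_jitter_hz, dev_rad)

def peak_jitter_ui(dev_rad):
    """Peak jitter in UI, assuming a full-rate clock (2*pi of phase = 1 UI)."""
    return dev_rad / (2 * math.pi)

DEV = math.pi / 4      # maximum phase deviation from the slide
F_J = 100e6            # hypothetical jitter frequency
t_peak = 1 / (4 * F_J) # instant of maximum deviation

i_e, q_e = iq_exact(t_peak, F_J, DEV)
i_a, q_a = iq_small_angle(t_peak, F_J, DEV)
print(f"peak jitter         = {peak_jitter_ui(DEV):.3f} UI")
print(f"exact envelope      = {math.hypot(i_e, q_e):.4f}")
print(f"small-angle envelope= {math.hypot(i_a, q_a):.4f}")
print(f"small-angle phase   = {math.atan2(q_a, i_a):.4f} rad (target {DEV:.4f})")
```

At this deviation the first-order drive both inflates the envelope and undershoots the phase, so any such approximation has to be checked against the jitter amplitude being requested.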




### **Jitter Tolerance Test**




# Spectrum and Phase Noise of Recovered Clock




### **Jitter Tolerance**




# Summary - Part I, Part II, Part III

- CDR linear model is useful, but one must be aware of its limitations
- High-speed design involves severe nonlinearities
- Concurrent system, architecture, and circuit design needed
  - Circuit specifications depend on architecture and system specifications
  - Loop dynamics depend on circuit non-idealities
- Hybrid oversampling CDR allows for decoupling between high-level and low-level design parameters
  - Detects the direction of the phase error
  - Improves the jitter tolerance / jitter generation tradeoff
  - Improves pull-in range
  - Improves immunity to VCO noise
- Design of a measurement setup at high data rates is an entire project by itself
- Behavioral simulation necessary