### Clock Distribution on a Dual-core Multi-threaded Itanium<sup>™</sup>-Family Microprocessor

Patrick Mahoney, Eric Fetzer, Bruce Doyle, Sam Naffziger

Intel Corporation, Fort Collins, CO

## Topics

- Overview of distribution
- L0 route from PLL to Digital Frequency Divider (DFD)
- L1 route from DFD to 2nd Level Clock Buffer (SLCB)
- Regional Active Deskewing (RAD) system
- L2 route from SLCB to CVDs
- Clock Vernier Device (CVD) description
- L3 route from CVDs to gaters to latches
- Summary

## Overview with block diagram



## Level 0 Route: PLL to DFD

- Connects PLL to the DFDs
- 400mV Low-voltage swing, constant frequency
- 20mm long, repeated every 5mm
- Balanced, tapered H-tree route with matched impedance, resistively terminated
- No duty-cycle correction in repeaters; the DFD is not sensitive to duty cycle



#### L0 Repeater Circuit



#### Level 1 Route: DFD to SLCB

- Connects DFD to the Second Level Clock Buffer (SLCB)
- Full rail signal w/ varying supply and varying frequency
- Runs at half core frequency using differential clocks.
- Balanced h-tree tapered route with matched impedance



## **SLCB Block Diagram**



#### L1 Route Topology



#### **SLCB** Delay Operation



Delay line diagram

Measured silicon data showing adjusted delay vs. trim setting

#### **RAD System Overview**



Each arrow in the zone diagram represents the phase comparator shown below.



## **RAD** Circuit Operation



RAD system enabled.
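The slides do not spell out the RAD control loop, so the following is only a minimal sketch of how comparator-driven deskewing could behave. It assumes each arrow runs from a reference zone to a follower zone, that the follower's SLCB delay line moves one trim step per update, and that a half-step dead band stops the loop from dithering. The zone names, arrival times, step size, and the `deskew` function are all hypothetical.

```python
def deskew(arrival_ps, arrows, step_ps=1.0, rounds=100):
    """Nudge each follower zone toward its reference zone, one trim step at a time.

    arrival_ps: untrimmed clock arrival time per zone (illustrative numbers).
    arrows:     (reference, follower) pairs, one per assumed phase comparator.
    Trims are signed steps relative to an assumed mid-scale delay setting.
    """
    trims = {zone: 0 for zone in arrival_ps}

    def arrival(zone):
        return arrival_ps[zone] + trims[zone] * step_ps

    for _ in range(rounds):
        changed = False
        for ref, follower in arrows:
            error = arrival(follower) - arrival(ref)
            if error > step_ps / 2:        # follower is late: remove delay
                trims[follower] -= 1
                changed = True
            elif error < -step_ps / 2:     # follower is early: add delay
                trims[follower] += 1
                changed = True
        if not changed:                    # every comparator is inside the dead band
            break
    return trims, {zone: arrival(zone) for zone in arrival_ps}


# Hypothetical four-zone example; zone A is the fixed reference.
raw = {"A": 100.0, "B": 112.0, "C": 93.0, "D": 105.0}
links = [("A", "B"), ("A", "C"), ("B", "D")]

skew_before = max(raw.values()) - min(raw.values())
trims, settled = deskew(raw, links)
skew_after = max(settled.values()) - min(settled.values())
print(f"skew before: {skew_before:.1f} ps, after: {skew_after:.1f} ps, trims: {trims}")
```

Because each follower only sees its neighbor, zone D chases zone B while B is still moving; the residual skew in a model like this is bounded by the trim step and the depth of the comparator chain.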

## L2 route: SLCB to CVD

- Connects SLCB to Clock Vernier Device (CVD)
- Full rail signal w/ varying supply and varying frequency
- Runs at full core frequency and is single-ended.
- H-tree route 2.1-3.3mm in length matched in RLC delay
- Extracted RLC simulation with a target skew of 6ps
- Each SLCB typically drives several hundred CVDs
- Route tapered to reduce power and RC flight



## L2 Clock Routing Tool

- In-house tool called EZRoute used for balanced H-tree creation.
- Uses two methods of modeling: Elmore delay and SPICE.
- Uses a recursive algorithm to adjust wire widths to obtain the lowest skew for a given target.
- Tool inputs are the locations of the clock inputs and their loading. The direct output is the layout mask of a clock tree.
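To make the Elmore-delay modeling and width adjustment concrete, here is a minimal sketch that is not EZRoute itself. It computes Elmore delay to each sink of a small RC tree and then bisects the width of one branch until the two sinks match, which stands in for the recursive width adjustment the slide mentions. The `Segment` class, the helper names, and the per-unit resistance and capacitance values are illustrative assumptions.

```python
from dataclasses import dataclass, field

# Assumed wire parameters, for illustration only.
R_SHEET = 0.05    # ohms per square
C_AREA = 0.04     # fF per square micron
C_FRINGE = 0.08   # fF per micron of length


@dataclass
class Segment:
    name: str
    length_um: float
    width_um: float
    load_ff: float = 0.0                      # sink capacitance at the far end
    children: list = field(default_factory=list)


def wire_r(seg):
    return R_SHEET * seg.length_um / seg.width_um


def wire_c(seg):
    return (C_AREA * seg.width_um + C_FRINGE) * seg.length_um


def cap_beyond(seg):
    """Capacitance hanging off the far end of a segment."""
    return seg.load_ff + sum(wire_c(c) + cap_beyond(c) for c in seg.children)


def sink_delays(seg, upstream_ps=0.0, out=None):
    """Elmore delay (ps) from the tree root to every leaf segment."""
    out = {} if out is None else out
    # ohm * fF = 1e-15 s = 1e-3 ps
    delay = upstream_ps + wire_r(seg) * (wire_c(seg) / 2 + cap_beyond(seg)) * 1e-3
    if not seg.children:
        out[seg.name] = delay
    for child in seg.children:
        sink_delays(child, delay, out)
    return out


def skew_ps(root):
    delays = sink_delays(root).values()
    return max(delays) - min(delays)


def balance_width(root, branch, ref_name, lo_um, hi_um, iters=40):
    """Bisect a leaf branch's width until its delay matches sink `ref_name`.

    Assumes the branch is too slow at lo_um and too fast at hi_um.
    """
    for _ in range(iters):
        branch.width_um = 0.5 * (lo_um + hi_um)
        delays = sink_delays(root)
        if delays[branch.name] > delays[ref_name]:
            lo_um = branch.width_um           # still slow: widen further
        else:
            hi_um = branch.width_um
    return branch.width_um


left = Segment("left", 500.0, 1.0, load_ff=200.0)
right = Segment("right", 500.0, 1.0, load_ff=400.0)   # heavier load, slower sink
trunk = Segment("trunk", 1000.0, 2.0, children=[left, right])

before = skew_ps(trunk)
balance_width(trunk, right, "left", lo_um=1.0, hi_um=4.0)
after = skew_ps(trunk)
print(f"skew {before:.2f} ps -> {after:.6f} ps at right width {right.width_um:.3f} um")
```

Widening the slow branch also adds capacitance to the shared trunk, but that slows both sinks equally, so the skew depends only on the two branch delays. A production flow would confirm the result in SPICE, since Elmore delay ignores inductance.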

#### Die micrograph showing L2 grid



## Clock Vernier Devices (CVDs)

- CVDs act as local clock buffers and allow scannable delay insertion
- 14500 individually controllable CVDs on die for localized clock skew adjustment - RAD CVDs for unit-level adjustments
- Skew can be added to the clock
  - Automate speedpath detection
  - Find and fix speedpaths in software via scan
  - Find and fix races in software via scan
  - In combination with scan latches and on-die clock shrink, a powerful tool for rapid speedpath debug.
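The software find-and-fix loop can be pictured with a toy timing model. The sketch below is an assumption-laden illustration, not the authors' debug flow: `Path`, `passes`, and `find_fix` are hypothetical names, the path delays are invented, and `passes` stands in for a real pass/fail test run after new CVD settings are scanned in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    launch: str     # CVD clocking the launching latch
    capture: str    # CVD clocking the capturing latch
    max_ps: float   # slowest data delay on the path
    min_ps: float   # fastest data delay on the path


def passes(paths, cvd_delay_ps, period_ps):
    """Toy stand-in for a test run: check setup and hold on every path."""
    for p in paths:
        skew = cvd_delay_ps.get(p.capture, 0.0) - cvd_delay_ps.get(p.launch, 0.0)
        if p.max_ps > period_ps + skew:   # setup (speedpath) failure
            return False
        if p.min_ps < skew:               # hold (race) failure
            return False
    return True


def find_fix(paths, cvds, period_ps, steps_ps=(5.0, 10.0, 15.0)):
    """Delay one CVD at a time, smallest delay first; return the first fix found."""
    for delay in steps_ps:
        for cvd in cvds:
            if passes(paths, {cvd: delay}, period_ps):
                return cvd, delay
    return None


# Invented example: one path misses setup at a 500 ps period (2 GHz).
paths = [
    Path("cvd_a", "cvd_b", max_ps=508.0, min_ps=60.0),
    Path("cvd_b", "cvd_c", max_ps=470.0, min_ps=40.0),
    Path("cvd_c", "cvd_a", max_ps=450.0, min_ps=8.0),
]
cvds = ["cvd_a", "cvd_b", "cvd_c"]

baseline_ok = passes(paths, {}, 500.0)
fix = find_fix(paths, cvds, 500.0)
print(f"baseline passes: {baseline_ok}, fix: {fix}")
```

The CVD whose added delay turns a failing test into a passing one points at the capturing side of the speedpath. Delaying it borrows time from the next path downstream and erodes hold margin on the path it captures, which is why the model checks both setup and hold for every candidate.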

## **CVD** Circuit and Operation



#### Level 3 Route: CVDs to latches

- Connects CVDs to the clock-gating buffers and then latches.
- Routed by individual circuit designers, not the clock team
- Clocks are gated by logic wherever possible.
- Up to 1.5mm long in the extreme case, but typically a few hundred microns.
- Shielded route with a worst-case delay of 12ps.
- Routes not correctable by RAD system
- Routes and loads are extracted and simulated in SPICE using the custom "picoskew" tool; CVD and gater sizing is determined by loading.

### **Route Statistics and Power**

Route statistics

| Route | Terminals  | Distance | Delay |
|-------|------------|----------|-------|
| L0    | 14         | 20mm     | 640ps |
| L1    | 71         | 5mm      | 215ps |
| L2    | 14500      | 2-3.3mm  | 60ps  |
| L3    | ~5 million | 0-1.5mm  | 12ps  |

Power dissipation contribution by route (chart labels: L0, L1, L2, L3, Total CPU)

The L3 route carries the highest load and dissipates the most power.
Future research into low-power clock distribution should focus on this last section of the route.
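A quick calculation on the route statistics shows why the load concentrates at the end of the distribution. It is a sketch under two assumptions that the slides do not state: that the per-level delays add end to end, and that each level's terminals are the driving points of the next level. The 2GHz period comes from the summary; the variable names are illustrative.

```python
# Values copied from the route statistics table; L3 terminals are approximate.
routes = {
    "L0": {"terminals": 14, "delay_ps": 640.0},
    "L1": {"terminals": 71, "delay_ps": 215.0},
    "L2": {"terminals": 14500, "delay_ps": 60.0},
    "L3": {"terminals": 5_000_000, "delay_ps": 12.0},
}

# Assumption: level delays add end to end from PLL to latch.
total_delay_ps = sum(r["delay_ps"] for r in routes.values())

# Clock period at the 2 GHz operating point, in picoseconds.
period_ps = 1e3 / 2.0
cycles_in_flight = total_delay_ps / period_ps

# Assumption: average fanout is the ratio of terminal counts at adjacent levels.
levels = list(routes)
fanout = {
    f"{a}->{b}": routes[b]["terminals"] / routes[a]["terminals"]
    for a, b in zip(levels, levels[1:])
}

print(f"total delay: {total_delay_ps:.0f} ps ({cycles_in_flight:.3f} cycles at 2 GHz)")
for hop, ratio in fanout.items():
    print(f"{hop}: about {ratio:.0f} terminals per driver")
```

Under these assumptions the L1 to L2 ratio lands near 200, in line with each SLCB driving several hundred CVDs, and the L2 to L3 ratio puts a few hundred latch terminals behind each CVD.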

## Summary

- Four-level clock distribution system operating in a variable-frequency, variable-voltage domain at 2GHz.
- Deskewing system continuously holds routing skew to 10ps or less across PVT.
- CVDs allow software detection and repair of setup or hold timing violations.
- Distribution system dissipates about 25W of power.

#### Acknowledgements

The authors wish to recognize the efforts of Steve Hall and Erin Francom and the others whose contributions made this design possible.