# Graph Algorithms for VLSI Power and Clock Networks

Rassul Bairamkulov

Department of Electrical and Computer Engineering

Advisor: Eby G. Friedman





























# **Circuit Analysis**

- In 1847, G. R. Kirchhoff introduced graph methods into circuit theory
- Postulated two fundamental laws of electrical circuits
  - Kirchhoff Current Law
  - Kirchhoff Voltage Law
- Number of independent cycles within network is

 $\mu = |E| - |V| + 1$ 

G. R. Kirchhoff, "Ueber die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird," *Annalen der Physik*, Vol. 148, No. 12, pp. 497-508, October 1847.





### Finite State Machine

- Model of computational system
- Nodes represent states
- Edges represent transitions between states



G. H. Mealy, "A Method for Synthesizing Sequential Circuits," *The Bell System Technical Journal*, Vol. 34, no. 5, pp. 1045–1079, September 1955.
E. F. Moore, "Gedanken-Experiments on Sequential Machines," *Automata Studies*, Vol. 34, pp. 129–154, April 1956.

## Logic Graph

- Graphical representation of logical operations
- Powerful analysis of complex logic





#### Figure 17. Superposition of a network and its dual

C. E. Shannon, "A Symbolic Analysis of Relay and Switching Circuits," *Electrical Engineering*, Vol. 57, no. 12, pp. 713–723, December 1938.

### Maze Routing



C. Y. Lee, "An Algorithm for Path Connections and Its Applications," *IRE Transactions on Electronic Computers*, Vol. EC-10, no. 3, pp. 346-365, September 1961

20



# Floorplanning

- Floorplan efficiently represented using graphs
- Encodes relations between blocks
  - Adjacency
  - Relative position







J. Grason, "An Approach to Computerized Space Planning Using Graph Theory," Proceedings of the ACM/IEEE Design Automation Workshop, pp. 170-179, June 1971

# Circuit Partitioning

- Split circuit into modules
- Minimize number of pins
  - External connections
- Solved using graph minimum cut
  - Split graph into two equal sets
  - Reduce number of external edges



B. W. Kernighan and S. Lin, "An Efficient Heuristic Procedure for Partitioning Graphs," *The Bell System Technical Journal*, Vol. 49, no. 2, pp. 291-307, February 1970

## Race Analysis System (RAS)

- Sequential circuit converted to data flow graph
- Signal delay accumulated from each flip flop output to clock source
- Timing hazards are efficiently located



R. A. Harrison and D. J. Olson, "Race Analysis of Digital Systems Without Logic Simulation," *Proceedings of the IEEE/ACM Design Automation Workshop*, pp. 82–94, June 1971.



#### Modified Nodal Analysis



C.-W. Ho, A. Ruehli and P. Brennan, "The Modified Nodal Approach to Network Analysis," *IEEE Transactions on Circuits and Systems*, Vol. 22, no. 6, pp. 504-509, June 1975



# Synchronization in VLSI Systems

- VLSI systems are mostly synchronous
  - Combinatorial logic executes functions
- Registers synchronize data flow
  - Clock signal required
- Modeled with timing graph
  - Nodes = registers
  - Edges = data paths



I. S. Kourtev and E. G. Friedman, *Timing Optimization Through Clock Skew Scheduling*, Kluwer Academic Publishers, 2000

## Clock Skew Scheduling

- Sequential circuit graph
  - Efficiently determine clock arrival time
  - Optimized using quadratic programming
- Clock arrival time adjusted to enhance
  - Robustness
  - Performance



29

J. L. Neves and E. G. Friedman,"Design Methodology for Synthesizing Clock Distribution Networks Exploiting Non-Zero Clock Skew," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. VLSI-4, No. 2, pp. 286-291, June 1996.



#### Effective Resistance in Resistive Mesh

- Effective resistance in mesh estimated in O(1) time
- Accelerates IR drop analysis

$$R_{x,y} = \frac{kr}{\pi} \int_{0}^{\pi} \frac{\left(2 - e^{-|x|\alpha} \cos y\beta\right)}{\sinh \alpha} d\beta$$
$$R_{x,y}/r = \frac{\sqrt{k}}{2\pi} \left[\ln(x^2 + ky^2) + 3.44388\right]$$
$$- 0.033425k - 0.0629k(k-1) \qquad \text{for } k \to 0.0629k(k-1)$$



S. Köse and E. G. Friedman, "Effective Resistance of a Two Layer Mesh," *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 58, No. 11, pp. 739 - 743, November 2011

1





#### Research Effort since 2017

VISI COMBUNICANT

SYNCHRONIZAFTON

LOGI

COMPUTER

ARCHITECTURE

NTHESIS

CRYOGENIIC OCE/IN

HEORY

PHYSIC/III

DESIGN

OWER

**JELIVE** 

Effective Resistance in Truncated and Finite Grids

SUPERCONDUCTIVE

**QUANTUM** 

COMPUTING

EMRISTOR

ISLAND-

SPINTRONIC~

<u>Nonwolissile</u> S<del>CH</del>



#### Research Effort since 2017

VLSI CONTINENT

CRYOGENIC OCETIN

IRCUI

HEORY

Placement of Distributed On-Chip Voltage Regulators

OGI

COMPUTEI

ARCHITECTURE

Destgi

MEMRISTOR

SPINTRONIC~

**OUANTUM** 

SUPERCONDUCTIVE

COMPUTING

Monwollinne

SEA

ICE



<u>Nonwol/ssile</u> Sen
#### Research Effort since 2017

**QuCTS – Single Flux Quantum Clock Tree Synthesis** 

OWER

LOGI

COMPUTER

ARCHITECTURE

NTHESIS

PHYSICAI

DRCUIT

HEORY

MEMRISTOR

SPINTRONIC~

**OUANTUM** 

COMPUTING

Monvoluule

SEA

*ITCE* 

#### Research Effort since 2017

WLSI COMUUNCANU

PE

SYNCHRONIZHTTON

OGJ

CRYOGENIC OCE/IN

IGN

Qualcom

Exploratory Methodology for Power Delivery

NONVOL/HULC

SEA

MEMRISTOR Island

SPINTRONIC~

**QUANTUM** 

COMPUTING

SUPERCONDUCTIVE

#### Iterative Design Process

• Conventional analysis iterative in nature

T=Nt

- *t* time for correction and analysis
- Niterations to converge
- Significant delay in development process





















# Fewer Iterations with Exploration

- Bring initial system closer to optimum
- Fewer iterations to converge

R. Bairamkulov, K. Xu, M. Popovich, J. S. Ochoa, V. Srinivas, and E. G. Friedman, "Power Delivery Exploration Methodology Based on Constrained Optimization," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 39, No. 9, pp. 1916 - 1924, September 2020.







# Case Study: Decoupling Capacitor Allocation

- Objective
  - Minimize the cost of decoupling capacitors
- Parameters
  - Capacitance at each level
  - Supply voltage
- Constraints
  - Minimum voltage
  - Maximum voltage
  - Power



52

R. Bairamkulov, K. Xu, E. G. Friedman, M. Popovich, J. Ochoa, and V. Srinivas, "Versatile Framework for Power Delivery Exploration," *Proceedings of the IEEE International Symposium on Circuits and Systems*, May 2018.

# Decoupling Capacitor Allocation: Results

- Decoupling capacitor cost reduced by 15%
- Power reduced by 39%

|                      | Lower<br>bound       | Upper<br>bound       | Initial<br>value     | Optimized<br>value   |
|----------------------|----------------------|----------------------|----------------------|----------------------|
| Supply voltage       | 1.4 volts            | 10.0 volts           | 5.0 volts            | 3.09 volts           |
| PCB decap            | 25.0  nF             | $10.0 \ \mu F$       | $5.00 \ \mu F$       | $2.71 \ \mu F$       |
| Package decap        | $50.0 \ \mathrm{pF}$ | 100  nF              | $50.0 \ \mathrm{nF}$ | $9.77 \ \mathrm{nF}$ |
| Die decap            | 2.00  pF             | $10.0 \ \mathrm{nF}$ | $5.00 \ \mathrm{nF}$ | 9.32  nF             |
| Minimum load voltage | 1.40 volts           |                      | 2.96 volts           | 2.94 volts           |
| Power dissipation    |                      | 10.0 watts           | 10.6 watts           | 6.51 watts           |
| Load voltage         |                      | 10.0%                | 19.3%                | 9.07%                |
| Normalized cost      |                      |                      | 0.317                | 0.270                |
|                      |                      |                      |                      |                      |



53

R. Bairamkulov, K. Xu, E. G. Friedman, M. Popovich, J. Ochoa, and V. Srinivas, "Versatile Framework for Power Delivery Exploration," *Proceedings of the IEEE International Symposium on Circuits and Systems*, May 2018.

#### Research Effort since 2017

WLSI CONTUNCTUR

SYNCHRONIZATION

CRYOGENIC Octain

ROCHESTER Qualcom

HYSICH

#### SPROUT – Smart Power ROUTing Tool for Power Network Prototyping

NONWOLIIUILE Ram MEMRISTOR ISLAND

**OUANTUM** 

COMPUTING

Spintronic-Island

SEII

#### Board-level Power Network Synthesis

- Typical approach
  - Manual design
- Proposed approach
  - First fully automated board-level power network layout synthesis







- Power network prototype generated in five steps
  - Available space





- Power network prototype generated in five steps
  - Available space



- Power network prototype generated in five steps
  - Available space

PMIC Vias Capacitor

- Power network prototype generated in five steps
  - Available space
  - Seed





- Power network prototype generated in five steps
  - Available space
  - Seed



- Power network prototype generated in five steps
  - Available space
  - Seed
  - Growth



- Power network prototype generated in five steps
  - Available space
  - Seed
  - Growth



- Power network prototype generated in five steps
  - Available space
  - Seed
  - Growth



- Power network prototype generated in five steps
  - Available space
  - Seed
  - Growth
  - Refinement



- Power network prototype generated in five steps
  - Available space
  - Seed
  - Growth
  - Refinement
  - Layout



#### Case Study: Six-Net System

|               | Net   | Manual | SPROUT |
|---------------|-------|--------|--------|
|               | $V_1$ | 133    | 131    |
| Normalized    | $V_2$ | 103    | 99     |
| inductance @  | $V_3$ | 131    | 127    |
| 25 MHz        | $V_4$ | 161    | 155    |
| (picohenrys)  | $V_5$ | 152    | 150    |
|               | $V_6$ | 116    | 114    |
|               | $V_1$ | 15.0   | 16.8   |
| Normalized    | $V_2$ | 8.4    | 9.1    |
| DC resistance | $V_3$ | 13.0   | 14.2   |
| (milliohms)   | $V_4$ | 18.4   | 18.2   |
|               | $V_5$ | 18.5   | 18.9   |
|               | $V_6$ | 9.2    | 9.2    |

R. Bairamkulov, A. Roy, V. Srinivas, M. Nagarajan, and E. G. Friedman, "SPROUT - Smart Power ROUting Tool for Board-Level Exploration and Prototyping," *Proceedings of the IEEE/ACM Design Automation Conference*, pp. 283-288, December 2021



#### Power Delivery Exploration

- Three nets
  - 1. Modem
  - 2. CPU
  - 3. DSP
- Two blockages
- Connect BGA with PMIC
  - L1
  - L2
  - L3
- Explore relationship between
  - Metal area
  - Power network impedance

$$A_1 = 35.0, A_2 = 35.0, A_3 = 7.5$$



Impedance vs. Area



R. Bairamkulov, A. Roy, M. Nagarajan, V. Srinivas, and E. G. Friedman, "SPROUT - Smart Power ROUting Tool for Board-Level Exploration and Prototyping," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* (in press)

#### Research Effort since 2017

VISI COMPUNICANT

SYNCHRONIZAFION

OGI

COMPUTEI

ARCHITECTURE

CRYOGENIIC OCE/IN

HEORY

PHYSIC

OWER

ELIV

Effective Resistance in Truncated and Finite Grids

SUPERCONDUCTIVE

**OUANTUM** 

COMPUTING

EMRISTOR

ISLAND.

SPINTRONIC~

NONWOLIIILE SEH

ROCHESTER

# Importance of Grids

Appear frequently in science and engineering

- Chemical bonds
- Substrate models
- Discretized electrical and thermal models
- On-chip power and ground networks





Y. Ogasahara, M. Hashimoto, T. Kanamoto, T. Onoye, "Measurement of Supply Noise Suppression by Substrate and Deep N-well in 90 nm Process" *Proceedings of Asian Solid-State Circuits Conference*, pp. 397-400, November 2008

#### Analysis of Grids


• Superlinear complexity



• Superlinear complexity



| , | 2  | -1 | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
|---|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| _ | -1 | 3  | -1 | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
|   | 0  | -1 | 3  | -1 | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
|   | 0  | 0  | -1 | 2  | 0  | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| _ | -1 | 0  | 0  | 0  | 3  | -1 | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
|   | 0  | -1 | 0  | 0  | -1 | 4  | -1 | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  | 0  |
|   | 0  | 0  | -1 | 0  | 0  | -1 | 4  | -1 | 0  | 0  | -1 | 0  | 0  | 0  | 0  | 0  |
|   | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 3  | 0  | 0  | 0  | -1 | 0  | 0  | 0  | 0  |
|   | 0  | 0  | 0  | 0  | -1 | 0  | 0  | 0  | 3  | -1 | 0  | 0  | -1 | 0  | 0  | 0  |
|   | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 4  | -1 | 0  | 0  | -1 | 0  | 0  |
|   | 0  | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 4  | -1 | 0  | 0  | -1 | 0  |
|   | 0  | 0  | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 3  | 0  | 0  | 0  | -1 |
|   | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | 0  | 2  | -1 | 0  | 0  |
|   | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 3  | -1 | 0  |
|   | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 3  | -1 |
|   | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | -1 | 0  | 0  | -1 | 2  |
|   |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |

• Superlinear complexity



 $^{-1}$  $^{-1}$ -3 -10 0 0 - 1

• Superlinear complexity





• Superlinear complexity





#### Effective Resistance



#### Effective Resistance



### Problem Statement

Find effective resistance between two arbitrary nodes in rectangular resistive grid with given

- Size  $w_x \times w_y$
- Node coordinates  $(x_0, y_0), (x, y)$
- Vertical resistance r
- Horizontal resistance kr



#### Infinity Mirror Technique

• Effective resistance is found in O(1) time



R. Bairamkulov and E. G. Friedman, "Effective Resistance of Finite Two-Dimensional Grids Based on Infinity Mirror Technique," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 67, no. 9, pp. 3224-3233, September 2020



### Model Accuracy

- Infinitely many images
- Only closest images are sufficient

$$\begin{split} R_{w_x,w_y} = &\sum_{i \in \mathbb{Z}} \sum_{j \in \mathbb{Z}} \left( 2\Omega_{ij}^k (x - x_0, y - y_0) + 2\Omega_{ij}^k (x - x_0, y + y_0 + 1) \right. \\ &+ 2\Omega_{ij}^k (x + x_0 + 1, y + y_0 + 1) + 2\Omega_{ij}^k (x + x_0 + 1, y - y_0) \\ &- \Omega_{ij}^k (2x_0 + 1, 0) - \Omega_{ij}^k (2x_0 + 1, 2y_0 + 1) - \Omega_{ij}^k (0, 2y_0 + 1)) \\ &+ \Omega_{ij}^k (2x + 1, 0) - \Omega_{ij}^k (2x + 1, 2y + 1) - \Omega_{ij}^k (0, 2y + 1) - 2\Omega_{ij}^k (0, 0)) \end{split}$$



83

R. Bairamkulov and E. G. Friedman, "Effective Resistance of Finite Two-Dimensional Grids Based on Infinity Mirror Technique," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 67, no. 9, pp. 3224-3233, September 2020

### Summary

- Infinite grid model extended for finite grids
- Improved accuracy near grid boundaries
- Fast IR drop analysis
  - Runtime independent of grid size
  - Scales quadratically with number of nodes of interest

#### Research Effort since 2017



SYNCHROMIZAT

LOGI

COMPUTER

ARCHITECTURE

NTHESIS

ROCHESTER

**SYNOPSYS**<sup>®</sup>

OWER

IIC

IRCUII

HEORY

QUANTUM

### QuCTS – Single Flux Quantun Clock Tree Synthesis

SUPERCONDUCTIVE

MEMRISTOR Island

SPINTRONIC~

MONVOL/2001/2

SEH





### Restricted Cell Placement

- Gates placed within specific cells
- N-1 splitters for N gates



# SFQ Clock Tree Synthesis

- The first SFQ clock tree synthesis algorithm utilizing clock skew
- Input: RTL circuit description
- Output: clock tree layout optimized for robustness and cost



R. Bairamkulov, T. Jabbari, and E. G. Friedman, "QuCTS - Single Flux Quantum Clock Tree Synthesis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, in press.

## Delay Balancing: Macro Level

- Clock arrival time determined from clock skew scheduling
- Binary tree satisfies clock arrival time constraints
  - Inherent interconnect delay
    - Snaking to fine tune delay
  - Add JTLs for additional delay















- Find splitter and delay element locations to connect A and B
- Cells closest to line added to proxy graph
- Create proxy graph
  - Nodes are vacant cells
  - Edge weights are Manhattan distance between cells
  - No edge between A and B
- Analyze paths from A to B
  - Choose path minimizing
    - $|\Delta \tau \Delta t|$
    - $\Delta \tau$  difference in required arrival time
    - $\Delta t$  difference in actual arrival time



- Find splitter and delay element locations to connect A and B
- Cells closest to line added to proxy graph
- Create proxy graph
  - Nodes are vacant cells
  - Edge weights are Manhattan distance between cells
  - No edge between A and B
- Analyze paths from A to B
  - Choose path minimizing
    - $|\Delta \tau \Delta t|$
    - $\Delta \tau$  difference in required arrival time
    - $\Delta t$  difference in actual arrival time

(3)

6)

B

5

4

- Find splitter and delay element locations to connect A and B
- Cells closest to line added to proxy graph
- Create proxy graph
  - Nodes are vacant cells
  - Edge weights are Manhattan distance between cells
  - No edge between A and B
- Analyze paths from A to B
  - Choose path minimizing
    - $|\Delta \tau \Delta t|$
    - $\Delta \tau$  difference in required arrival time
    - $\Delta t$  difference in actual arrival time



- Find splitter and delay element locations to connect A and B
- Cells closest to line added to proxy graph
- Create proxy graph
  - Nodes are vacant cells
  - Edge weights are Manhattan distance between cells
  - No edge between A and B
- Analyze paths from A to B
  - Choose path minimizing
    - $|\Delta \tau \Delta t|$
    - $\Delta \tau$  difference in required arrival time
    - $\Delta t$  difference in actual arrival time



### Hanan Grid

- Embed path A-5-6-B
- Vertical and horizontal lines through
  - Nodes A and B
  - Splitter
  - JTLs
  - Obstacle boundaries
- Routing algorithms determine path



### Hanan Grid Routing

- Embed path A-5-6-B
- Vertical and horizontal lines through
  - Nodes A and B
  - Splitter
  - JTLs
  - Obstacle boundaries
- Routing algorithms determine path



### Hanan Grid Routing

- Embed path A-5-6-B
- Vertical and horizontal lines through
  - Nodes A and B
  - Splitter
  - JTLs
  - Obstacle boundaries
- Routing algorithms determine path



# Synthesis Results

- Benchmark circuits
  - ISCAS'89
  - ITC'99
  - AMD2901
    - 1,050 Clocked gates
    - 52 minutes runtime



#### Benchmark Circuits

| Circuit         | Clocked    | Delay      | Total          | $\mathbf{Runtime},$ |  |  |
|-----------------|------------|------------|----------------|---------------------|--|--|
|                 | gates      | elements   | wirelength, mm | minutes             |  |  |
| AMD2901         | $1,\!049$  | $1,\!241$  | 1,027          | 53                  |  |  |
| ISCAS'89 S13207 | $1,\!636$  | $2,\!405$  | 272            | 73                  |  |  |
| ITC'99 B14      | $6,\!365$  | 5,762      | 905            | 212                 |  |  |
| ISCAS'89 S38417 | 11,796     | $11,\!367$ | 1,002          | 393                 |  |  |
| ISCAS'89 S35932 | $14,\!914$ | $15,\!814$ | $6,\!656$      | 372                 |  |  |
| ITC'99 B18      | 45,710     | $71,\!090$ | 31,736         | $2,\!309$           |  |  |

#### **Research Effort since 2017**

SYNCHROMIZATION

CIRYOGEINIC OCEAN

SICHI

ESIGI

**ROCHESTER** 

#### **Placement of Distributed On-Chip Voltage Regulators**

NONVOLATULE

EMRISTOR ISLAND.

SPINTRONIC~

**OUANTUN** 

SUPERCONDUCTIVE

COMPUTING

SEA



### Conventional Power Delivery

- Single regulator within power management IC (PMIC)
- Large distance to load
- Poor regulation capability



### Distributed Power Delivery

- Distributed regulators within IC
- Minimal distance to load
- Superior regulation capability


#### Placement for Power Delivery: Capacitors

- Effectiveness of a capacitor depends on distance from load
  - Effective radius of decoupling capacitor

M. Popovich, M. Sotman, A. Kolodny, and E. G. Friedman, "Effective Radii of On-Chip Decoupling Capacitors," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 16, No. 7, pp. 894-907, July 2008



#### Placement for Power Delivery: Capacitors and I/O

- Distributed power delivery
  - decoupling capacitors

• I/O



S. Köse and E. G. Friedman, "Distributed On-Chip Power Delivery," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, Vol. 2, No. 4, pp. 704-713, December 2012.

#### Placement for Power Delivery: Voltage Regulators



## Placement Procedure

- Objective
  - Minimize worst voltage drop within power network
- Constraints
  - Case one: no restrictions
  - Case two: prohibited areas within the grid
  - Case three: limited regulator current



# Regulation Mechanism

- Load circuitry effectively separated from input
- Stable output voltage (input regulation/load regulation)
- Analyze load part of power network



# Voltage Regulator Model

- Detailed model
  - Accurate behavior
  - Computationally expensive



- Simplified model
  - Poor accuracy
  - Computationally efficient
  - High fidelity



# Fidelity of Regulator Model

0.86

0.84

0.82

0.80

0.78

- 0.76

- $20 \times 20$  resistive-inductive mesh
- Ten randomly distributed loads
- Place regulator to minimize voltage drop
- Voltage estimated using HSPICE



Detailed model



Voltage source model

116



# Load Clustering

- Practical grids contain billions of loads
- Reduce number of loads by clustering









# Efficient Grid Analysis +

- MNA-based analysis computationally expensive
- Accelerate using infinity mirror technique



## Load Model

• Loads modeled as current sources

• Loads  

$$\mathcal{L} = \{ \ell_p | p \in [1, ..., n] \}$$

$$\ell_p = (x_p, y_p, i_p)$$



#### Load Model

Loads modeled as
current sources
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g
g

**∕--{/**--^

∽-+-∕∕∕∕--

Λ\_\_

┉ᠮ┉ᠮ᠁

~~~+~~~+~~~

• Loads  $\mathcal{L} = \{\ell_p | p \in [1, ..., n]\}$   $\ell_p = (x_p, y_p, i_p)$ 

-///-**{**\_///-**{**\_///-**{**\_////-

## Source Model

• Source modeled as unknown current sources

 $\mathcal{S} = \left\{ s_q \, \middle| \, q \in [1, \dots, m] \right\}$ 

• How to find current from voltage source?



## Superposition

- Any source/load distorts voltage within grid
- Voltage at any node as superposition of distortions





### Matrix Equation

- Solved for  $I(s_i)$
- Polynomial complexity with number of sources  $\sim O(m^3)$

$$\begin{bmatrix} v^{\mathbf{g}}(s_{1},s_{1}) & \dots & v^{\mathbf{g}}(s_{m},s_{1}) \\ \vdots & \ddots & \vdots \\ v^{\mathbf{g}}(s_{1},s_{m-1}) & \dots & v^{\mathbf{g}}(s_{m},s_{m-1}) \\ 1 & \dots & 1 \end{bmatrix} \begin{bmatrix} I(s_{1}) \\ \vdots \\ I(s_{m}) \end{bmatrix} = \begin{bmatrix} V_{1}^{\mathbf{g}} \\ \vdots \\ V_{1}^{\mathbf{g}} \\ \vdots \\ V_{1}^{\mathbf{g}} \\ V_{1}^{\mathbf{g}} \end{bmatrix} - \begin{bmatrix} v^{\mathbf{g}}(\ell_{1},s_{1}) & \dots & v^{\mathbf{g}}(\ell_{n},s_{1}) \\ \vdots \\ v^{\mathbf{g}}(\ell_{1},s_{m-1}) & \dots & v^{\mathbf{g}}(\ell_{n},s_{m-1}) \\ 1 & \dots & 1 \end{bmatrix} \begin{bmatrix} I(\ell_{1}) \\ \vdots \\ I(\ell_{n}) \end{bmatrix}$$

• Load voltage

$$\begin{bmatrix} v^{\mathbf{g}}(\ell_1) \\ \vdots \\ v^{\mathbf{g}}(\ell_{n-1}) \end{bmatrix} = \begin{bmatrix} v^{\mathbf{g}}(\ell_1,\ell_1) & \dots & v^{\mathbf{g}}(\ell_n,\ell_1) \\ \vdots & \ddots & \vdots \\ v^{\mathbf{g}}(\ell_1,\ell_{m-1}) & \dots & v^{\mathbf{g}}(\ell_n,\ell_{m-1}) \end{bmatrix} \begin{bmatrix} I(\ell_1) \\ \vdots \\ I(\ell_n) \end{bmatrix} + \begin{bmatrix} v^{\mathbf{g}}(s_1,\ell_1) & \dots & v^{\mathbf{g}}(s_m,\ell_1) \\ \vdots \\ v^{\mathbf{g}}(s_1,\ell_{m-1}) & \dots & v^{\mathbf{g}}(s_m,\ell_{m-1}) \end{bmatrix} \begin{bmatrix} I(s_1) \\ \vdots \\ I(\ell_n) \end{bmatrix}$$

#### IBMPG Benchmark Circuits - Resistance

- From 30,638 to 1,670,494 nodes
- Irregular mesh topology

- Approximated as regular grid
  - Dominant resistance and pitch



#### IBMPG Benchmark Circuits - Pitch

- From 30,638 to 1,670,494 nodes
- Irregular mesh topology

- Approximated as regular grid
  - Dominant resistance and pitch



## Simplified Grids

- IBMPG benchmarks modeled as anisotropic grids
- Dominant pitch and resistance

|        | Pitch     |           | $\underline{\mathbf{Resistivity},\mathrm{m}\Omega}$ |       | Dimensions |       |
|--------|-----------|-----------|-----------------------------------------------------|-------|------------|-------|
|        | x         | y         | <i>x</i>                                            | y     | <i>x</i>   | y     |
| ibmpg1 | $2,\!062$ | 33        | 5.714                                               | 0.635 | 10         | 629   |
| ibmpg2 | 48        | 72        | 4.000                                               | 16.25 | 169        | 113   |
| ibmpg3 | 864       | $1,\!296$ | 0.714                                               | 2.407 | 354        | 236   |
| ibmpg4 | 48        | 24        | 35.00                                               | 32.50 | 284        | 571   |
| ibmpg5 | 82        | 12        | 10.00                                               | 21.67 | 129        | 882   |
| ibmpg6 | 280       | 280       | 0.286                                               | 0.464 | 3,630      | 3,644 |



### Case One

#### Number of regulators: **4**0

- Unrestricted position and regulator current
- Loads merged into 100 clusters
- Number of regulators varied from 10 to 50



#### Case Two

• Placement restricted to uncongested regions

#### Location of blockages in ibmpg4



### Case Two

#### Number of regulators: **4**0

- Placement restricted to uncongested regions
- Greater average voltage drop than in case one



### Case Three

- Maximum current limited at each regulator
  - Limited area
  - Electromigration
- Prohibit exceeding maximum current
- New model needed

Max. current = 7



Max. current = 7



Max. current = 7

Max current exceeded. Repeat until convergence



Max. current = 7



### Case Three: Results

- Greater average voltage drop than case one
- Regulators spread more evenly within layout

#### Number of regulators: **4**0



## Summary

- Distributed voltage regulation effectively reduces power noise
- Regulator placement methodology is proposed
- Practical cases are supported
  - Restricted position
  - Limited current
- Validated using industrial benchmarks

## Final Remarks

- Graph theory important component of VLSI systems design and analysis
- Graphs are however underutilized in certain areas of VLSI
  - Power delivery
  - Physical design
  - Beyond CMOS technologies
- Five works extending graph theory to underutilized areas
  - Exploratory methodology for power delivery
  - Effective resistance in truncated and finite mesh
  - SPROUT Smart Power ROUTing tool for power network prototyping
  - QuCTS single flux Quantum Clock Tree Synthesis
  - Placement of Distributed On-Chip Voltage Regulators

