Summary: | In VLSI circuits, signal delays play an important role in design, timing verification and
signal integrity checks. These delays are attributed to the presence of parasitic resistance,
capacitance and inductance. With increasing clock speeds and shrinking feature sizes, these
delays will be dominated by parasitic inductance. In next-generation VLSI circuits, with
millions of components and interconnect segments, fast and accurate inductance
estimation becomes a crucial step.
A generalized approach for inductance extraction requires the solution of a large,
dense, complex linear system that models mutual inductive effects among circuit elements.
Iterative methods are used to solve the system without explicit computation of the system
matrix itself. Fast hierarchical techniques are used to compute approximate matrix-vector
products with the dense system matrix in a matrix-free way. Because the system matrix is
never formed explicitly, constructing a preconditioner to accelerate the convergence of the
iterative method becomes a challenging task.
This work presents a class of parallel algorithms for fast and accurate inductance extraction
of VLSI circuits. We use the solenoidal basis approach that converts the linear
system into a reduced system. The reduced system of equations is solved by a preconditioned
iterative solver that uses fast hierarchical methods to compute products with the
dense coefficient matrix. A Green's function based preconditioner is proposed that achieves
near-optimal convergence rates in several cases. By formulating the preconditioner as a
dense matrix similar to the coefficient matrix, we are able to use fast hierarchical methods
for the preconditioning step as well. Experiments on a number of benchmark problems
highlight the efficiency of the preconditioning scheme and its advantages over FastHenry.
To further reduce the solution time of the software, we have developed a parallel implementation.
The parallel software package is capable of analyzing interconnect
configurations involving several conductors within a reasonable time. A two-tier parallelization
scheme enables mixed-mode parallelization, which combines OpenMP directives with MPI message passing.
The parallel performance of the software is demonstrated through experiments on the IBM
p690 and AMD Linux clusters. These experiments highlight the portability and efficiency
of the software on multiprocessors with shared, distributed, and distributed-shared memory
architectures.
|