# Immense ASIC Design in Nanometer Era

Dr. Danny Rittman, December 2005, danny@tayden.com

## ABSTRACT

Current silicon process technology allows designers to integrate an immense number of features into a single IC. As design size climbs into hundreds of millions of gates, new design and manufacturing challenges arise. Managing designs of this sheer size presents a new dimension of issues, and managing the physical and electrical effects of these geometries presents another challenge to current EDA technology. This paper discusses the challenges of very large ASIC designs, which integrate hundreds of millions of logic gates into a single-chip solution.

## INTRODUCTION

The rapid increase in design complexity has become a serious limiting factor in nanometer ASIC designs. This sharp increase is driven by two factors. The first is the exponential rise in the number of devices integrated in a single chip. The second is the set of new issues associated with technology scaling, such as interconnect, noise, power, and thermal limitations. As a result, the gap between silicon capacity and design productivity is widening. Silicon technologies are advancing from 80 million gates in 90 nanometer technology designs to over 100 million gates in 65/45 nanometer. This enormous growth in gate capacity has led to unprecedented capability for design size, functional integration, and complexity in a single-chip solution. Single chips are now replacing multiple chip packages and entire systems.

Managing this functional capacity and complexity requires, first, more effort on design productivity and quality through design methodology. A second issue is enabling design performance and density through circuit techniques and physical architecture. A third is functional complexity management, including combining IP from multiple sources and multiple technology platforms through System-on-Chip (SoC) integration methods. Time-to-market pressure prohibits a proportional increase in product schedule with the size of the design. Design productivity must therefore increase, and the turnaround time for a given design size must decrease, to keep pace with the increased design capacity.

Hierarchical design is becoming a significant means of optimizing a design's turnaround time, and achieving a First Time Right design is becoming a necessity for maintaining market position. Ever-increasing density has pushed shrinking circuit geometries toward an array of fundamental limits produced by electrical and material effects.

## **HIGH LEVEL INTERCONNECT**

As interconnect dimensions shrink, interconnect performance gets worse. Starting with the 0.13 micron generation, the interconnect delay began to surpass the intrinsic gate delay. As technology steps into the nanometer arena, interconnect has become a key challenge of the chip design process. Performance, power consumption, and signal and power integrity are all affected by the chip interconnect. To reduce interconnect RC delay, copper and low-k materials were introduced. However, the new materials also brought great difficulties in manufacturing. Chip failures started to occur due to the low mechanical strength of the low-k materials, large thickness variation of the wires, or premature electro-migration (EM) failure.
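To make the RC delay concern concrete, the following is a minimal sketch of the Elmore delay of a uniform wire modeled as an RC ladder. The per-micron resistance and capacitance values are hypothetical placeholders, not figures from any particular process, and the Elmore model itself is a first-order approximation rather than a signoff-quality delay calculation.

```python
def elmore_delay_uniform_wire(r_per_um, c_per_um, length_um, segments=100):
    """Elmore delay (seconds) at the far end of a uniform RC wire.

    The wire is modeled as a ladder of `segments` identical RC sections.
    Each section's resistance charges all capacitance at or after it.
    """
    r_seg = r_per_um * length_um / segments
    c_seg = c_per_um * length_um / segments
    delay = 0.0
    for i in range(segments):
        downstream_sections = segments - i  # sections i .. segments-1
        delay += r_seg * downstream_sections * c_seg
    return delay


# Hypothetical per-micron values, chosen only for illustration.
R_PER_UM = 0.5      # ohm per micron (assumption)
C_PER_UM = 0.2e-15  # farad per micron (assumption)

for length in (500, 1000, 2000):
    d = elmore_delay_uniform_wire(R_PER_UM, C_PER_UM, length)
    print(f"{length:5d} um wire: {d * 1e12:8.2f} ps")
```

Because both total resistance and total capacitance grow with length, the delay in this model grows with the square of the wire length, which is one reason long global wires dominate timing while gate delay keeps improving.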

While IC interconnect becomes an increasing challenge, technology scaling continues to evolve. The smaller feature size allows larger and more complex systems to be built on a single chip. The gate counts of a cell-based ASIC product have increased from approximately 10 million at 90 nm to well above 80 million in the 65 nm generation. Such large-scale, high-performance SoC designs make it difficult for ASIC vendors to meet the demands of turnaround time, first-time-right design, and high manufacturability.

These challenges and ever-higher demands have led to a new type of solution: an interconnect-aware design methodology. This methodology has moved to high-level integration across multiple areas previously regarded as independent, such as design implementation, physical analysis, chip process, and packaging. In the traditional ASIC implementation flow, accurate interconnect delay can only be obtained at a late routing stage. However, the freedom to modify the design at this stage is limited. Most of the delay comes from the IC's interconnect, so the tool flow needs accurate interconnect delay information as early as possible, and should allow continuous optimization in different stages to correctly reflect the real interconnect delay. An optimal interconnect architecture is the key to this methodology.

## **DESIGN PRODUCTIVITY**

The increase in design complexity, measured in unique logic gates per chip, together with increasing clock frequencies, has pushed ASIC limits to new dimensions. Time-to-market pressure continues to drive chip design turnaround time requirements downward while the design capacity of a chip increases. The result is an exponential increase in the required design productivity.

**Design Flow** – The design flow is continuously improving within IC design houses and EDA technology providers. A typical design flow analysis covers every designer-executed step, every iteration and redo of steps, and the time spent in and between each step. For example, one of IBM's ASIC groups has been able to reduce the number of steps executed by the designer from around 200 to 130 in one year alone, based upon such analysis and subsequent changes in the design methodology. These changes include encapsulating sequences of multiple steps into one, and moving the discovery of problems that cause redo to a point earlier in the cycle or eliminating them entirely. Tool development by IBM and its research partners has substantially reduced the time required for complex design steps such as layout timing closure, as has the application of faster and highly parallel CPUs to such performance-intensive steps.

**Tool Integration** - Tool integration is a key factor in developing a fully efficient design flow and methodology. It includes the coupling of previously separate design steps and algorithms into a single algorithm. It also involves the careful selection of designer-driven steps for automation: continuing to leverage the designer's knowledge and decisions in the design process, while automating the sequences that take place between decisions. Finally, integration moves from measuring and fixing problems discovered late in the design cycle to incorporating these measurements into the tool processes that create the design initially, thus preventing the problem from ever being introduced (i.e., an avoidance action). Tool integration can be applied in a wide variety of design flow segments. For example:

1. Placement-based synthesis tool flows for early and late timing closure merge the operation of gate placement (where interconnect timing estimates can be highly accurate) with synthesis (where logic timing optimization is performed). The integration of synthesis and placement has been extended to wiring congestion avoidance and timing-driven global routing.

2. Cross-coupling of nets necessitates the detection of timing changes or possible false switching due to activity in a nearby routed net.

3. Incremental timing, and combining multiple cross-chip process variations into a single path analysis, reduce the number of timing runs. This is now evolving into statistical timing approaches to account for device-specific process variation.

4. Design planning methodologies move the designer's decisions for logical and physical partitioning, floor-planning, and timing closure from later in the design flow (when changes require larger turnaround time) to an earlier timeframe (when changes can be made rapidly). Design planning further drives increased automation into the final stages of the flow, reducing the schedule's critical path.
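The move toward statistical timing mentioned in item 3 can be illustrated with a small sketch that compares a worst-case corner bound with a statistical bound for one path. It assumes each stage delay is an independent Gaussian, which is a strong simplification (real variation has correlated components), and the stage values are hypothetical.

```python
import math


def corner_path_delay(stage_means, stage_sigmas, n_sigma=3.0):
    """Worst-case corner: every stage is assumed slow at the same time."""
    return sum(m + n_sigma * s for m, s in zip(stage_means, stage_sigmas))


def statistical_path_delay(stage_means, stage_sigmas, n_sigma=3.0):
    """Statistical bound assuming independent Gaussian stage delays.

    Means add linearly; variances add, so sigmas combine root-sum-square.
    """
    mean = sum(stage_means)
    sigma = math.sqrt(sum(s * s for s in stage_sigmas))
    return mean + n_sigma * sigma


# Hypothetical 16-stage path: 100 ps mean and 5 ps sigma per stage.
means = [100.0] * 16
sigmas = [5.0] * 16

print("corner bound (ps):     ", corner_path_delay(means, sigmas))
print("statistical bound (ps):", statistical_path_delay(means, sigmas))
```

Under the independence assumption the statistical bound is noticeably tighter than the corner bound, which is the pessimism that statistical approaches aim to recover.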

#### Flat and Hierarchical Design Methods

Managing turnaround time and design productivity leads, for each chip design, to an evaluation of tradeoffs between flat and hierarchical design methods. Differences within the chip design, as well as the organization of the design project, affect how the advantages and disadvantages of flat and hierarchical design apply. In many cases, a combination of both approaches leads to the fastest solution.

Flat design allows the complete chip design to be solved as a single placement and routing problem. The ability to globally optimize placement and logic for the entire design allows paths between synthesis partitions to be optimized. Avoiding hard physical partition boundaries can lead to higher utilization of the chip.

Hierarchical design requires partitioning of the design, and can constrain optimization of the physical design. It can, however, be a powerful technique for design architectures with natural functional "islands," and can be particularly leveraged when different design teams work on different islands at the same time, running these smaller designs in parallel. Partitioning can localize the problems of timing closure and wireability, minimizing the issues of global timing and wiring congestion. If the final design change only affects one or a few partitions, the entire design may not have to be reprocessed. Hierarchical design, on the other hand, requires additional design steps including partitioning, partition pin management, planning wiring resources between the partitions and the top level, integrating the partitions, and resolving global timing and wiring issues at the top level. Hierarchical techniques have been developed for optimal benefit on many designs. Typically we consider the following types of hierarchical approaches.

**Partial Hierarchy**: A critical logic partition is designed as a hierarchical block, while the remainder of the chip is easily closed as a flat design.

**Hybrid Hierarchy**: Placement and timing closure are partitioned, but routing is done flat. This allows flat versus hierarchical tradeoffs to be made separately for placement and routing. Further, this allows the additional wiring-related design steps for hierarchical routing to be avoided. With incremental wiring, a final change to only a single block requires only localized rewiring, despite the fact that routing was initially flat.

**Soft Hierarchy** (region constraints): Localized placement objectives, such as timing closure and placement density, are enabled by logical partitioning. However, the partitioning is soft, avoiding most of the hierarchical design steps.

Figure 1 shows a chip designed with a mixture of flat techniques, and hard and soft hierarchical techniques.



Figure 1: Hierarchy integration. Image source: UCLA

## **Packaging Priorities**

Nanometer ASICs are extremely susceptible to resistance- and inductance-induced instantaneous voltage drop, because of the higher resistance, the higher signal frequency, and the lower noise margin. This issue is addressed in the steps of power planning, packaging, and placement along the flow. The appropriate package is determined by the total power consumption. To minimize the resistance of the power supply, the top thick metal layers are fully used to build the power mesh. The amount of decoupling capacitance needed for each portion of the chip is calculated to reduce the noise on the power lines, which may otherwise result in a delay shift or wrong switching.
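As a rough illustration of the decoupling capacitance calculation, the sketch below uses the common first-order charge-balance estimate C = ΔI · Δt / ΔV. This formula, the function name, and the block numbers are assumptions made for illustration; a production flow would rely on extracted grid and package models instead.

```python
def required_decap(delta_current_a, response_time_s, allowed_droop_v):
    """First-order decoupling capacitance estimate (farads).

    The capacitor must supply the current step `delta_current_a` for
    `response_time_s` while the rail sags by at most `allowed_droop_v`.
    """
    if allowed_droop_v <= 0:
        raise ValueError("allowed droop must be positive")
    return delta_current_a * response_time_s / allowed_droop_v


# Hypothetical block: 2 A current step, 1 ns until the package supply
# responds, and a droop budget of 5% of a 1.0 V rail.
VDD = 1.0
droop_budget = 0.05 * VDD
decap = required_decap(2.0, 1e-9, droop_budget)
print(f"decap needed: {decap * 1e9:.1f} nF")
```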

Interconnect heat dissipation in the signal and clock nets can no longer be ignored because of the smaller interconnect size and the poorer thermal conductivity of low-k materials. Some wires could reach a temperature much higher than the substrate temperature during operation, causing signal integrity degradation or reliability problems.

In efficient design flows, the RMS current of each signal or clock net is analyzed to determine the wire width and the number of vias needed. In this way, the methodology guarantees that the current-induced temperature increase in the signal and clock nets is within the spec under the worst operating condition. A wire thickness model should be introduced in physical analysis to calculate the wire thickness based on the layout. The success of this approach depends on the consistency of the CMP process, parameter extraction, and model accuracy. This is a challenging task, especially in a foundry production environment.
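The RMS-current sizing step described above might look roughly like the following sketch. The current-density limit, per-via limit, wire thickness, and sample waveform are all hypothetical; real limits come from the foundry's design rules and depend on temperature and layer.

```python
import math


def rms(samples):
    """Root-mean-square of uniformly spaced current samples (amps)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def min_wire_width_um(i_rms_a, j_rms_max_a_per_um2, thickness_um):
    """Smallest width that keeps RMS current density under the limit."""
    return i_rms_a / (j_rms_max_a_per_um2 * thickness_um)


def min_via_count(i_rms_a, i_rms_per_via_a):
    """Number of vias so that no via exceeds its RMS current limit."""
    return max(1, math.ceil(i_rms_a / i_rms_per_via_a))


# Hypothetical sampled current on a clock net, in amps.
waveform = [0.0, 2e-3, 4e-3, 2e-3, 0.0, -2e-3, -4e-3, -2e-3]
i_rms = rms(waveform)

width = min_wire_width_um(i_rms, j_rms_max_a_per_um2=0.01, thickness_um=0.2)
vias = min_via_count(i_rms, i_rms_per_via_a=1e-3)
print(f"I_rms = {i_rms * 1e3:.2f} mA, width >= {width:.2f} um, vias >= {vias}")
```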

In a nanometer Cu/low-k process, a design that passes all levels of verification does not necessarily achieve high yield. The Cu/low-k process makes the concept of "design for manufacturability" (DFM) extremely important. In an effective ASIC design flow, DFM is always one of the key considerations. For the Cu/low-k interconnect, wire density plays an important role in controlling the wire thickness. However, setting a density window is not an effective way to gain control, because it unnecessarily limits the design capability. Achieving good pattern uniformity is a much better way to minimize the wire thickness variation, and a dummy metal insertion methodology is a key factor in achieving the best uniformity. Nanometer ASIC design has been strongly influenced by on-chip interconnect scaling. An effective design flow for the nanometer era has to have a design-aware and process-aware interconnect methodology. This methodology should be well integrated across multiple areas to address the issues effectively.
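A toy view of how dummy metal insertion improves pattern uniformity is sketched below. The window grid, density values, and target are hypothetical, and real fill tools also account for coupling capacitance and spacing rules, which this sketch ignores.

```python
def fill_to_target(window_densities, target):
    """Return densities after adding dummy metal up to `target` per window.

    Windows already at or above the target are left unchanged, since
    dummy fill can only add metal, not remove it.
    """
    return [[max(d, target) for d in row] for row in window_densities]


def density_spread(window_densities):
    """Difference between the densest and the sparsest window."""
    flat = [d for row in window_densities for d in row]
    return max(flat) - min(flat)


# Hypothetical metal density per layout window (fraction of area covered).
before = [
    [0.10, 0.35, 0.60],
    [0.25, 0.70, 0.40],
]
after = fill_to_target(before, target=0.50)

print("spread before fill:", round(density_spread(before), 2))
print("spread after fill: ", round(density_spread(after), 2))
```

The spread between windows, rather than the absolute density in any one window, is the quantity the fill step tries to reduce in this simplified picture.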

## **DESIGN QUALITY**

The major benefit of effective design quality practices is the elimination of redo of the manufactured chip, where the schedule impact of redo is most costly. Design quality is partly based on error-free execution of the design process. The key technical aspects of the design process that are critical for a project's success can be summarized as follows:

1. Static timing analysis and timing modeling characterized to the logic circuit and physical chip implementations, and tuned to the target manufacturing processes.

2. Race-free full-scan Design-For-Test structures, with full boundary scan, enabling completely automated test and diagnostic pattern generation.

3. Correct-by-construction physical templates of the chip that provide robust power distribution, signal and power I/O locations, and locations for logic placement.

4. Technology and manufacturing specific checking of the logical and physical implementations.

5. Equivalency checking to ensure the final logical implementation is the same as that provided originally by the ASIC designer.

6. Broadened timing analysis to detect and eliminate issues due to cross coupled noise and power supply drop. Now, noise-avoidance methods are applied to the global routing and placement steps.

7. Extended Design-For-Test techniques able to provide the increased test data volume of huge gate counts, and able to identify delay-based defects, the need for which increases with decreasing circuit and process geometries.

8. Automated image generation to allow specific permutations of the predefined image types, including chip size, I/O types, power structure, signal pre-wiring, and multiple placement terrains.

#### **POWER MANAGEMENT**

Power has probably been the 'hottest' area in nanometer design for the last decade. Power consumed by CMOS circuitry consists of active power, whose primary component is dynamic signal switching, and static power, which is produced mainly by leakage current. Active (dynamic) power can be expressed as:

$$P_{active} = C \cdot V_{dd}^{2} \cdot F$$

Whereas each successive technology generation decreased the Vdd requirement by around 30%, this has been offset by a corresponding 30% increase in capacitance per unit area.
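Taking the approximate 30% figures above at face value, a short calculation shows how the active power expression changes across one generation. The scaling factors are the rough values quoted in the text, and the "break-even" frequency factor is a derived illustration, not a measured result.

```python
def active_power(c, vdd, f):
    """Dynamic switching power, P = C * Vdd^2 * F."""
    return c * vdd ** 2 * f


# One technology generation, using the approximate figures from the text.
VDD_SCALE = 0.7   # Vdd reduced by about 30%
CAP_SCALE = 1.3   # capacitance per unit area increased by about 30%

# Power per unit area at unchanged frequency, relative to the old node.
per_area_same_freq = active_power(CAP_SCALE, VDD_SCALE, 1.0)

# Frequency increase beyond which active power per unit area rises.
break_even_freq_scale = 1.0 / per_area_same_freq

print(f"relative power at same frequency: {per_area_same_freq:.3f}")
print(f"break-even frequency factor:      {break_even_freq_scale:.2f}")
```

Any frequency increase above the break-even factor raises active power per unit area in this simplified model.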

Given the increasing required frequency of product design by generation, the overall effect is increased active power. The most dominant component of leakage current is the circuit's sub-threshold transistor current. Transistor performance has been increased through reduced oxide thickness (Tox), which for reliability requires a drop in Vdd and a corresponding drop in threshold voltage (Vt) to provide performance. The combined reduction in Tox and Vt increases leakage current, which in 90 nm technologies has emerged as equal in importance to active power, as shown in Figure 2.



Figure 2: Active/Leakage Power Density (Via LPOLY Width). Image source: VLSI LAB; Bell-Lucent

Managing power consumption in an ASIC design can be addressed at multiple levels:

1. Circuit or library level.
2. Logic design level, including design optimization.
3. Architectural level.

Multiple circuit libraries providing options for Vt can provide logic design and optimization options for increasing performance in a given logic path at the expense of increased static power (low Vt), or conversely reducing power in a path that meets, with extra margin, the required performance (high Vt). Taking advantage of multiple Vt options is in the realm of the logic designer and the synthesis and layout optimization tool flows.
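The multi-Vt tradeoff described above can be sketched as a simple per-cell decision. The `Cell` fields, the numbers, and the greedy rule are hypothetical; in particular, treating each cell's slack as independent is a simplification, since cells on the same path share slack and a real flow would re-run incremental timing after each swap.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    slack_ps: float          # timing slack with the low-Vt version
    delay_penalty_ps: float  # extra delay if swapped to high Vt
    leak_low_vt_nw: float    # leakage of the low-Vt version
    leak_high_vt_nw: float   # leakage of the high-Vt version


def assign_vt(cells):
    """Pick high Vt wherever the slack absorbs the delay penalty."""
    choice = {}
    for cell in cells:
        if cell.slack_ps >= cell.delay_penalty_ps:
            choice[cell.name] = "high_vt"   # saves leakage, still meets timing
        else:
            choice[cell.name] = "low_vt"    # timing-critical, keep fast cell
    return choice


def total_leakage_nw(cells, choice):
    """Sum leakage over all cells for the chosen Vt flavors."""
    return sum(
        c.leak_high_vt_nw if choice[c.name] == "high_vt" else c.leak_low_vt_nw
        for c in cells
    )


cells = [
    Cell("u1", slack_ps=5.0, delay_penalty_ps=20.0, leak_low_vt_nw=50.0, leak_high_vt_nw=5.0),
    Cell("u2", slack_ps=80.0, delay_penalty_ps=20.0, leak_low_vt_nw=50.0, leak_high_vt_nw=5.0),
    Cell("u3", slack_ps=40.0, delay_penalty_ps=25.0, leak_low_vt_nw=50.0, leak_high_vt_nw=5.0),
]
choice = assign_vt(cells)
print(choice, total_leakage_nw(cells, choice), "nW")
```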

Further, Vt libraries can be assigned architecturally to entire functional partitions, providing high performance yet high leakage for the highest performance applications, while reducing power (with high Vt) for logic blocks that can operate at lower performance. When leveraging multiple Vt, optimization tools must consider the maximum allowable leakage current for the chip in the test and product environments. Further, integrating multiple transistor design points (multiple Vt libraries) within a single chip extends the need to consider device-specific process variation in the timing signoff tools.

Logic design techniques for reducing active power include drive strength reduction for non-timing-critical logic paths, glitch-free combinational logic, disabling unobserved combinational blocks, gating the clock locally for registers that retain logic state across several cycles, allowing clock skew to reduce simultaneous switching, and double-edged clocking. Managing power at the architectural level can provide significant leverage in reducing chip power, and has therefore become standard in ASIC design flows.

## NOISE AND SIGNAL INTEGRITY

Noise and signal integrity are typical nanometer design challenges. Coupled noise was the most problematic form of noise in digital designs using the 0.18 and 0.13 micron technology nodes, and design methodologies have been developed for avoiding, detecting, and fixing coupled noise problems. IR drop (both AC and DC) became the predominant noise problem in 90 nm and below. While power densities have increased or remained the same due to thermal considerations, the supply voltage has continued to scale.

This results in more current-per-unit-area on the chip. Metal lines have also scaled, raising resistance in the on-chip power distribution. Additionally, transistor threshold voltage has not scaled due to the exponential increase in leakage current that would result, thereby resulting in circuits that are more sensitive to IR drop, due to decreased (Vgs - Vt). The analysis of the ASIC power supply system requires knowledge of the power distribution design of the card, the package, and the chip. Unfortunately, the power supply response of the system incorporating the ASIC is dependent on functional patterns, and a representative pattern set is rarely available. This forces the ASIC designer to develop a robust power distribution to minimize IR drop due to power consumption. The physical design methodology needs to consider not only the design of the power grid, but also the placement of high-current cells, and the number and location of decoupling capacitors needed for reducing the effects on the power distribution.
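Two of the effects described above reduce to simple arithmetic, sketched here with hypothetical operating points: current per unit area is power density divided by supply voltage, and the share of gate overdrive lost to a given supply drop grows as (Vgs - Vt) shrinks. The sketch assumes Vgs equals Vdd, and all voltages and densities are illustrative.

```python
def current_density(power_density_w_per_mm2, vdd):
    """Supply current per unit area (A/mm^2) for a given power density."""
    return power_density_w_per_mm2 / vdd


def overdrive_loss_fraction(vdd, vt, ir_drop):
    """Fraction of gate overdrive (Vgs - Vt) lost to a supply drop.

    Assumes Vgs equals Vdd, a simplification for illustration.
    """
    return ir_drop / (vdd - vt)


# Hypothetical older and newer operating points at equal power density.
old = {"vdd": 1.8, "vt": 0.45}
new = {"vdd": 1.0, "vt": 0.35}
POWER_DENSITY = 0.5  # W/mm^2, assumption

for label, node in (("old", old), ("new", new)):
    j = current_density(POWER_DENSITY, node["vdd"])
    # The same relative supply drop (5% of Vdd) at both operating points.
    loss = overdrive_loss_fraction(node["vdd"], node["vt"], 0.05 * node["vdd"])
    print(f"{label}: {j:.3f} A/mm^2, overdrive lost to 5% drop: {loss:.1%}")
```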

Reliability wear-out mechanisms that were safely guard-banded against in the past, such as negative bias temperature instability (NBTI) and hot carrier effects, must be considered during the design of the ASIC to ensure proper function over the life of the part. Both NBTI and hot carrier effects result in degraded transistor performance over time. The effects, as seen in 0.13 micron and 90 nm silicon processes, can result in significant delay changes. Unfortunately, the delay change is not uniform for every path in the design, due to differences in the path delay components (wire-dominated vs. circuit-dominated, rise delay vs. fall delay, etc.). This draws attention to the differential variation between two paths with common dependencies (setup and hold checks), and not only to the logic cycle. The amount of margin in the design will vary as a function of time as the paths degrade at different rates, and this needs to be accounted for in the timing analysis. Unlike NBTI, the impact of hot carrier effects is a function of individual node switching activity. Clocks degrade more than logic because they switch more often, and clock gating can actually create additional clock skew as the design ages.

#### **SoC INTEGRATION - Functional SoC**

A typical System-on-Chip (SoC) can be characterized both by large design size as measured in gates and integration of functional blocks. In the past, it made sense to deliver library based designs or IP (intellectual property) in a synthesized netlist form. This provided a reasonable assurance that when incorporated into the chip design, the function would operate at the performance determined in the original logic synthesis.

For digital IP with higher performance requirements, a fully implemented hard core is pre-defined and integrated as a physical block in the target SoC. However, the increased dominance of interconnect delay (as described earlier) has made it far more difficult to optimize a function to a given performance level outside the context of the intended chip design. Further, the success of placement-based synthesis methodologies means that the best results are generally obtained by integrating logic synthesis and placement within the context of the target chip. With these factors in mind, the following levels for delivering library-based designs and integrating predefined digital functions emerge:

1. Standard performance: Provide RTL that can be synthesized and placed together with the SoC's remaining RTL.

2. Higher performance: Provide synthesized and placed gates that can be flattened into, or integrated hierarchically into, the SoC's floor-plan.

3. Highest performance: Provide fully laid-out hard cores, to be integrated into the SoC's floor-plan.

With the need to implement a SoC comprised of functional blocks from multiple sources with possibly different development schedules, comes the need for project management and methods to integrate design data and flow that are able to deal with this complexity. System-level design planning tools and methodologies will need to extend from their existing physical floorplanning features, into the realm of functional architecture, integrating both. The needs include:

1. Partitioning for power and path performance.

2. Determination of clock domains, including frequency and physical distribution.

3. Architectural performance modeling and functional/physical pipeline planning.

4. Characterization and/or abstraction of the above attributes, and their application in high-level SoC design.

## Technology "Islands"

Latest technology generations have provided far greater functional integration on a single chip. This integration brings together functional components implemented in varying circuit families and/or physical layout architectures, varying voltage operating points for specified operation, analog and digital designs, and varying design platforms such as standard cell and FPGA. Diverse design requirements, different optimal design points, varying flexibility, and product schedule preclude redesigning all functions into a common homogeneous physical structure.

And of course, power management is of ever-increasing importance. Further, the diverse manufacturing test requirements of the functional components must be integrated into a single test process for the chip.

Voltage Island techniques provide a functional block with a voltage source that can be unique from other functional blocks of the chip. An SoC comprised of multiple Voltage Islands can provide each functional block with the specific voltage level needed to meet its required performance. Therefore, substantial power savings can be realized for functional blocks of lower performance and thus lower voltage requirement, even when there are other, much higher performance functions on the chip. The power to a Voltage Island can be uniquely switched, whereby an SoC of multiple Voltage Islands need only provide power to active functions, a capability valuable to low-power or battery-powered applications.

Mixed Terrain techniques allow each functional block to use a circuit library and a corresponding circuit placement row pattern optimized to the performance and wireability requirements of the function. For example, a lower performance function with low wiring congestion can be implemented in a physically smaller block by using a lower-track circuit library and placing these circuits in a high-density structure of corresponding circuit row sizes.
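The power benefit of Voltage Islands follows from the quadratic dependence of active power on supply voltage. The sketch below compares a single shared supply with per-block supplies and with one island switched off. Block names, capacitances, voltages, and frequencies are hypothetical, and the model ignores leakage and the overhead of level shifters and regulators.

```python
def chip_active_power(blocks, shared_vdd=None):
    """Sum C * Vdd^2 * F over powered blocks (watts).

    If `shared_vdd` is given, every block runs at that voltage;
    otherwise each block uses its own island voltage.
    """
    total = 0.0
    for block in blocks:
        if not block.get("powered", True):
            continue  # a switched-off island draws no active power
        vdd = shared_vdd if shared_vdd is not None else block["vdd"]
        total += block["cap_f"] * vdd ** 2 * block["freq_hz"]
    return total


# Hypothetical blocks: a fast core and a slower I/O function.
blocks = [
    {"name": "cpu", "cap_f": 2e-9, "vdd": 1.2, "freq_hz": 800e6},
    {"name": "io", "cap_f": 1e-9, "vdd": 0.9, "freq_hz": 200e6},
]

single_supply = chip_active_power(blocks, shared_vdd=1.2)
with_islands = chip_active_power(blocks)

blocks[1]["powered"] = False  # switch the idle I/O island off
io_off = chip_active_power(blocks)

print(f"single supply: {single_supply:.3f} W")
print(f"with islands:  {with_islands:.3f} W")
print(f"io island off: {io_off:.3f} W")
```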

While a First Time Right methodology takes much of the risk out of the physical and electrical implementation of an SoC design, there is still a chance that a logic error may be introduced by the logic designer. While gate array backfill can reduce a logic fix to a simple wire change, the use of an embedded programmable FPGA block can further mitigate the risk of error. By implementing "risky" logic in a programmable FPGA, a logic error can be repaired without the cost and schedule impact of a chip re-spin.

Embedded FPGA logic is slower and less area-efficient than standard cell logic, so architectural planning is required to leverage this capability. Hierarchical physical design approaches become necessary for integrating a design of functional/technology "islands". Uniquely by island, a physical architecture can be defined including circuit row topology, power distribution structure, Vdd supply, transistor threshold and/or transistor voltage bias supply, and island-specific circuitry for voltage level shifting, voltage regulation and switching, capacitive decoupling, and electrostatic discharge. To provide these approaches, a highly flexible methodology for detailed design planning becomes extremely important for managing and automating the implementation and verification of the chip structures needed to integrate all these physical design methods into a single SoC. Further, early design planning methods, such as functional partitioning and architecture-level timing and power analysis, must be extended to assist the SoC designer in making effective use of these integrated approaches. The traditional chip-level design tradeoff mix of circuit density, wireability, performance, and power becomes more complex through the mix of applications across the SoC.

Today's dominant approach to testing a manufactured ASIC uses full-scan design for test (DFT) structures and automatic test pattern generation. This approach derives much of its benefit in productivity, test quality, and ability to diagnose failures from consistent and predictable DFT structures across the entire chip.

A functional/technology "island" often brings with it a unique DFT design and pattern application requirement that differs from the overall ASIC. Such is the case when integrating analog IP and embedded FPGAs, for example. Creating test data for diverse components may require unique test data development or characterization, and the resulting data stream must be integrated into the data stream of the overall chip, which may continue to be based at least in part on traditional full-scan methods.

#### CONCLUSION

Clearly, ASIC methodologies will continue to yield ground to programmable solutions at the low end of the performance curve, especially for designs with quick-prototyping requirements or with low expected volume over the design's lifetime. High-end ASICs using careful design techniques in certain highly specialized applications have already achieved clock speeds of 800 MHz or more, a performance level far beyond the reach of state-of-the-art programmable solutions.

In the near future a new generation of design tools will enable more and more ASIC designers to push this performance envelope. Tools and technologies designed for hybrid optimization will allow designers to harness, from within standard-cell-based design flows, the power of custom transistor-level optimization techniques heretofore available only through handcrafting of designs. Such optimizations will eliminate bottlenecks posed by the fixed set of standard cells found in a pre-designed library, and allow the use of dynamically created and sized building blocks, while retaining the benefits of the cell-based ASIC design infrastructure and methodologies. Furthermore, these advanced technologies will be complemented by new physical-design tools that harness the power of structured physical design (common in handcrafted designs, and totally alien to automatically generated physical layouts) while staying within an automated design framework. This new generation of physical-design tools will likely include a mix of known structured layout techniques, like tiling for specialized data-path blocks, and radically orthogonal layout techniques such as route-place-route, to achieve a degree of layout compactness that has previously eluded automated layout.

The combination of transistor-level design optimization and structured layout is expected to affect every measure of quality of ASIC designs. Most important, it holds the promise of performance gains without a rise in area or power consumption, as demanded in cell-based designs today. In fact, the expected reduction in transistor count through design optimization and greater compactness in layout will likely allow reduction in area and power while improving performance, as compared to a reference design with the same functionality implemented using an existing cell-based methodology.

Designing hundred-million-gate chips, made possible by nanometer silicon technologies, has first presented the challenge of managing massive design size and complexity, while product performance requirements continue to increase and time-to-market requirements continue to shrink. Design productivity gains and design schedule reductions are being realized through design process improvement and tool integration. A comprehensive strategy for design closure of large flat designs, hierarchical designs, and combinations of both can provide the path to the earliest design closure solution. First-Time-Right design provides the greatest benefit to time-to-market, and thus design quality methods continue to rise in importance with increased design content and complexity. Silicon density at the 65 nm level has increased the need to manage active and static power within the design, at the circuit, logic, and architecture levels.

Signal integrity issues, and their avoidance through design techniques, were presented, including coupled noise, IR drop, and reliability wear-out mechanisms. Leveraging massive design capabilities in a single SoC leads to the integration of diverse functional components. Functional organization and chip organization must be combined into a single design planning solution. The functional components comprising the SoC can be diverse in terms of optimum library and technology, operating point, implementation platform, and test methodology. This has led to design integration methodologies including Voltage Islands, mixed library/placement terrains, and embedded FPGAs. When quality matters, the emerging ability to squeeze out extra performance through transistor- and layout-level optimizations favors ASIC designs well into the nanometer era.


[15] S.Oakland, J.Monzel, R.Bassett, P.Gillis, "An ASIC Foundry View of Design for Test," Proceedings of the IEEE International Test Conference, 1994.

[16] A.Kuelmann *et al.*, "Verity – a Formal Verification Program for custom CMOS circuits," *IBM Journal of Research and Development*, Vol.39 No.1/2, pp.149-165, Jan/March 1995.

[17] M.R. Becer, D.Blaauw, R. Panda, I.H. Hajj, "Early Probabalistic Noise Estimation for Capacitively Coupled Interconnects," IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, Vol.22, No.3, March 2003.

[18] E. Enomoto, "Low Power Design Technology for Digital LSIs," *IEICE Transactions on Electronics,* Vol. E79-C, No.12, pp.1639-1649, December 1996.

[19] S. Borkar, "Design Challenges of Technology Scaling," IEEE Micro, Vol. 19, No. 4, pp.23-29, July-August 1999.

[20] A.O. Adan and K. Higashi, "OFF-State Leakage Current Mechanisms in BulkSi and SOI MOSFETs and Their Impact on CMOS ULSIs Standby Current," IEEE Transactions on Electron Devices, Vol. 48, No. 9, pp.2050-2057, Sept. 2001.

[21] E.J. Nowack, "Maintaining the Benefits of CMOS Scaling when Scaling Bogs Down," *IBM Journal of Research and Development*, No. 2/3 March/May 2002.

[22] A.Dean, G. Garrett, M. Stan, S. Ventrone, "Low Power Design for ASIC Cores," *VLSI Design*, 2000, pp.1-15.

[23] J.G. Xi, W. W-M. Dai, "Low Power Clock Distribution," *Low Power Design Methodologies*, Copyright 1996 by Kluwer Academic Publishers, Chapter 5. [24] T.Stoehr *et al.* "Analysis, Reduction and Avoidance of Crosstalk on VLSI Chips," *Proceedings of the ISPD*, 1998.

[25] J.A Darringer, *et al.*, "Early analysis tools for System-on-chip design," *IBM Journal of Research and Development*, Vol.46, No.6, pp. 691-707, Nov. 2002.

[26] D.E. Lackey, P.S. Zuchowski, T.R. Bednar, D.W. Stout, S.W. Gould, J.W. Cohn, "Managing Power and Performance for System-on-Chip Designs using Voltage Islands," *Proc. IEEE Conference on Computer-Aided Design*, Nov. 2002.

[27] P. S. Zuchowski, C. B. Reynolds, R. J. Grupp, S. G. Davis, B. Cremen, B. Troxel, "A Hybid ASIC and FPGA Architecture", *Proc. IEEE Conference on Computer-Aided Design*, Nov. 2002.

[28] Chang, C.C., Cong, J., and Xie, M., "Optimality and Scalability Study of Existing Placement Algorithms," Asia Pacific Design Automation Conference, Jan. 2003.

[29] Chan, T., Cong, J., Shinnerl, J., and Sze, K., "An Enhanced Multilevel Algorithm for Circuit Placement," International Conference on Computer-Aided Design, Nov. 2003.

[30] Karypis, G., et. al., "Multilevel Hypergraph Partitioning: Application in VLSI Domain," Design Automation Conference, June 1997.

[31] Cong, J., Xie, M., and Zhang, Y., "An Enhanced Multilevel Routing System," International Conference on Computer-Aided Design, Nov. 2002.

[32] Cong, J., Fan, Y., Han, G., Yang, X., and Zhang, Z., "Architecture and Synthesis for On-Chip Multicycle Communication," IEEE Trans. on CAD, Vol. 23, April 2004.

[33] Carloni, L., McMillan, K., and Sangiovanni-Vincentelli, A., "Latency Insensitive Protocols," International Conference on Computer-Aided Verification, July 1999.

[34] Smith, G., and Nadamuni, D., "2003 ESL Landscape," ID Number: SEMC-WW-DP-0259, Dataquest, April 2003.

[35] Goering, R., "Platform-based Design: A Choice, Not a Panacea," EE Times, Sept. 2002.

[36] Sangiovanni-Vincentelli, A., and Martin, Grant, "Platform-Based Design and Software Design Methodology for Embedded Systems," IEEE Design and Test of Computers, Volume 18, Number 6, November-December 2001.

[37] Sangiovanni-Vincentelli, A., et. al., "Benefits and Challenges for Platform-Based Design," Design Automation Conference, June 2004.

[38] Cong, J., Fan, Y., Han, G., and Zhang, Z., "Application-Specific Instruction Generation for Configurable Processor Architectures," Field-Programmable Gate Arrays, Feb. 2004.

[39] Snyder, C., "Structured ASICs Offer Application Adaptability," www.synplicity.com/literature/pdf/v1-1\_adaptability1.pdf, SemiView, Dec. 2003.