[IEEE High Density Packaging (ICEPT-HDP) - Xi'an, China (2010.08.16-2010.08.19)] 2010 11th International Conference on Electronic Packaging Technology & High Density Packaging - Scaling system performance through packaging and interconnect: A study in networking applications

Scaling System Performance through Packaging and Interconnect: A Study in Networking Applications

Judy Priest
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95135
Email: [email protected]

Abstract

High end networking and computing applications continue to drive silicon technologies to higher data rates, increased storage capacity, and increased bandwidth. Interfaces are transitioning toward serial technologies, even for short distance data transfer. Voltage supply rails continue to drop and, consequently, noise margins shrink, even as the number of simultaneously switching signals increases. The ratio of leakage current to active current is also increasing for high performance silicon processes, as is the demand for more instantaneous switching current. Power integrity and power distribution design become more prevalent issues in electrical design, as they dictate the efficiency of the current draw available to switch on-chip circuits. Silicon integration and device scaling still lead to overall higher performance, but there are practical limits to the yield and assembly reliability of very large die. Packaging begins to play a more critical role in improving performance through scaling of interconnect. This paper examines the design tradeoffs for high end networking chips for performance and cost optimization. Two examples are shown where packaging design is directly linked to system level performance.

System Performance

Networking applications are used for communication, voice, video, and data transfer. They range from end points such as handheld consumer mobile and wired products through high end, very large racked enterprise applications. The usage models, environment, and reliability requirements for these products also vary broadly depending on the application. For purposes of this paper, the focus is on high end applications, which tend to require more complex, performance driven hardware.
Like computers, network system performance is generally and simplistically determined by three factors: bandwidth, capacity, and latency. Bandwidth across an interface is defined as the rate of data throughput. It is related to the frequency of switching and operation, but the two terms should not be used interchangeably. Capacity is largely determined by memory size, that is, how much data storage is available. Latency is important in many lookup functions and transactional applications. Together, how much data can be stored, how long it takes to access that data, and how much of it can be read in and out contribute to the performance metrics for that machine.

A hardware system consists of many components: sheet metal, fans, power supplies, subassemblies, etc. Examples of hardware rack systems for high performance networking are shown in Figure 1. These are from the Cisco Systems, Inc. Catalyst product family of enterprise class switches. The two major components of the electronics subsystem are essentially some combination of active devices and passive interconnect. Active devices are silicon based and generally follow the process technology roadmap dictated by the semiconductor foundry. The system interconnect includes component packaging, printed circuit board(s), connectors, wiring, and cables.

Silicon vs. Interconnect

Most companies trying to build high performance hardware utilize the most aggressive integrated silicon technologies available. The fine feature sizes of smaller silicon process nodes generally correspond to faster gates, greater density, and faster operating frequencies. These metrics can improve 15-20% just through silicon process node shrinks. Figure 2 shows the scaling of transistor speed of high performance logic for various fabrication technologies [1]. Given this, it could be argued as a corollary that silicon integration offers the best performance possible.
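To put the 15-20% per-node figure in perspective, the short Python sketch below compounds that gain over several process node shrinks. Note that the compounding itself is an assumption made for illustration: the text states only the per-shrink range, and the function name `compounded_gain` is hypothetical.

```python
def compounded_gain(per_node_gain: float, nodes: int) -> float:
    """Relative performance after `nodes` shrinks.

    Assumption (not from the paper): each shrink multiplies performance
    by (1 + per_node_gain), i.e. the gains compound.
    """
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return (1.0 + per_node_gain) ** nodes


if __name__ == "__main__":
    # 0.15 and 0.20 are the bounds of the 15-20% range quoted in the text.
    for nodes in range(1, 5):
        low = compounded_gain(0.15, nodes)
        high = compounded_gain(0.20, nodes)
        print(f"{nodes} shrink(s): {low:.2f}x to {high:.2f}x")
```

Under this assumption, the silicon-only gain is bounded by how many node shrinks a design can absorb, which is the context for the interconnect argument that follows.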
When advanced silicon performance is coupled with interconnect, the resulting performance can scale faster than even Moore’s Law [2, 3].

Figure 1: Examples of Cisco Systems chassis systems.

2010 11th International Conference on Electronic Packaging Technology & High Density Packaging, 978-1-4244-8142-2/10/$26.00 ©2010 IEEE

This is essentially true for theoretical and linear scaling, but what about in actual design practice? IP design implementation, isolation requirements, and grid based physical design rules reduce performance and utilization relative to Moore’s Law. To compensate for this, foundries have gone to half node shrinks for bulk CMOS, or to the use of SOI and other fabrication technologies. With the exception of a few captive fabs, most of the industry moves along the same curve for silicon process nodes. Differentiation usually comes from IP enabled devices, designs, and architecture, not just the silicon process node. Special purpose devices include low voltage threshold transistors, deep trench capacitors, dense memory cells, etc. IP includes low jitter PLLs, high speed serial and parallel interfaces, etc. Examples of architecture differentiation include centralized versus distributed architectures, cut through versus store and forward, hardware and software interaction, etc.; the list of examples is endless.

This opens the stage for the use of interconnect as a key enabler for scaling performance. Interconnect is no longer just passive wiring that connects active devices; it can increase performance and enable products. Can packaging and interconnect actually increase performance in a way that is measurable from the system perspective, and that is visible and relevant to customers? The answer to this question provides the business case justification for advanced packaging design and for identifying inflection points where the use of new packaging technology is required.

Focus on interconnect

Interconnect can offer two major benefits to a design.
The first is enabling a product that wouldn’t otherwise exist without this new technology. The second is reducing cost or improving performance in a way that is visible from the system perspective. Both are compelling cases for justifying new technology deployment. Two case studies illustrate these two paradigms. The first is product enablement through the use of an FCAMP (Flip Chip and Memory Package). The second is a substrate change that significantly reduced clock jitter, which could otherwise have been problematic in the final product.

Case 1: Product enablement

In the example of product enablement, an FCAMP was assembled with a flip chip ASIC die and four packaged, tested-at-speed, and burned-in custom SRAM memories on a single substrate [4, 5]. Figure 4 shows the top side with half the lid removed, and the bottom BGA side below. The center die is a 17.2 mm by 17.2 mm bare die, with approximately 7000 bumps and 2200 active I/O. The custom memory is packaged in an 18.4 mm x 13.6 mm Chip Scale Package (CSP), which is an 850 BGA with 0.5 mm ball pitch.

Figure 2: Scaling of transistor intrinsic speed of high performance logic. [Source: ITRS]

Figure 3: Scaling greater than Moore’s Law. [Source: ITRS]

Figure 4: Flip Chip and Memory Package (FCAMP).

This innovative packaging allows for at-speed testing and burn-in of the device prior to assembly, and for repair and rework after initial assembly.

Dense substrate technology also enabled this product. The overall dimension of the substrate is 52.5 mm x 52.5 mm. To optimize routability, it was necessary for the substrate technology to allow for the same via pitch as the specified bump pitch for the ASIC. This enabled the cleanest escapes through the bump field and the most uniform signal routing to the bottom side ball grid.
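The paper does not state the ASIC bump pitch that the substrate vias had to match. As a rough order-of-magnitude check, the Python sketch below estimates it from the quoted die size and bump count, and checks the CSP ball count against a fully populated grid. Both helper functions are hypothetical, and the uniform square-grid, full-area assumption is a simplification rather than the actual bump or ball layout.

```python
import math


def estimated_pitch_mm(side_x_mm: float, side_y_mm: float, count: int) -> float:
    """Pitch of a uniform square grid placing `count` sites over the full area.

    Assumption: sites are spread evenly over the whole rectangle, which real
    bump maps (keep-outs, mixed pitches) generally do not do.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return math.sqrt(side_x_mm * side_y_mm / count)


def full_grid_sites(side_x_mm: float, side_y_mm: float, pitch_mm: float) -> int:
    """Upper bound on sites in a fully populated grid, ignoring edge clearance."""
    return math.floor(side_x_mm / pitch_mm) * math.floor(side_y_mm / pitch_mm)


if __name__ == "__main__":
    # Figures quoted in the text: 17.2 mm x 17.2 mm die, about 7000 bumps.
    bump_pitch = estimated_pitch_mm(17.2, 17.2, 7000)
    print(f"Estimated ASIC bump pitch: {bump_pitch * 1000:.0f} um")

    # Figures quoted in the text: 18.4 mm x 13.6 mm CSP, 0.5 mm pitch, 850 balls.
    max_balls = full_grid_sites(18.4, 13.6, 0.5)
    print(f"CSP full-grid upper bound: {max_balls} sites for 850 balls")
```

Under these assumptions the bump pitch comes out at roughly 0.2 mm, and the 850-ball CSP fits within the 972-site upper bound of a 0.5 mm grid, so the quoted figures are mutually consistent.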
If there is a core in the substrate stackup that requires a wider via pitch, many routing and electrical issues result. Microvias are an essential technology in this respect, for maintaining signal integrity and routing efficiency. Additionally, stacked vias are critical for maintaining power plane integrity. Figure 5 shows the density of a quadrant of a single routing layer.

Figure 6 shows a linear elastic finite element model of the assembly [6]. A quarter symmetry model was developed, with each layer represented. This allowed for simulation of various warpage profiles of the center die and memories, for a variety of material properties, with and without a lid. Several underfills were evaluated, and reliability tests were performed, including ATC (accelerated thermal cycle), THB (temperature/humidity/bias), HTS (high temperature storage), and DTC (deep thermal cycle); a Weibull plot was used to demonstrate reworkability of the assembly.

While the ASIC has approximately 2200 active switching I/O, only about 900 I/O are physically brought off the FCAMP through the 1.0 mm ball grid array. This allows a commodity printed circuit board technology to be used for the system. However, the frequency of operation, connectivity, and signal integrity would not have been achievable without this type of packaging and interconnect technology in the FCAMP.

Case 2: System performance improvement from packaging

A high volume production ASIC (application specific integrated circuit) was found in testing to have excessive clock jitter while still operating within current specifications. However, this prevented the chip from operating at the higher switching frequency desired for higher performance systems. The jitter was traced to supply noise found on the core and I/O power rails.
The component shown in Figure 7 is a 40 mm x 40 mm BGA package with a 14 mm x 14 mm die utilizing a 4-4-4 “thin-core” substrate (200 μm core with 2 additional pre-preg layers; total core thickness is ~500 μm) with top side Surface Mount Terminal (SMT) capacitors [7]. The silicon die is 13.2 mm x 13.2 mm, with 2545 bumps. Seventeen 0306 SMT capacitors were placed on VDD and 7 were placed on VDDQ. While this helped with decoupling the supply, the effective loop inductance of the component was still relatively high.

Figure 5: Quadrant of a single signal layer.

Figure 6: Quadrant symmetry model.

Figure 7: Thin core 4-4-4 substrate BGA package.

Figure 8 shows a histogram of the total clock jitter measured on the system level board with the original package, as a combination of deterministic and random jitter. The system contribution was isolated and removed, so the resulting plot is due to the device only. The bimodal response lasted for the duration of the clock period, making it impossible to increase the switching and operating frequency, i.e., to shorten the period.

A series of substrate investigations was performed, ranging from a coreless redesign to bottom side capacitors to embedded capacitor technology. Each had a separate power integrity model and was measured in the same system environment (load board and test programs), at speed and under full traffic conditions. It was concluded that moving capacitors to the bottom side BGA, as shown in the Figure 9 diagram, sufficiently shortened the inductive loop of the power delivery network and subsequently improved clock jitter. The topside capacitors were replaced with 0204 SMT capacitors. Forty-one were placed on VDD and 27 on VDDQ. Figure 10 shows the actual product assembly.
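The capacitor counts quoted above also affect the decoupling inductance, separately from the shorter loop that the paper credits for the improvement. The Python sketch below isolates only that count effect, using the idealized rule that N identical capacitors in parallel present 1/N of the single-capacitor mounted inductance. The per-capacitor value `PER_CAP_NH` is a hypothetical placeholder, not a figure from the paper, and mutual coupling, shared plane inductance, and the 0306 versus 0204 body difference are all ignored.

```python
def parallel_inductance(per_cap_nh: float, count: int) -> float:
    """Effective inductance of `count` identical capacitors in parallel.

    Idealized: ignores mutual inductance and any shared path to the die.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return per_cap_nh / count


# Hypothetical placeholder value; the paper gives no per-capacitor inductance.
PER_CAP_NH = 1.0

# (original top side count, revised bottom side count) from the text.
RAILS = {"VDD": (17, 41), "VDDQ": (7, 27)}

if __name__ == "__main__":
    for rail, (before, after) in RAILS.items():
        l_before = parallel_inductance(PER_CAP_NH, before)
        l_after = parallel_inductance(PER_CAP_NH, after)
        print(f"{rail}: count alone leaves {l_after / l_before:.2f} "
              f"of the original parallel inductance")
```

Because the placeholder value cancels in the ratio, the result depends only on the counts. It should be read as a lower-order contribution next to the loop shortening described in the text, not as a model of the measured result.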
BGA balls were strategically removed after detailed engineering analysis, so that an optimized balance of capacitors versus direct connections to the board planes could be made to minimize noise. Tradeoffs between capacitor placement and direct connection to the circuit board reference planes were simulated and optimized.

Figure 11 shows the resulting clock jitter on the system board after the package change. A significant reduction in deterministic jitter is clearly visible in the measurement. Normalized clock jitter is improved by 21% running actual system traffic, which allows for a corresponding speedup of the clock with the same noise tolerance characteristics. This solution is more cost effective, especially when compared to respinning the silicon or adding costly board filtering. It adds robustness to the current design and offers a path to upgradability for the device under test.

Figure 8: Clock jitter measured at the system level.

Figure 9: BGA side mounted capacitors.

Figure 10: Bottom side of the BGA-side capacitor package.

Figure 11: Clock jitter improvement with new package.

[Jitter plot labels: "Baseline Clock Jitter with Active Power", "Clock Jitter with Active Power Rail"; axis: Time; annotation: Clk period]

This solution is also low risk, requiring minimal development within the existing supply chain for substrate packaging and assembly. Reliability testing also shows this is a viable solution [8].

Conclusions

Bandwidth requirements will continue to increase from one generation to the next in high end networking applications. As the industry moves along the same technology integration curve for semiconductor processes, more leverage can be found in inventive use of interconnect for additional performance improvements, which are visible not only at the component level but can also be measured in terms of system level performance.
In some cases, as in the examples described, packaging can be a key performance differentiator and can even enable products that could not be built otherwise.

References

[1] International Technology Roadmap for Semiconductors, 2009 Edition. Process Integration, Devices, and Structures.
[2] G. Moore, “Cramming more components onto integrated circuits”, Electronics Magazine, 1965.
[3] International Technology Roadmap for Semiconductors, 2009 Edition. Executive Summary.
[4] J. Priest, et al., “Design Optimization of a High Performance FCAMP Package for Manufacturability and Reliability”, 2005 IEEE Electronic Components and Technology Conference, Lake Buena Vista, FL, May 31 – June 3, 2005.
[5] J. Priest, et al., “Challenges in Substrate Design, Assembly, and Reliability of a SiP Package for High End Networking Application”, IMAPS 2006 International Symposium on Microelectronics, San Jose, CA, October 8-12, 2006.
[6] ANSYS® Theory Reference, SAS IP Inc, 2002.
[7] J. Savic, et al., “Electrical Performance and Reliability Assessment of Advanced Substrate Technologies for High Speed Networking Applications”, 2009 IEEE Electronic Components and Technology Conference, San Diego, CA, May 26-29, 2009.
[8] J. Savic, et al., “Reliability of Advanced Packaging Technologies in High Speed Networking Applications”, 2010 IEEE Electronic Components and Technology Conference, Las Vegas, NV, June 1-4, 2010.