Session 7 Presentation 2

## TestConX 2020

Heating Up - Thermal

## Bias-Stress Testing of Ultra High-Power Integrated Circuits: HTOL and ORM

#### David J Leary, InCAL Technology



Virtual Event • May 11-13, 2020



TestConX Workshop

www.testconx.org


#### Acronyms

- HTOL = High Temperature Operating Life
- ORM = Ongoing Reliability Monitoring
- BTI = Bias-Temperature Instability
- HCI = Hot Carrier Injection
- UHP = Ultra High-Power
- MCM = Multi-Chip Module
- FIT = Failures in 10<sup>9</sup> device hours





#### Introduction

Key Points.

- Loss of 10-year reliability margin in FinFET silicon.
- HTOL stress levels are critical to assuring adequate activation of underlying aging mechanisms.
- Critical new role for ORM.








#### **UHP ASIC Attributes**

- Die sizes in the latest Si technologies have reached the limit set by the wafer mask reticle.
- Satellite die (chiplets) are being added to the package, creating SiP (System in Package) products that challenge equipment capabilities and test methodologies for new product qualification.






#### Leakage Power at Stress Conditions

- IC leakage power increases exponentially with temperature.
- Large monolithic die assembled in MCM and 2.5D packages draw leakage power at HTOL stress conditions that consumes or exceeds the power capability of traditional HTOL systems.
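The exponential dependence of leakage on temperature can be illustrated with a minimal sketch. The model form (power doubling over a fixed temperature interval) is a common empirical approximation, and the reference power, reference temperature, and doubling interval below are hypothetical placeholders, not values from this presentation.

```python
def leakage_power_w(tj_c, p_ref_w, t_ref_c, doubling_interval_c):
    """Empirical leakage model: power doubles every doubling_interval_c degrees C."""
    return p_ref_w * 2.0 ** ((tj_c - t_ref_c) / doubling_interval_c)


# Hypothetical device (assumed values): 50 W of leakage at 85 C,
# doubling for every 20 C increase in junction temperature.
for tj in (85.0, 105.0, 125.0):
    power = leakage_power_w(tj, p_ref_w=50.0, t_ref_c=85.0, doubling_interval_c=20.0)
    print(f"Tj = {tj:.0f} C -> leakage of about {power:.0f} W")
```

Under these assumed parameters, raising Tj from a use-like temperature to a stress temperature multiplies leakage several times over, which is the power delivery burden the HTOL system has to absorb.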






#### **Dynamic Power at Stress Conditions**

- Dynamic power from functional stress patterns adds further to the power delivery challenge for biased-life reliability testing (HTOL).






#### Challenges

- High power is a key challenge to biased qualification testing.
- Loss of lifetime margin.
  - Notably, electromigration, TDDB, and transistor aging mechanisms.
- Reduction of thermo-mechanical tolerances.
  - Si / packaging stress interactions.



Source: [8]


#### **Key FinFET Aging Mechanisms**

- BTI / HCI Vt instability.
  - The <110> orientation is more susceptible to trap generation; NBTI is worse, while PBTI is more robust<sup>[2,3]</sup>.
- FinFET self-heat.
  - Cross-sectional area for heat flow to Si is less than with PlanarFET.
  - Multiple fins exhibit a thermal gradient; central fins can be up to 50% hotter than adjacent fins<sup>[4]</sup>.



Source: [8]




#### **Key Issue with UHP Devices**

- A global thermal gradient exacerbates local FET heating.
- Electromigration lifetime has significantly less margin<sup>[5,6]</sup>.
- For 7nm, Tj\_max for 10-year lifetime has been reduced (industry-wide) from 115°C to 105°C.
  - This may not be enough.




Source: ANSYS<sup>[7]</sup>


#### The Issue

- This reliability margin erosion has a bearing on how much stress is required to:
  - 1. Demonstrate and predict reliability at product use conditions (qualification), and
  - 2. Monitor reliability variance over a product's production life cycle.

**One can no longer be satisfied with short-duration stress or reduced stress.**






An HTOL qualification plan must preserve stress levels so that stress duration can be a reasonable indicator of reliability and lifetime at use conditions.

- IC products of the type referred to here are typically expected to have a 10-year lifetime.





The trend in the industry has been to compromise on stress conditions for high power ICs, ostensibly due to power constraints of the HTOL system and overall cost of qualification, by:

- Reducing Tj\_stress.
- Reducing V\_stress.
- Reducing the frequency of functional stress vectors and DFT-delivered stimulus.
- Running stress patterns sequentially, allowing some circuits to be idle while other circuits are toggled.






#### **Consequences for HTOL**

When stress temperature (Tj) is reduced to cut leakage and dynamic power:

- Acceleration of thermally activated reliability mechanisms is reduced by ~2x for every 10°C reduction (assumes an activation energy of 0.7 eV).
- A stress configured to demonstrate a 10-year lifetime demonstrates only 5 years when Tj is reduced by 10°C.

$-10^{\circ}\mathrm{C} \rightarrow 2\times \rightarrow 5$ years
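The temperature penalty can be checked with the Arrhenius model, sketched here under stated assumptions. The 0.7 eV activation energy comes from the slide; the 125°C planned and 115°C reduced stress temperatures are assumptions chosen for illustration, since the presentation does not state them. The exact ratio depends on the absolute temperatures, so the result should be read against the slide's rounded ~2x rather than as a replacement for it.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(t_low_c, t_high_c, ea_ev=0.7):
    """Arrhenius acceleration of t_high_c relative to t_low_c (both in deg C)."""
    t_low_k = t_low_c + 273.15
    t_high_k = t_high_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_low_k - 1.0 / t_high_k))


# Assumed stress temperatures (not from the slide): 125 C planned, 115 C reduced.
lost_acceleration = arrhenius_af(t_low_c=115.0, t_high_c=125.0)
print(f"Acceleration lost by a 10 C reduction: {lost_acceleration:.2f}x")
print(f"A 10-year demonstration shrinks to about {10.0 / lost_acceleration:.1f} years")
```

With these assumed temperatures the computed loss lands between 1.5x and 2x; lower absolute stress temperatures or a higher activation energy push it closer to 2x.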






#### **Consequences for HTOL**

When stress voltage (Vdd) is reduced, again to contain power:

- Voltage acceleration is reduced by ~3x (assumes a gamma factor of 15 V<sup>-1</sup>).
- A stress configured to demonstrate a 10-year lifetime demonstrates only 3.5 years when Vdd is reduced by 10%.

$-10\% \rightarrow 3\times \rightarrow 3.5$ years
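The voltage penalty can be sketched with an exponential voltage acceleration model, AF = exp(gamma × ΔV). The gamma factor of 15 V<sup>-1</sup> is from the slide; the model form and the 0.75 V planned stress voltage are assumptions made for illustration, since the presentation does not state the absolute Vdd.

```python
import math


def voltage_af(v_high, v_low, gamma_per_v=15.0):
    """Exponential voltage acceleration of v_high relative to v_low (volts)."""
    return math.exp(gamma_per_v * (v_high - v_low))


v_stress = 0.75              # assumed planned stress Vdd in volts (hypothetical)
v_reduced = 0.9 * v_stress   # Vdd reduced by 10%

lost_acceleration = voltage_af(v_stress, v_reduced)
print(f"Acceleration lost by a 10% Vdd reduction: {lost_acceleration:.2f}x")
print(f"A 10-year demonstration shrinks to about {10.0 / lost_acceleration:.1f} years")
```

Because the model depends on the absolute voltage step, a 10% reduction from a higher Vdd would cost more acceleration than the same percentage taken from a lower Vdd.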






**Compensating for the reduced acceleration factors by extending the duration of a 1,000 hr HTOL is not practical.**

- For the stress reductions used above, a 1,000 hr stress plan would have to be extended to over 6,000 hrs for 10-year equivalency.
- The extension is expensive because it ties up equipment and delays product launch.

1,000 hrs $\rightarrow$ 6,000 hrs
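The 6,000 hr figure follows from multiplying the two lost acceleration factors, assuming the temperature and voltage accelerations act independently. A minimal sketch of that arithmetic, using the rounded ~2x and ~3x values from the preceding slides (the function name is hypothetical):

```python
def equivalent_stress_hours(planned_hours, *lost_acceleration_factors):
    """Stress hours needed to restore equivalence after losing acceleration.

    Assumes the lost factors are independent and therefore multiply.
    """
    total_loss = 1.0
    for factor in lost_acceleration_factors:
        total_loss *= factor
    return planned_hours * total_loss


# Rounded factors from the slides: ~2x (temperature) and ~3x (voltage).
hours = equivalent_stress_hours(1000.0, 2.0, 3.0)
weeks = hours / (24 * 7)
print(f"Required stress: {hours:.0f} hours, roughly {weeks:.0f} weeks of continuous stress")
```

Expressed in calendar time, the extended stress runs for most of a year, which is why the slide treats duration extension as impractical.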




#### **Ongoing Reliability Monitoring**

- ORM over an ASIC's production manufacturing life cycle is the allocation of randomly selected production parts to reliability testing, with regular (usually quarterly) reporting.
- ORM today does not assess wear-out reliability margin. Instead, it reveals latent defects that were not activated and screened by production test.






#### **Considerations for ORM**

- When wear-out / aging failure modes have more than 10 years of lifetime margin, ORM is not expected to demonstrate lifetime or assess lifetime variability.




#### **Considerations for ORM**

- UHP ASIC products using the latest Si technology have consumed this reliability lifetime margin to the point where previously tolerated manufacturing variances can now manifest as lifetime limiters within the 10-year expectation.






#### **Considerations for ORM**

- This reliability margin erosion has a concerning consequence: manufacturing process variances (e.g., Si buildup layer variances and packaging tolerances) that once had lifetime margin no longer enjoy that margin and can manifest as early wear-out and/or increased FIT during the product lifetime.




#### **Considerations for ORM**

- In the face of reliability margin erosion, ORM takes on a *critical new role*...
  - 1. To demonstrate that manufacturing variances are not producing a compromised reliability population of product in the field, and
  - 2. To identify trends and the underlying mechanisms for process control feedback.






#### **Considerations for ORM**

- **Small Part Quantity:** Since manufacturing and materials variances are systemic rather than random defects, a small part quantity for ORM is judged to be sufficient.
- **Stress Duration / Level / Coverage:** Instead of part quantity being paramount, stress duration (hours in stress), stress level (thermal and voltage stress), and stress coverage (at-speed pattern coverage) have become key factors.
- **A Single 1,000-hr Read Point:** Since ORM is not a product qualification but a reliability monitor, ORM does not require multiple ATE read points. A single read point can suffice for defect <u>and</u> wear-out monitoring.


#### **A solution for UHP ICs**

An HTOL system with the following design attributes is required to reach and safely sustain the HTOL / ORM voltage and temperature stress levels and functional coverage required for effective acceleration.

- Eliminate the backplane and provide per-DUT test resources, to preserve stress pattern signal integrity at high frequency.
- Locate power supplies at the DUT to mitigate voltage droop.
- Provide sufficient memory depth for large stress patterns.
- Support high-frequency stress patterns.
- Integrate a fast-response cooling block per DUT.





#### Takeaways

High power ICs require a high power HTOL and ORM solution.

Previously tolerated manufacturing variances can now manifest as lifetime limiters within the 10-year life expectation.




#### References

- 1. Intel 8<sup>th</sup> Gen Core with AMD Radeon RX Vega M graphics and Samsung 4xHBM2, "*New 8th Gen Intel Core Processors with Radeon RX Vega M Graphics Offer 3x Boost in Frames per Second in Devices as Thin as 17 mm,*" Intel Newsroom Jan 7, 2018, https://newsroom.intel.com/news/8th-gen-intel-core-radeon-rx-vega-m-graphics/#gs.mtbmum
- 2. R. Vattikonda, W. Wang, Y. Cao, "*Modeling and Minimization of PMOS NBTI Effect for Robust Nanometer Design*," Design Automation Conference, 2006 43rd ACM/IEEE, pp. 1047-1052.
- 3. A. Kerber et al., "Device Reliability Metric for End-Of-Life Performance Optimization based on Circuit Level Assessment", IEEE IRPS, 978-1-5090-6641, 2017.
- 4. K. Derbyshire, *"Will Self-Heating Stop FinFETs?"* Semiconductor Engineering Apr 20, 2017.
- 5. J Watson and G Castro, "*High-Temperature Electronics Pose Design and Reliability Challenges*", Analog Dialogue vol 46, Apr 2012.
- 6. A Vel, "*How Reliable are Interconnects in 16nm FinFET Designs?*", Semiconductor Engineering Dec 5, 2013.
- 7. E. Sperling, "Chip Aging Accelerates," Semiconductor Engineering Feb 14, 2018.
- 8. D. Leary, "*The Case for a New HTOL Methodology for Ultra High Power ASICs,*" InCAL white paper Dec 5, 2019.
- 9. Wikipedia, in *MSWORD Pictures* Watt Meter





#### **COPYRIGHT NOTICE**

The presentation(s)/poster(s) in this publication comprise the proceedings of the 2020 TestConX Virtual Event. The content reflects the opinion of the authors and their respective companies. They are reproduced here as they were presented at the 2020 TestConX Virtual Event. The inclusion of the presentations/posters in this publication does not constitute an endorsement by TestConX or the workshop's sponsors.

There is NO copyright protection claimed on the presentation/poster content by TestConX. However, each presentation/poster is the work of the authors and their respective companies: as such, it is strongly encouraged that any use reflect proper acknowledgement to the appropriate source. Any questions regarding the use of any materials presented should be directed to the author(s) or their companies.

"TestConX" and the TestConX logo are trademarks of TestConX. All rights reserved.
