
Efficient Techniques to Overcome Scaled-CMOS Reliability Challenges


Overview

This research is motivated by an imminent paradigm shift in hardware design resulting from the growing problem of hardware failures in future technologies. The traditional design paradigm assumes that no gate or interconnect will ever operate incorrectly during the lifetime of a design (except for high-end mainframes and safety-critical applications). Such a paradigm will be infeasible in future technologies. One way to break this barrier is to accept the fact that transistors and interconnects will be imperfect, and design robust systems that are failure-aware. To adopt this philosophy for most future systems, not only for mainframes, associated costs must be extremely small compared to duplication or Triple Modular Redundancy (TMR).

Our central vision is to develop enabling technologies and tools spanning multiple abstraction levels to design globally optimized robust systems targeting a wide range of applications without incurring the high cost of classical redundancy.


Specific projects include:

Built-In Error Resilience

Architecture-aware circuit design techniques for correcting radiation-induced soft errors and erratic bit errors in latches, flip-flops and combinational logic.

Circuit Failure Prediction and Self-Correction

Circuit failure prediction anticipates a circuit failure before errors actually appear in system data and states. This is in contrast to traditional error detection, where a failure is detected only after errors appear in system data and states. Circuit failure prediction is ideally suited for major reliability challenges such as circuit aging and early-life failures (also called infant mortality), and it enables early self-correction.
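The contrast between prediction and detection can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the linear aging model, the clock period, the guardband value, and all function names are hypothetical and are not taken from the publications listed on this page. It assumes a path whose delay drifts upward over time, a detector that fires only once the path misses the clock edge, and a predictor that fires as soon as the path delay enters a guardband before the clock edge.

```python
# Hypothetical numbers throughout; none come from the publications listed here.
CLOCK_PERIOD_PS = 1000.0   # assumed clock period
GUARDBAND_PS = 50.0        # assumed warning margin before the clock edge


def path_delay_ps(month: int) -> float:
    """Toy aging model (assumption): delay starts at 900 ps and grows 2 ps per month."""
    return 900.0 + 2.0 * month


def first_month(condition) -> int:
    """Return the first month (counting from 0) at which condition(delay) holds."""
    month = 0
    while not condition(path_delay_ps(month)):
        month += 1
    return month


# Error detection: fires only once the path is slower than the clock period,
# i.e. after wrong values can already be captured.
detected_at = first_month(lambda d: d > CLOCK_PERIOD_PS)

# Failure prediction: fires once the path delay enters the guardband,
# while the circuit still produces correct results.
predicted_at = first_month(lambda d: d > CLOCK_PERIOD_PS - GUARDBAND_PS)

print(f"prediction fires at month {predicted_at}, detection at month {detected_at}")
```

In this toy setting the predictor raises its flag well before the detector would, which is the window in which self-correction (for example, adjusting voltage or frequency) could be applied before any error reaches system state.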

Online Self-test

Online self-test is a special kind of self-test where a system tests itself during normal operation without any downtime visible to the end-user. It is ideal for circuit failure prediction, error detection based on periodic self-test, and system diagnostics required for effective self-repair.

Application-aware Robust System Design

Application-aware design techniques exploit the fact that a large class of future killer applications, such as Recognition, Mining and Synthesis (RMS), are inherently error resilient (due to their probabilistic nature). This makes it possible to design globally optimized robust systems that efficiently combine a large pool of low-cost, ultra-fast, ultra-low-power and, hence, unreliable hardware components that are no longer constrained by worst-case design.

Selected Publications

Journal publications:

E. Mintarno, J. Skaf, R. Zheng, J. Velamala, Y. Cao, S. Boyd, R.W. Dutton and S. Mitra, “Self-Tuning for Maximized Lifetime Energy-Efficiency in the Presence of Circuit Aging,” IEEE Trans. CAD, 2011.

Y. Li, Y.M. Kim, E. Mintarno, D. Gardner and S. Mitra, “Overcoming Early-Life Failure and Aging Challenges for Robust System Design,” IEEE Design and Test of Computers, Special Issue on Design for Reliability and Robustness, 2009 (Invited).

T. Chen, C. Ito, W. Loh, W. Wang, K. Doddapaneni, S. Mitra and R.W. Dutton, “Design Methodology and Protection Strategy for ESD-CDM Robust Digital System Design in 90-nm and 130-nm Technologies,” IEEE Trans. Electron Devices, 2009.

S. Mitra, N.R. Saxena, and E.J. McCluskey, “Efficient Design Diversity Estimation for Combinational Circuits,” IEEE Trans. Comp., Vol. 53, Issue 11, pp. 1483-1492, Nov. 2004.

M. Tahoori and S. Mitra, “Techniques and Algorithms for Fault Grading of FPGA Interconnect Test Configurations,” IEEE Trans. Computer-Aided Design, Vol. 23, Issue 2, pp. 261-272, Feb. 2004.

K.S. Kim, S. Mitra and P.G. Ryan, “Delay Defect Characteristics and Testing Strategies,” IEEE Design and Test of Computers, Special Issue on Speed Test and Speed Binning of Complex ICs, Vol. 20, Issue 5, pp. 8-16, Sept.-Oct. 2003.

S. Mitra, L.J. Avra and E.J. McCluskey, “Efficient Multiplexer Synthesis,” IEEE Design and Test of Computers, Vol. 17, No. 4, pp. 90-97, Oct.-Dec. 2000.

Conference publications:

Y. Kim, Y. Kameda, H. Kim, M. Mizuno and S. Mitra, “Low-Cost Gate-Oxide Early-life Failure Detection in Robust Systems,” Symposium VLSI Circuits, Honolulu, Hawaii, June 2010.

Y. Kim, T. Chen, Y. Kameda, M. Mizuno and S. Mitra, “Gate-Oxide Early-life Failure Identification using Delay Shifts,” IEEE VLSI Test Symposium, Santa Cruz, CA, April 2010.

S. Mitra, K. Brelsford and P. Sanda, “Cross-Layer Resilience Challenges: Metrics and Optimization,” IEEE/ACM Design Automation and Test in Europe, Dresden, Germany, March 2010 (Invited).

Y. Kanoria, A. Montanari and S. Mitra, “Statistical Static Timing Analysis using Markov Chain Monte Carlo,” IEEE/ACM Design Automation and Test in Europe, Dresden, Germany, March 2010.

E. Mintarno, Y. Cao, S. Boyd, R. Dutton and S. Mitra, “Optimized Self-Tuning to Maximize Lifetime Energy-Efficiency in the Presence of Circuit Aging,” IEEE/ACM Design Automation and Test in Europe, Dresden, Germany, March 2010.

R. Zheng, et al., “Circuit Aging Prediction for Low-Power Operation,” Custom Integrated Circuits Conf., San Jose, CA, Sept. 2009.

H. Baba and S. Mitra, “Testing for Transistor Aging,” IEEE VLSI Test Symp., Santa Cruz, CA, April 2009.

T.W. Chen, Y.M. Kim, K. Kim, Y. Kameda, M. Mizuno and S. Mitra, “Experimental Study of Gate-Oxide Early Life Failures,” Intl. Reliability Physics Symp., Toronto, Canada, April 2009.

M. Agarwal, et al., “Optimized Circuit Failure Prediction for Aging: Practicality and Promise,” Intl. Test Conf., Santa Clara, CA, Oct. 2008.

I. Loi, et al., “A Low-overhead Fault Tolerance Scheme for TSV-based 3D Network-on-Chip Links,” Intl. Conf. CAD (ICCAD), San Jose, CA, Nov. 2008.

T.W. Chen, K. Kim, Y. Kim and S. Mitra, “Gate-Oxide Early Life Failure Prediction,” IEEE VLSI Test Symp., San Diego, CA, April 2008.

S. Mitra, “Globally Optimized Robust Systems to Overcome Scaled CMOS Challenges,” Design Automation and Test in Europe, Munich, Germany, March 2008 (Invited).

S. Mitra, “Circuit Failure Prediction for Robust System Design in Scaled CMOS,” International Reliability Physics Symp., Phoenix, AZ, May 2008 (Invited).

S. Mitra and M. Agarwal, “Circuit Failure Prediction to Overcome Scaled CMOS Reliability Challenges,” Intl. Test Conf., Santa Clara, CA, Oct. 2007 (Invited).

M. Agarwal, B. Paul and S. Mitra, “Circuit Failure Prediction and Its Application to Transistor Aging,” IEEE VLSI Test Symp., Berkeley, CA, April 2007.

P. Relangi and S. Mitra, “Erratic Bit Errors in Latches,” Intl. Reliability Physics Symp. (IRPS), Phoenix, AZ, April 2007.

S. Seshia, W. Li and S. Mitra, “Verification Guided Soft Error Resilience,” Design Automation and Test in Europe (DATE), Nice, France, April 2007.

T.W. Chen, C. Ito, W. Loh, W. Wang, S. Mitra and R.W. Dutton, “Macro-model for Post-breakdown 90nm and 130nm Transistors and its Applications in Predicting Chip-level Function Failure after ESD-CDM Events,” Intl. Reliability Physics Symp., Phoenix, AZ, April 2007.

N. Seifert, P. Slankard, M. Kirsch, B. Narasimham, V. Zia, C. Brookreson, A. Vo, S. Mitra and J. Maiz, “Radiation Induced Soft Error Rates of Advanced CMOS Bulk Devices,” IEEE Intl. Reliability Physics Symp., 2006.

R. Guo, S. Mitra, J. Lee, S. Sivaraj and M. Ameen, “Comparison of Test Metrics: Stuck-at, N-Detect and Gate-Exhaustive,” IEEE VLSI Test Symp., 2006.

D.J. Leavins, K.S. Kim, S. Mitra and E. Rodriguez, “Robust Platform Design in Sub-65nm Technologies,” Custom Integrated Circuits Conference, 2005 (Invited).

K.Y. Cho, S. Mitra and E.J. McCluskey, “Gate Exhaustive Testing,” Intl. Test Conf., 2005.

S. Mitra, T. Karnik, N. Seifert and M. Zhang, “Logic Soft Errors in Sub-65nm Technologies: Design and CAD Challenges,” Design Automation Conf., 2005.

M. Tahoori, S. Mitra and E.J. McCluskey, “Fault Grading FPGA Interconnect Test Configurations,” IEEE Intl. Test Conf., pp. 608-617, 2002.
