Resilient On-chip Memory Design in the Nano Era

Author : Abbas Banaiyanmofrad
Publisher :
Total Pages : 219
Release : 2015
ISBN-10 : 1321963971
ISBN-13 : 9781321963977

Book Synopsis: Resilient On-chip Memory Design in the Nano Era by Abbas Banaiyanmofrad

Resilient On-chip Memory Design in the Nano Era was written by Abbas Banaiyanmofrad and released in 2015, with 219 pages in total. Book excerpt:

Aggressive technology scaling in the nano-scale regime makes chips more susceptible to failures. This causes multiple reliability challenges in the design of modern chips, including manufacturing defects, wear-out, and parametric variations. As the number, capacity, and hierarchy of on-chip memory blocks grow in emerging computing systems, the reliability of the memory sub-system becomes an increasingly challenging design issue. The limitations of existing resilient memory design schemes motivate us to think about new approaches that treat scalability, interconnect-awareness, and cost-effectiveness as major design factors.

In this thesis, we propose different approaches to address resilient on-chip memory design in computing systems ranging from traditional single-core processors to emerging many-core platforms. We classify our proposed approaches into five main categories:
1) Flexible and low-cost approaches to protect cache memories in single-core processors against permanent faults and transient errors
2) Scalable fault-tolerant approaches to protect last-level caches with non-uniform cache access in chip multiprocessors
3) Interconnect-aware cache protection schemes in network-on-chip architectures
4) Relaxing memory resiliency for approximate computing applications
5) System-level design space exploration, analysis, and optimization for redundancy-aware on-chip memory resiliency in many-core platforms

We first propose a flexible fault-tolerant cache (FFT-Cache) architecture for SRAM-based on-chip cache memories in single-core processors working at near-threshold voltages.
We then extend the FFT-Cache technique to protect shared last-level caches (LLCs) with Non-Uniform Cache Access (NUCA) in chip multiprocessor (CMP) architectures. The resulting scheme, REMEDIATE, leverages a flexible fault remapping technique while considering the implications of different remapping heuristics in the presence of cache banking, non-uniform latency, and the interconnection network. We further extend REMEDIATE with RESCUE, whose main goal is to propose a design trend (aggressive voltage scaling + cache over-provisioning) that uses different fault remapping heuristics with a scalable implementation for shared multi-bank LLCs in CMPs to reduce power, while exploring a large, multi-dimensional design space and performing multiple sensitivity analyses.

Considering multi-bit upsets, we propose a low-cost technique that leverages embedded erasure coding (EEC) to tackle soft errors as well as hard errors in the data caches of both a high-performance and an embedded processor.

Considering the non-trivial effect of the interconnection fabric on the memory resiliency of network-on-chip (NoC) platforms, we then propose a novel fault-tolerant scheme that leverages the interconnection network to protect LLC banks against permanent faults. During an LLC access to a faulty area, the network detects and corrects the faults and returns fault-free data to the requesting core. In another approach, we propose CoDEC, a co-design approach to error coding of cache and interconnect in many-core architectures that reduces the cost of error protection compared to conventional methods. As a system-wide error coding scheme, CoDEC guarantees end-to-end protection of LLC data blocks against errors throughout the on-chip network.

The tradeoffs available among reliability, output fidelity, performance, and energy in emerging error-resilient applications of the approximate computing era motivate us to consider application-awareness in resilient memory design.
The key idea is to exploit the intrinsic tolerance of such applications to some level of errors in order to relax memory guard-banding and reduce design overheads. As an exemplar, we propose Relaxed-Cache, in which we relax the definition of a faulty block depending on the number and location of faulty bits in an SRAM-based cache to save energy.

In the system-level part of the thesis, we aim at cross-layer characterization and optimization of on-chip memory resiliency across the system stack. Our first contribution in this direction focuses on the scalability of memory resiliency, through a system-level design methodology for scalable fault tolerance of distributed on-chip memories in NoCs. We introduce a novel reliability clustering model for effective shared redundancy management toward cost-efficient fault tolerance of on-chip memory blocks. Each cluster represents a group of cores that have access to shared redundancy resources for the protection of their memory blocks.
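
The Relaxed-Cache idea mentioned in the excerpt (deciding whether a cache block counts as faulty from the number and location of its faulty bits) can be illustrated with a small sketch. This is not the thesis's implementation: the `RelaxPolicy` fields, the rule that only low-order bits of each word may be faulty, and the sample fault map are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelaxPolicy:
    """Hypothetical knobs for a relaxed 'faulty block' definition."""
    max_faulty_bits: int       # assumed: faulty bits tolerated per block
    protected_high_bits: int   # assumed: MSBs of each word that must be fault-free
    word_bits: int = 32        # assumed word width


def is_usable(faulty_bit_positions, policy):
    """Return True if a block may still store error-tolerant data.

    `faulty_bit_positions` holds bit indices within the block; inside
    each word, index 0 is taken to be the least significant bit.
    """
    if len(faulty_bit_positions) > policy.max_faulty_bits:
        return False
    low_limit = policy.word_bits - policy.protected_high_bits
    # Every faulty bit must fall in the unprotected low-order part of its word.
    return all(pos % policy.word_bits < low_limit for pos in faulty_bit_positions)


def usable_fraction(fault_map, policy):
    """Fraction of blocks in `fault_map` that remain usable under `policy`."""
    return sum(is_usable(bits, policy) for bits in fault_map) / len(fault_map)


# Illustrative fault map: one set of faulty bit positions per cache block.
fault_map = [set(), {3}, {31}, {0, 1, 2}, {5, 40}]

# Strict policy: any faulty bit disables the block.
strict = RelaxPolicy(max_faulty_bits=0, protected_high_bits=32)
# Relaxed policy: up to two faulty bits, all within the low 16 bits of a word.
relaxed = RelaxPolicy(max_faulty_bits=2, protected_high_bits=16)

print(usable_fraction(fault_map, strict))
print(usable_fraction(fault_map, relaxed))
```

Under these assumed policies, the relaxed definition keeps more blocks in service than the strict one, which is the kind of capacity-versus-fidelity tradeoff the excerpt describes for error-tolerant applications.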


Resilient On-chip Memory Design in the Nano Era Related Books

Resilient On-chip Memory Design in the Nano Era
Language: en
Pages: 219
Authors: Abbas Banaiyanmofrad
Categories:
Type: BOOK - Published: 2015 - Publisher:

Aggressive technology scaling in the nano-scale regime makes chips more susceptible to failures. This causes multiple reliability challenges in the design of mo

Circadian Rhythms for Future Resilient Electronic Systems
Language: en
Pages: 208
Authors: Xinfei Guo
Categories: Technology & Engineering
Type: BOOK - Published: 2019-06-12 - Publisher: Springer

This book describes methods to address wearout/aging degradations in electronic chips and systems, caused by several physical mechanisms at the device level. Th

NANO-CHIPS 2030
Language: en
Pages: 597
Authors: Boris Murmann
Categories: Science
Type: BOOK - Published: 2020-06-08 - Publisher: Springer Nature

In this book, a global team of experts from academia, research institutes and industry presents their vision on how new nano-chip architectures will enable the

Dependable Embedded Systems
Language: en
Pages: 606
Authors: Jörg Henkel
Categories: Technology & Engineering
Type: BOOK - Published: 2020-12-09 - Publisher: Springer Nature

This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly wi

Nano-CMOS and Post-CMOS Electronics
Language: en
Pages: 439
Authors: Saraju P. Mohanty
Categories: Technology & Engineering
Type: BOOK - Published: 2016-04-28 - Publisher: IET

Continuing from volume 1, this volume outlines circuit- and system-level design approaches and issues for these devices. Topics covered include self-healing ana