Resilient On-chip Memory Design in the Nano Era

Author : Abbas Banaiyanmofrad
Publisher :
Total Pages : 219
Release : 2015
ISBN-10 : 1321963971
ISBN-13 : 9781321963977

Book Synopsis: Resilient On-chip Memory Design in the Nano Era by Abbas Banaiyanmofrad

Written by Abbas Banaiyanmofrad and released in 2015 (219 pages). Book excerpt: Aggressive technology scaling in the nano-scale regime makes chips more susceptible to failures. This causes multiple reliability challenges in the design of modern chips, including manufacturing defects, wear-out, and parametric variations. With the growing number, amount, and hierarchy of on-chip memory blocks in emerging computing systems, the reliability of the memory subsystem becomes an increasingly challenging design issue. The limitations of existing resilient memory design schemes motivate us to think about new approaches that treat scalability, interconnect-awareness, and cost-effectiveness as major design factors. In this thesis, we propose different approaches to address resilient on-chip memory design in computing systems ranging from traditional single-core processors to emerging many-core platforms. We classify our proposed approaches into five main categories: 1) flexible and low-cost approaches to protect cache memories in single-core processors against permanent faults and transient errors, 2) scalable fault-tolerant approaches to protect last-level caches with non-uniform cache access in chip multiprocessors, 3) interconnect-aware cache protection schemes in network-on-chip architectures, 4) relaxing memory resiliency for approximate computing applications, and 5) system-level design space exploration, analysis, and optimization for redundancy-aware on-chip memory resiliency in many-core platforms. We first propose a flexible fault-tolerant cache (FFT-Cache) architecture for SRAM-based on-chip cache memories in single-core processors working at near-threshold voltages.
We then extend the FFT-Cache technique to protect shared last-level caches (LLCs) with Non-Uniform Cache Access (NUCA) in chip multiprocessor (CMP) architectures. The resulting scheme, REMEDIATE, leverages a flexible fault remapping technique while considering the implications of different remapping heuristics in the presence of cache banking, non-uniform latency, and an interconnection network. Next, we extend REMEDIATE with RESCUE, whose main goal is to propose a design trend (aggressive voltage scaling plus cache over-provisioning) that uses different fault remapping heuristics with a scalable implementation for shared multi-bank LLCs in CMPs to reduce power, while exploring a large multi-dimensional design space and performing multiple sensitivity analyses. Considering multi-bit upsets, we propose a low-cost technique that leverages embedded erasure coding (EEC) to tackle both soft errors and hard errors in the data caches of a high-performance processor as well as an embedded processor. Considering the non-trivial effect of the interconnection fabric on memory resiliency in network-on-chip (NoC) platforms, we then propose a novel fault-tolerant scheme that leverages the interconnection network to protect the LLC banks against permanent faults. During an LLC access to a faulty area, the network detects and corrects the faults, returning fault-free data to the requesting core. In another approach, we propose CoDEC, a co-design approach to error coding of cache and interconnect in many-core architectures that reduces the cost of error protection compared to conventional methods. As a system-wide error coding scheme, CoDEC guarantees end-to-end protection of LLC data blocks against errors throughout the on-chip network. The tradeoffs available among reliability, output fidelity, performance, and energy in emerging error-resilient applications of the approximate computing era motivate us to consider application-awareness in resilient memory design.
The key idea is to exploit the intrinsic tolerance of such applications to some level of errors in order to relax memory guard-banding and reduce design overheads. As an exemplar, we propose Relaxed-Cache, in which we relax the definition of a faulty block depending on the number and location of faulty bits in an SRAM-based cache to save energy. In this part of the thesis, we aim at cross-layer characterization and optimization of on-chip memory resiliency over the system stack. Our first contribution in this direction focuses on the scalability of memory resiliency, in the form of a system-level design methodology for scalable fault tolerance of distributed on-chip memories in NoCs. We introduce a novel reliability clustering model for effective shared redundancy management toward cost-efficient fault tolerance of on-chip memory blocks. Each cluster represents a group of cores that have access to shared redundancy resources for protection of their memory blocks.
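
The relaxed definition of a faulty block mentioned in the excerpt can be illustrated with a small sketch. This is not the thesis's implementation: the policy fields, the 32-bit word width, and the rule that only low-order bits may be faulty are all assumptions made for illustration. It shows one plausible way a block could be judged usable from the number and location of its faulty bits.

```python
from dataclasses import dataclass
from typing import Iterable


# Hypothetical policy; field names and defaults are illustrative assumptions,
# not parameters taken from the thesis.
@dataclass(frozen=True)
class RelaxPolicy:
    max_faulty_bits: int   # tolerated number of faulty bits per block
    protected_msbs: int    # high-order bits of each word that must be fault-free
    word_bits: int = 32    # assumed word width


def is_usable(faulty_bit_positions: Iterable[int], policy: RelaxPolicy) -> bool:
    """Decide whether a cache block may still be used under the policy.

    Positions are bit offsets within the block; offset 0 of each word is
    taken to be that word's least significant bit.
    """
    positions = list(faulty_bit_positions)
    # Criterion 1: the number of faulty bits must not exceed the budget.
    if len(positions) > policy.max_faulty_bits:
        return False
    # Criterion 2: every faulty bit must fall in the unprotected low-order
    # part of its word, where an error perturbs the value the least.
    first_protected = policy.word_bits - policy.protected_msbs
    return all(pos % policy.word_bits < first_protected for pos in positions)


# A conventional cache treats any faulty bit as disqualifying;
# a relaxed one accepts a few faults in the low-order byte of a word.
strict = RelaxPolicy(max_faulty_bits=0, protected_msbs=32)
relaxed = RelaxPolicy(max_faulty_bits=2, protected_msbs=24)
```

Under these assumptions, tightening or loosening `max_faulty_bits` and `protected_msbs` trades output fidelity against the number of blocks that remain usable at low voltage, which is the kind of tradeoff the excerpt describes.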


Related Books

Circadian Rhythms for Future Resilient Electronic Systems
Language: en
Pages: 208
Authors: Xinfei Guo
Categories: Technology & Engineering
Type: BOOK - Published: 2019-06-12 - Publisher: Springer

This book describes methods to address wearout/aging degradations in electronic chips and systems, caused by several physical mechanisms at the device level. Th
NANO-CHIPS 2030
Language: en
Pages: 597
Authors: Boris Murmann
Categories: Science
Type: BOOK - Published: 2020-06-08 - Publisher: Springer Nature

In this book, a global team of experts from academia, research institutes and industry presents their vision on how new nano-chip architectures will enable the
Dependable Embedded Systems
Language: en
Pages: 606
Authors: Jörg Henkel
Categories: Technology & Engineering
Type: BOOK - Published: 2020-12-09 - Publisher: Springer Nature

This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly wi
Enabling the Internet of Things
Language: en
Pages: 527
Authors: Massimo Alioto
Categories: Technology & Engineering
Type: BOOK - Published: 2017-01-23 - Publisher: Springer

This book offers the first comprehensive view on integrated circuit and system design for the Internet of Things (IoT), and in particular for the tiny nodes at