An FPGA-Based Design of a Fault-Tolerant Shared Memory Structure

Sowvik Dey, Mihir Kumar Mahata, Amiya Karmakar
DOI: 10.4018/IJECME.312258

Abstract

In the current era of smart computation, faster processing speed is needed. Executing processes in parallel can achieve higher throughput. To avoid the von Neumann bottleneck, memory speed must be high. A high-speed memory can supply contiguous data to a high-speed processor, which sustains the high performance of that processor. Various high-speed memory technologies exist in the market, such as interleaved memory and cache memory. Multiprocessing technology is also used to achieve high-performance computation. It is still challenging to build a high-speed memory device that can supply data to a multiprocessing system in a contiguous way. Shared memory can fulfill all of the requirements of a multiprocessing system. The efficiency of shared memory can be enhanced by introducing a fault-tolerant mechanism into it without affecting inter-process communication, as discussed in this paper.

Introduction

Semiconductor devices are the main focus of the current era of computation. Human society depends more each day on smart computation devices, and a huge amount of data is handled at every moment. Faster processors are used to meet the challenges of modern human society. Sometimes multiple processors are even combined to achieve faster processing speed (Hwang, K., & Briggs, F., 1984), (Kogge, P., 1981). To design a system-on-chip (SoC), multiple processors and memory devices are fabricated together on a single chip (Olukotun et al., 2007). The processing elements are of either a multicore or a multiprocessor type (Ji, W. et al., 2009), (Irabashetti, 2014).

To support the activities of high-speed processing systems, faster memories such as cache memories and interleaved memories are also employed (ACM Comput. Surveys, 1982), (Bhandarkar, D.P., 1975), (Burnett, G. & Coffman, E.G., 1970), (Rau, B.R., 1979). Associative memory and multiport memory are also used to build efficient multiprocessing systems. However, these memories have some disadvantages. Associative memory is slower and more costly than ordinary memory. Multiport memory requires a large number of buses, and an external switch controller is needed to control those buses. Hence, these memories are not cost-efficient.

Another aspect of a faster computing device is reliability, which can be achieved through a fault-tolerant mechanism (Choi, M., 2003). As discussed earlier, memory interleaving is applied to achieve faster memory access. A fault-tolerant interleaved memory can enhance the performance of a system. Faults in memory can be tolerated in various ways, at either the bank level or the location level (Li, Y., Nelson, B., & Wirthlin, M., August, 2013), (Das, S., & Dey, S., May, 2014), (Das, S., & Dey, S., 2014). The location-level mechanism wastes less memory, so it is the more useful choice for a fault-tolerant memory. In this method, the faulty locations in memory are bypassed, and contiguous non-faulty memory locations are provided. An FPGA-based reconfigurable architecture can provide location-level fault-tolerant memory systems (Das, S., & Dey, S., May, 2014), (Das, S., & Dey, S., 2014).

Multiprocessing systems often use shared memory for communication among the processors, called inter-process communication. Many researchers have tried to enhance the performance of multiprocessing systems by introducing fault-tolerance mechanisms (Mushtaq, H. et al., 2011). Most researchers have designed algorithms that make a multiprocessing system fault tolerant by replicating the processes (Ng, G. et al., 2005), or have implemented additional processing elements (such as a watchdog processor) to monitor the correctness of the processing sequence (Dal Cin, M. et al., 1993). For fault-tolerant shared memory in loosely coupled multiprocessing systems specifically, algorithms have been developed that replicate the data among the local memories associated with each processing element or to a backup server (Stumm, M., & Zhou, S., 1990). Hence, a large amount of local memory is required. Finding a good fault-tolerance mechanism that enhances the working capability of a shared memory without affecting inter-process communication, while reducing the cost of memory management, remains an important research area.
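
The location-level bypassing described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the FPGA design proposed in the paper: the names `build_remap_table`, `bank_level_capacity`, and `FaultTolerantMemory` are hypothetical, the set of faulty locations is assumed to be known in advance, and banks are assumed to use low-order interleaving (bank index = address modulo the number of banks).

```python
from typing import Iterable, List


def build_remap_table(num_locations: int, faulty: Iterable[int]) -> List[int]:
    """Return the physical addresses of all non-faulty locations, in order.

    Entry i is the physical address that backs logical address i, so faulty
    locations are skipped and the logical address space stays contiguous.
    """
    bad = set(faulty)
    return [addr for addr in range(num_locations) if addr not in bad]


def bank_level_capacity(num_banks: int, bank_size: int, faulty: Iterable[int]) -> int:
    """Usable locations if every bank that contains a fault is discarded whole.

    Assumption: low-order interleaving, i.e. address a lives in bank a % num_banks.
    """
    bad_banks = {addr % num_banks for addr in faulty}
    return (num_banks - len(bad_banks)) * bank_size


class FaultTolerantMemory:
    """Hypothetical model of a memory with location-level fault tolerance."""

    def __init__(self, num_locations: int, faulty: Iterable[int]) -> None:
        self._cells = [0] * num_locations
        self._remap = build_remap_table(num_locations, faulty)

    @property
    def capacity(self) -> int:
        # Only the non-faulty locations are lost, nothing more.
        return len(self._remap)

    def physical(self, logical: int) -> int:
        if not 0 <= logical < len(self._remap):
            raise IndexError(f"logical address {logical} out of range")
        return self._remap[logical]

    def write(self, logical: int, value: int) -> None:
        self._cells[self.physical(logical)] = value

    def read(self, logical: int) -> int:
        return self._cells[self.physical(logical)]


if __name__ == "__main__":
    mem = FaultTolerantMemory(8, faulty=[2, 5])
    # Logical addresses 0..5 map onto physical 0, 1, 3, 4, 6, 7.
    print([mem.physical(i) for i in range(mem.capacity)])
    # Bank-level (4 banks of 2) versus location-level usable capacity.
    print(bank_level_capacity(4, 2, [2, 5]), mem.capacity)
```

In this toy configuration of eight locations with faults at physical addresses 2 and 5, location-level bypassing keeps six usable locations, whereas discarding whole banks (four banks of two locations each) keeps only four. This is the sense in which the location-level approach wastes less memory.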
