Microelectronics Reliability

journal homepage: www.elsevier.com/locate/microrel

## A cross-layer aging-aware task scheduling approach for multiprocessor embedded systems



### Masoomeh Karami, Athena Abdi, Hamid R. Zarandi<sup>\*</sup>

Department of Computer Engineering and Information Technology, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

#### ARTICLE INFO

Keywords: Multiprocessor embedded systems; Task scheduling; Aging; Lifetime reliability; Temperature; Power consumption

#### ABSTRACT

This paper presents a cross-layer dynamic task scheduling approach that minimizes aging using the effective parameters of lifetime reliability from different levels of abstraction. With technology advances and the reduction in transistor size, the lifetime reliability of multiprocessor embedded systems has become a major design challenge. Moreover, the antagonistic impact of this parameter on temperature and power consumption, the other design challenges of multiprocessor systems, makes its improvement more challenging. Our proposed scheduling approach is employed during system operation to manage real-time changes and improves lifetime reliability by considering temperature and power consumption as the other main constraints of multiprocessor system design. Simulations performed with the ESESC tool on the SPEC2006 benchmarks show a 47.1% improvement in the lifetime reliability of the system as well as 11.22% and 13.57% improvements in temperature and power consumption, respectively, compared to the base schedule. The proposed approach is also compared with related research, and its effectiveness in improving lifetime reliability is verified through several experiments.

#### 1. Introduction

Multiprocessor systems on chip (MPSoCs) are widely used in applications such as avionics, automotive electronics, and automation because of their high performance and low power consumption [1]. While technology advances and the reduction in transistor size provide benefits in power dissipation, fabrication cost and transistor switching speed, they also raise major design concerns such as the lifetime reliability of the systems because of temperature variation. Various failure mechanisms affect lifetime reliability and can be categorized into five classes [2-4]: (I) electro-migration (EM), (II) stress migration (SM), (III) time-dependent dielectric breakdown (TDDB), (IV) bias temperature instability (BTI) and (V) thermal cycling (TC). These failure mechanisms are strongly affected by temperature, frequency, voltage and stress time [5].

Since the effective parameters of lifetime reliability are antagonistically related to one another and also adversely affect power consumption and performance, maximizing the lifetime reliability of MPSoCs is challenging. Several studies have investigated different levels of abstraction to improve it by controlling its related parameters. Generally, these methods can be categorized as circuit-level or system-level approaches. Circuit-level approaches provide more accuracy than system-level approaches, at the cost of a high performance overhead. Moreover, cross-layer approaches, which combine the information and advantages of different levels while minimizing their side effects, are very effective in improving lifetime reliability [6,7].

One of the most effective system-level approaches for addressing the main design challenges of multiprocessor systems is to optimize them during task scheduling and mapping. Task scheduling and mapping can be performed statically at design time and/or dynamically at runtime [8-10]. Static scheduling is based on predictions of the characteristics of the system and its application and is appropriate for non-changing applications [8,11,12]. Conversely, dynamic scheduling is performed during system operation, based on the actual status of the system and application. In this case, the complexity of the lifetime improvement method directly affects the performance overhead of the system and should be managed carefully in real-time applications [2,9,13,14]. The hybrid approach uses static and dynamic scheduling methods together to combine the advantages of both [10].

\* Corresponding author. E-mail addresses: mas.karami@aut.ac.ir (M. Karami), atena\_abdi@aut.ac.ir (A. Abdi), h\_zarandi@aut.ac.ir (H.R. Zarandi).

https://doi.org/10.1016/j.microrel.2018.04.015

Received 5 November 2017; Received in revised form 15 April 2018; Accepted 24 April 2018. 0026-2714/ © 2018 Elsevier Ltd. All rights reserved.

In this paper, a cross-layer approach to improve the lifetime reliability of multiprocessor embedded systems is presented. Our proposed approach is applied dynamically at the system level, during task scheduling. It takes cross-layer information and predicts lifetime reliability based on its effective parameters. By controlling these parameters and applying improvement approaches, lifetime reliability, power consumption, temperature and performance can be managed. To implement the proposed scheduling approach on a multiprocessor system, the ESESC (Enhanced Super ESCalar simulator for computer architects) simulation tool, an enhancement of SESC, is utilized [15]. The experimental results from the ESESC simulation show a 47.12% improvement in lifetime reliability after applying our proposed scheduling approach, at the cost of about 5% performance overhead compared with the base scheduling, which has no enhancement. The main contributions of this paper can be summarized as follows:

- Integrating transistor-level and system-level aging information during the online scheduling process to mitigate aging more effectively at a low performance cost
- Presenting a two-stage proactive-reactive online control approach to manage system aging
- Employing control actions such as dynamic voltage and frequency scaling (DVFS), task swapping and processor suspension during system execution to postpone and manage the aging effects
- Considering a high-performance mode in the proposed control approach to manage the performance cost
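
To make the listed control actions concrete, the following is a minimal Python sketch of how a two-stage proactive-reactive decision could be structured. It is not the authors' algorithm: the names `CoreStatus` and `choose_action`, the temperature thresholds, and the mapping of actions to the reactive and proactive stages are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    NONE = auto()
    DVFS_DOWN = auto()     # lower the core's voltage/frequency
    TASK_SWAP = auto()     # exchange tasks between a hot and a cooler core
    SUSPEND_CORE = auto()  # let the core idle for a while


@dataclass
class CoreStatus:
    measured_temp_k: float   # latest thermal sensor reading (kelvin)
    predicted_temp_k: float  # forecast for the next scheduling interval


def choose_action(core: CoreStatus,
                  warn_k: float = 350.0,      # hypothetical warning threshold
                  critical_k: float = 365.0,  # hypothetical critical threshold
                  high_performance: bool = False) -> Action:
    """Pick one control action for a core (illustrative policy only)."""
    # Reactive stage: the measured temperature is already too high.
    if core.measured_temp_k >= critical_k:
        return Action.SUSPEND_CORE
    if core.measured_temp_k >= warn_k:
        return Action.TASK_SWAP
    # Proactive stage: act on the prediction before the limit is reached.
    if core.predicted_temp_k >= warn_k:
        # Assumption: in high-performance mode, avoid slowing the core down.
        return Action.TASK_SWAP if high_performance else Action.DVFS_DOWN
    return Action.NONE
```

In this sketch the reactive checks come first so that a core that is already hot is never handled by the milder proactive action; a real policy would also depend on the aging estimates described later in the paper.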

The rest of this paper is organized as follows. A brief review of previous related research is presented in Section 2. Section 3 provides the system model and the required preliminaries of aging. Section 4 describes our proposed cross-layer scheduling approach. Experimental results and concluding remarks are provided in Sections 5 and 6, respectively.

#### 2. Related work

Improving lifetime reliability is an important issue in the design of MPSoCs. Several studies have been performed to mitigate aging at different levels of abstraction [8,16]. Circuit-level methods are very effective in improving lifetime reliability but, because of their pessimistic predictions of system characteristics, impose high performance and cost overheads on the system [17-19]. In contrast, system-level approaches use high-level information and adaptation, which have a bounded impact on lifetime reliability but impose low performance and cost overheads. Addressing the design challenges of MPSoCs during task scheduling is one of the most effective system-level approaches [8-10,20].

System-level methods can be applied to the system at design time, at runtime, or both. In [11], a static scheduling algorithm is presented which minimizes the lifetime reliability degradation caused by the electro-migration (EM) mechanism and uses simulated annealing to optimize the EM failure rate and performance. In [2], a dynamic scheduling approach to optimize the mean time to failure (MTTF) of MPSoCs is presented. To optimize the MTTF, this approach distributes the workloads evenly among the processing cores and assigns larger tasks to the low-power cores that have higher priorities. Moreover, to reduce the performance overhead of this runtime scheduling approach, DVFS is applied while executing the applications.

Another category of scheduling approaches is cross-layer methods, which ignore the boundaries of the abstraction levels and utilize the information of one level in another. In this way, the advantages of the different abstraction levels are integrated to enhance the characteristics of the proposed method. As an example, it is possible to combine the accuracy of circuit-level methods with the low performance cost of system-level approaches to explore the design space [6,7,21]. In [6], a cross-layer scheduling approach is presented which determines the critical instructions based on their circuit-level delays and assigns them to a specific ALU for execution. As the circuit ages, its delay increases, which can cause critical instructions to miss their deadlines. Dedicating a distinct ALU to critical instructions helps them meet their deadlines and, as a form of hardware redundancy, also improves the lifetime reliability of the system. In [21], a temperature management method is proposed which improves aging. In this proactive method, temperature is estimated by the processing units.

Using this prediction, and to avoid hot spots on the chip, critical threads migrate between the cores to manage the temperature and mitigate the aging effect. Cross-layer improvement approaches offer the joint advantages of circuit-level and system-level methods and are able to optimize lifetime reliability more effectively [6,7]. Moreover, these methods consider all the parameters that affect aging at different levels of abstraction and optimize them comprehensively.

Furthermore, lifetime improvement methods can be either reactive or proactive. Reactive approaches, such as [2], are applied to the system after a failure occurs, while proactive ones, such as [11], try to predict the failure rate and postpone failures. Proactive methods are less costly and less risky to the system because they prevent failures from occurring, whereas reactive approaches offer more coverage in failure detection. A combination of these approaches provides the best outcome [8,10]. As an example of this hybrid approach, [22] employs active replication of tasks and DVFS simultaneously during task scheduling.

Our proposed scheduling method falls into the cross-layer category because it utilizes a comprehensive set of effective parameters for lifetime reliability from different levels of abstraction. It combines proactive and reactive actions to avoid failures and to mitigate them after they occur. Moreover, it considers lifetime reliability along with temperature, power consumption, and performance, the other principal design parameters of multiprocessor systems.

#### 3. Preliminaries

The multiprocessor system we consider consists of M homogeneous processing cores that are connected via a communication channel. Each processing core has a thermal sensor that is read periodically. Application workloads consist of preemptive, periodic tasks that are scheduled on the multiprocessor system. Each application task has a predefined execution time on the processing elements based on its worst-case execution time (WCET). This execution time is adapted and scaled according to the operational frequency of the system.
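
As a small illustration of the last point, the sketch below scales a task's WCET to the current operating frequency. It assumes that execution time is inversely proportional to frequency, which is a common simplification and not a formula given in the text; the function names and the `(wcet, period)` task representation are hypothetical.

```python
def scaled_execution_time(wcet_s: float,
                          f_nominal_hz: float,
                          f_current_hz: float) -> float:
    """Scale a WCET measured at f_nominal_hz to the current frequency.

    Assumption: execution time is inversely proportional to frequency.
    """
    if f_nominal_hz <= 0 or f_current_hz <= 0:
        raise ValueError("frequencies must be positive")
    return wcet_s * f_nominal_hz / f_current_hz


def core_utilization(tasks, f_nominal_hz: float, f_current_hz: float) -> float:
    """Utilization of one core for periodic tasks given as (wcet_s, period_s)."""
    return sum(
        scaled_execution_time(wcet_s, f_nominal_hz, f_current_hz) / period_s
        for wcet_s, period_s in tasks
    )
```

Under this assumption, halving the frequency doubles each task's execution time and therefore doubles the core's utilization, which is why frequency scaling has to be checked against task deadlines.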

In this paper, we address the problem of aging in multiprocessor embedded systems. Several failure mechanisms cause aging in integrated circuits: electro-migration (EM), stress migration (SM), time-dependent dielectric breakdown (TDDB), bias temperature instability (BTI) and thermal cycling (TC) [3,4]. Technology advances and the reduction in transistor size make multiprocessor systems more vulnerable to these failure mechanisms [18].

EM refers to the movement of metal atoms in the interconnect wires due to the flow of current through them. The EM rate is highly dependent on the temperature, the current density, and the metal [3,23,24]. The experimentally estimated MTTF relation of EM is presented in Eq. (1).

$$MTTF_{EM} = \frac{A_{EM}}{J^n} e^{\frac{E_{aEM}}{KT}}$$
(1)

where $A_{EM}$ is a physical constant dependent on the characteristics of the metal, *J* is the current density, *n* is an empirically determined constant, $E_{aEM}$ is the activation energy for electro-migration, *K* is Boltzmann's constant and *T* represents the temperature. The considered values for *n* and $E_{aEM}$ are 2 and 0.9 [2-4].
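
The following Python sketch evaluates Eq. (1) to show how strongly the EM lifetime reacts to temperature and current density. It uses *n* = 2 and $E_{aEM}$ = 0.9 from the text, but it assumes that the activation energy is expressed in eV and the temperature in kelvin (the text gives no units), and it sets $A_{EM}$ to a placeholder value of 1, so only ratios between two operating points are meaningful.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mttf_em(temp_k: float,
            current_density: float,
            a_em: float = 1.0,       # placeholder; the real constant is metal-dependent
            n: float = 2.0,          # value stated in the text
            ea_em_ev: float = 0.9):  # value from the text; eV is an assumed unit
    """Eq. (1): MTTF_EM = A_EM / J^n * exp(E_aEM / (K * T))."""
    return a_em / current_density ** n * math.exp(
        ea_em_ev / (BOLTZMANN_EV_PER_K * temp_k))


def mttf_em_ratio(temp_a_k: float, temp_b_k: float) -> float:
    """MTTF at temp_a_k relative to temp_b_k for equal J; A_EM cancels out."""
    return mttf_em(temp_a_k, 1.0) / mttf_em(temp_b_k, 1.0)
```

With these assumptions, `mttf_em_ratio(360.0, 350.0)` is below 1, meaning the hotter operating point has the shorter EM lifetime, and doubling *J* at a fixed temperature divides the MTTF by four because *n* = 2.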

SM refers to the movement of atoms in metal wires caused by the thermal mismatch between the metal and the dielectric. Eq. (2) presents the experimental MTTF relation of SM.

$$MTTF_{SM} = A_{SM} |T_0 - T|^{-n} e^{\frac{E_{aSM}}{KT}}$$
(2)

where $A_{SM}$ is a fabrication-based constant, *n* is the dependency constant which is determined empirically, $T_0$ is the stress-free temperature of the metal, *K* is Boltzmann's constant, *T* is the temperature and $E_{aSM}$ represents the activation energy of stress migration failure. The considered values of *n* and $E_{aSM}$ are 2 and 0.9 [2-4].
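
Eq. (2) can be evaluated in the same way. The sketch below uses *n* = 2 and $E_{aSM}$ = 0.9 from the text and, as assumptions not stated in the source, treats the activation energy as eV, temperatures as kelvin, $A_{SM}$ as a placeholder of 1, and takes a hypothetical stress-free temperature $T_0$ of 500 K in the usage note.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mttf_sm(temp_k: float,
            stress_free_temp_k: float,
            a_sm: float = 1.0,       # placeholder; the real constant is fabrication-based
            n: float = 2.0,          # value stated in the text
            ea_sm_ev: float = 0.9):  # value from the text; eV is an assumed unit
    """Eq. (2): MTTF_SM = A_SM * |T0 - T|^(-n) * exp(E_aSM / (K * T))."""
    delta = abs(stress_free_temp_k - temp_k)
    if delta == 0.0:
        # The model diverges when the metal is at its stress-free temperature.
        raise ValueError("temperature equals the stress-free temperature")
    return a_sm * delta ** (-n) * math.exp(
        ea_sm_ev / (BOLTZMANN_EV_PER_K * temp_k))
```

Two terms pull in opposite directions here: moving *T* toward $T_0$ shrinks $|T_0 - T|$ and lengthens the lifetime, while the exponential term shortens it. For the hypothetical $T_0$ = 500 K, `mttf_sm(360.0, 500.0)` is smaller than `mttf_sm(350.0, 500.0)`, so the exponential term dominates in that range.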
