
Adaptive Energy-Efficient Scheduling for Hierarchical Wireless
Sensor Networks
WEI LI, University of Sydney and NICTA
FLAVIA C. DELICATO, Federal University of Rio de Janeiro
ALBERT Y. ZOMAYA, University of Sydney
Most Wireless Sensor Network (WSN) applications require distributed signal and collaborative data
processing. One of the critical issues for enabling collaborative processing in WSNs is how to schedule
tasks in a systematic way, including assigning tasks to sensor nodes, and determining their execution and
communication sequence. Since WSN nodes are very resource constrained, mainly regarding their energy
supply, one major concern when scheduling tasks in such environments is to minimize and balance the
energy consumption, so that the system operational lifetime is maximized. We propose a heuristic-based
three-phase algorithm (TPTS) for allocating tasks to multiple clusters in hierarchical WSNs that aims at
finding a scheduling scheme that minimizes the overall energy consumption and balances the workload of
the system while meeting the application's deadline. The performance of the proposed algorithm and the
effect of several parameters on its behavior were evaluated by simulations, with promising results. The
experimental results show that the time and energy performance of TPTS are close to the time and energy
of benchmarks in most cases, while load balance is always provided.
Categories and Subject Descriptors: C.2.2 [Computer-Communication Networks]: Network Protocols -
Wireless communication; C.2.4 [Computer-Communication Networks]: Distributed Systems
General Terms: Design, Algorithms, Management, Experimentation, Performance
Additional Key Words and Phrases: Wireless sensor networks, adaptive task scheduling, multiclusters,
multi-objective optimization, energy efficient, hierarchical network
ACM Reference Format:
Li, W., Delicato, F. C., and Zomaya, A. Y. 2013. Adaptive energy-efficient scheduling for hierarchical wireless
sensor networks. ACM Trans. Sensor Netw. 9, 3, Article 33 (May 2013), 34 pages.
DOI: http://dx.doi.org/10.1145/2480730.2480736
1. INTRODUCTION
Wireless Sensor Networks (WSNs) are composed of a large number of tiny battery-operated devices endowed with sensing, processing, and wireless communication capabilities, as well as one or more sink nodes. Sink nodes are powerful devices, often
a personal computer, that are in charge of gathering all the sensor-collected data,
further processing them, and linking the WSN to external networks such as the
The work of F. C. Delicato was partially supported by the Brazilian National Council for Scientific and
Technological Development (CNPq), under grant nos. 311363/2011-3 and 470586/2011-7, and by FAPERJ.
The work of A. Y. Zomaya was supported by the Australian Research Council grant DP1097111.
Authors' addresses: W. Li (corresponding author), School of Information Technologies, University of Sydney,
The Centre for Distributed and High Performance Computing, Building J12, Sydney, NSW 2006, Australia;
email: liwei@it.usyd.edu.au; F. C. Delicato, Federal University of Rio de Janeiro, Brazil; A. Y. Zomaya,
School of Information Technologies, University of Sydney, The Centre for Distributed and High Performance
Computing, Building J12, Sydney, NSW 2006, Australia.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that
copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for
components of this work owned by others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this
work in other works requires prior specific permission and/or a fee. Permissions may be requested from
Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212)
869-0481, or permissions@acm.org.
© 2013 ACM 1550-4859/2013/05-ART33 $15.00

ACM Transactions on Sensor Networks, Vol. 9, No. 3, Article 33, Publication date: May 2013.
Internet. Sensor nodes act in a collaborative way to accomplish sensing, processing,
and communication tasks in response to the needs of client applications. One major
feature that distinguishes WSNs from traditional solutions for environmental monitoring based on wired sensors is that, because the nodes in a WSN are endowed with (albeit
limited) computational capacity, such networks are not merely passive data collectors;
instead, they are able to perform simple processing operations on the collected data
before sending them to the powerful sink nodes for further, more complex processing.
This capability of performing some computation over sensor-collected data is known
as in-network processing [Heidemann et al. 2001]. Since communication is the
most energy-costly operation in WSNs, it is frequently better to perform as much in-network processing as possible on the sensor data (for instance, by applying some data
aggregation operation within the node) in order to decrease the number of transmitted
messages, thus trading transmission for processing energy.
WSNs can be considered as application-oriented networks, in the sense that in order
to operate properly and to achieve their maximum usefulness, such networks need to
be optimized to the specific goals of the target application. Requirements of sensing
applications are often described as high-level missions such as detecting fires, monitoring the
temperature in a given area, or reporting the presence of an intruder. These high-level
descriptions need to be translated into low-level tasks and executed by sensor nodes. The
translation as well as the distribution of the required work over the network nodes
should preferably be transparent to the application that is built on top of the WSN.
Most WSN applications require distributed signal and collaborative data processing.
Therefore, the development of energy-efficient algorithms for collaborative applications
in WSNs has attracted the interest of recent research. In general, a collaborative application is composed of a number of tasks cooperating with each other through inter-task
communications to complete a common goal [Younis et al. 2002; Singh and Prasanna
2003; Alsalih et al. 2005; Karimi et al. 2009; Biswas et al. 2010]. By task, we mean
the required action(s) performed by a sensor upon occurrence of an event of interest
for the application. Dependencies between tasks are maintained by the exchange of
intermediate results between sensor nodes. A real-world example of processing collaborative applications in the domain of Wireless Camera Sensor Networks (WCSN)
is presented in Yuan and Eylem [2007]. In WCSNs, several applications such as image registration and distributed visual surveillance must perform computationally
intensive operations during their execution while meeting the inherent real-time
requirements of multimedia applications. For instance, distributed intelligence and information processing for multiple-camera surveillance are the primary methods used
in Third-Generation Surveillance Systems (3GSS), where a large number of cameras
are connected through networks to perform missions like vision-based localization. Instead of sending original images to the sink node to compute the object's location at
the sink, sensors first estimate the object's location (tasks) through message exchanges
among several neighboring nodes, and then a data fusion method (other tasks) is used
to integrate data from different cameras to further eliminate estimation errors. Eventually, the result is sent back to the sink node for further processing. Another
well-known real-world example of collaborative application in a hierarchical WSN is
the smart office building [Akyildiz et al. 2002]. In the scenario of smart office buildings,
sensor nodes deployed on each floor can be considered as a cluster, monitoring the environmental conditions, including temperature, humidity, and light, of that particular
floor. Each floor is equipped with one cluster head which acts as a local collector for the
sensed data from a set of sensor nodes. Consider a simple monitoring application that runs on the WSN to determine whether the temperature in the office
building is comfortable. This application needs at least to detect the current temperature (tasks) on each floor (different clusters, different parts of the WSN) and send the
collected data back to the sink node through the cluster head. If the sensor nodes are
required to determine the temperature over a period of time, the average value needs
to be computed from the collected data (tasks), and then either the result is
sent back to the application through the sink node, or the collected data (average
temperature) is sent back to the sink for further processing only when a
predefined condition is met (e.g., report the data when the temperature is lower than
10 and/or higher than 30). In this second case, each sensor node needs to compare the
average temperature with the defined condition(s) (task(s)) to determine whether its
collected data needs to be sent back to the sink node or not. In a more sophisticated
monitoring application, the WSN nodes are endowed with sensing units and actuators,
thus being able to directly control such devices as air conditioners, for instance. In this
more complex scenario, the control decisions (tasks) and/or related control commands
(tasks) are executed on the WSN nodes whenever the predefined condition is triggered.
However, these tasks can run on different clusters (different parts of the WSN, without
invoking central control from the sink node). It is worth noting that many other
smart building applications share common characteristics with the given smart
office building example; consequently, this kind of application can be collaboratively
completed by allocating the tasks to a number of clusters in a hierarchical WSN.
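The conditional-reporting variant of the temperature example can be sketched as two small tasks on a node: aggregation followed by a threshold comparison. This is only an illustration of the scenario described above, not code from the paper; the function names are hypothetical, and the thresholds 10 and 30 are taken from the example, whose units the text does not state.

```python
from statistics import mean

# Thresholds from the example in the text (report when the average is lower
# than 10 or higher than 30). The unit is not given in the source.
LOW, HIGH = 10.0, 30.0


def average_temperature(samples):
    """Task 1: aggregate the readings collected over the sampling period."""
    return mean(samples)


def should_report(avg, low=LOW, high=HIGH):
    """Task 2: compare the average with the predefined condition."""
    return avg < low or avg > high


def node_round(samples):
    """Return the value to forward to the cluster head, or None to stay silent."""
    avg = average_temperature(samples)
    return avg if should_report(avg) else None
```

A node whose readings average inside the comfortable band sends nothing, which is how in-network processing trades a small amount of computation for a saved transmission.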
To enable collaborative processing in WSNs, the most important issue to address is how to
schedule tasks in a systematic way, including the assignment of tasks to sensor nodes and the
determination of their execution and communication sequence. A
collaborative application can be represented as a task graph [Xie and Qin 2008], which
is often simply referred to as a Directed Acyclic Graph (DAG) [Zomaya 1995; Sinnen
2007] that is used for task scheduling.
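As a minimal sketch of the task-graph representation, the following block encodes a small DAG as a mapping from each task to its parent tasks and derives one execution order that respects the precedence constraints. The task names are hypothetical and loosely follow the temperature example; the ordering uses the standard-library `graphlib` module (Python 3.9 or later) and is not the scheduling algorithm proposed in this article.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of parent tasks whose
# intermediate results it needs before it can start.
task_graph = {
    "sense_a": set(),
    "sense_b": set(),
    "average": {"sense_a", "sense_b"},
    "compare": {"average"},
    "report": {"compare"},
}


def execution_order(graph):
    """Return one order in which every task follows all of its parents."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(task_graph)
```

Any valid schedule must be consistent with such an order; the scheduling problem then consists of choosing where and when each task runs.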
One major difference between task scheduling in WSNs and traditional scheduling
algorithms is that the latter focus on shortening the makespan. In WSNs, instead,
the major concern is not only time but also energy, since each sensor has a limited power
supply and the system is expected to run as long as possible after deployment. Besides,
balancing the energy consumption among the sensors is also important to extend the
network operational lifetime. One common argument for doing this is that if the energy
of certain nodes is depleted before the others, holes may appear in the sensing coverage
or the sensor network may become disconnected prematurely, thus leading to loss of the
system function. Considering the preceding requirements, the task scheduling problem
becomes a multiobjective scheduling problem of which the cost function includes energy
consumption, latency, and load balancing.
In this article, we consider a hierarchical wireless sensor network with a single sink
node. All the sensor nodes are grouped into clusters and each cluster has a leader,
called cluster head, that coordinates the tasks inside the cluster and communicates
with other cluster heads. A collaborative application executes periodically within an
infinite loop. According to Bakshi and Prasanna [2004], in embedded systems such as
WSNs, the application basically executes in an infinite loop, and the concept of an execution round is not clearly defined in several scenarios due to their data-driven behavior.
We assume an application-specific, predefined deadline for the application execution
time. After this deadline has elapsed, the application starts again. Our objective is
to find an optimal scheduling scheme that minimizes the overall energy consumption
and balances the workload of the system while meeting the application's deadline.
We present a high-level Three-Phase Task Scheduling (TPTS) scheme to achieve the
goal, which operates with the support of low-level protocols. First, a deadline distribution algorithm is employed to decompose the overall deadline into subdeadlines and
assign them to the corresponding tasks. Second, a cluster-based task partitioning algorithm is developed to distribute tasks to sensor clusters according to the overall goal
of minimizing the energy consumption of inter-cluster communication and balancing
the workload of sensor clusters based on their residual energy. Finally, a contention-aware scheduling algorithm is used to generate the scheduling scheme that satisfies the
multiple objectives of energy consumption, latency, and load balancing among
sensors in the cluster. Besides simultaneously addressing different (and sometimes
conflicting) goals, namely meeting application time requirements and saving energy,
our algorithm additionally tackles the issue of load balance in the system. Furthermore,
since load balance is not equally important for all types of WSN applications, our algorithm provides a parameter that allows the user to tune how relevant this
requirement is, thus tailoring the solution to each specific application need.
The research in this article makes the following contributions.
(1) We develop a heuristic-based three-phase task scheduling scheme called TPTS for
distributing tasks of a collaborative application which is represented by a DAG to
multiple clusters in a hierarchical WSN.
(2) We take into account multiple objectives in the task scheduling decision process,
including application deadline, energy consumption of executing a task, and load
balance among clusters.
(3) We provide user-defined parameters to adjust the weight of objectives for each
application execution.
The rest of this article is organized as follows. A review of the related works is
provided in Section 2. Afterwards, we introduce the models used in our study and
formally define the research problem in Section 3. In Section 4, the details of our
proposed algorithm are presented and discussed. The simulation results and analysis
are given in Section 5. Finally, we conclude our study in Section 6.
2. RELATED WORK
The scheduling problems in wireless sensor networks have been widely investigated
in the literature recently, but most proposed algorithms are focused on job scheduling.
By job, we mean a client mission to be performed by the network
that cannot be divided into subjobs and has no obvious and clear relation to other
missions. A common design goal for these works is to minimize energy consumption
and extend system lifetime since sensors used in these works are assumed equipped
with nonrechargeable batteries that have limited lifetime. To achieve such a goal, a
popular solution consists of putting some (selected) sensors in the sleep mode and the
remaining ones in the active mode to accomplish the allocated jobs. When a sensor
stays in the sleep mode, it only consumes a tiny amount of energy compared to the
active mode [Kumar et al. 2004]. After a previously determined time period passes,
the sensors in the sleep mode have a chance to wake up to continue performing jobs
allocated to the system, and the sensors in the active mode also have a chance of
switching into sleep mode for energy saving. This process will be stopped when the
WSN is no longer able to fulfill the design purpose.
Depending on the network structure, the proposed scheduling algorithms can
be classified into two categories: nonhierarchical-based scheduling algorithms and
hierarchical-based scheduling algorithms. For the nonhierarchical-based scheduling
algorithms, the jobs are simply allocated to the network for processing. In such a WSN,
every sensor has the same role and functionality. Also, note that the nonhierarchical-based scheduling algorithm may be used for hierarchical WSNs, but only limited
to within a single cluster. In Poornachandran et al. [2005], a nonhierarchical-based
scheduling algorithm called Multiple Sensor Unit Scheduling (MSUS) is presented
for determining the best power state of a sensor based on: (i) the timing and priority
requirements, (ii) the job allocated to the sensor, and (iii) the existing and the predicted future jobs. The results of simulations presented in the paper show that MSUS
is a low-complexity algorithm that requires little memory space and highly reduces
the energy consumption compared to the greedy scheduling technique. Berman et al.
[2005] proposed another nonhierarchical-based scheduling algorithm to maximize the
system lifetime subject to the constraints on battery lifetime and sensing coverage.
In this work, each sensor is in one of three states: active, idle, or vulnerable. In the
vulnerable state, if the sensor discovers that its sensing area cannot be fully covered
by any of its active or vulnerable neighbors, it immediately turns itself into the active state. Otherwise, it enters the idle state if its sensing area can be fully covered
by either active neighbors or vulnerable neighbors with a higher energy level. This
scheduling algorithm can guarantee that any area of interest is covered by at least
one sensor. Therefore, it is possible for multiple neighboring sensors to enter the idle
state simultaneously so as to prolong system lifetime.
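The decision rule for a sensor in the vulnerable state, as summarized above, can be sketched as follows. This is an illustration based only on the description in this section, not the implementation of Berman et al. [2005]; in particular, representing a sensing area as a set of discrete cells and the tuple layout of `neighbors` are assumptions made for the example.

```python
def next_state(my_cells, my_energy, neighbors):
    """Decide the next state of a sensor that is currently vulnerable.

    my_cells:  set of area cells this sensor covers (assumed discretization)
    my_energy: residual energy of this sensor
    neighbors: list of (state, energy, cells) tuples for neighboring sensors
    """
    # Area covered by all neighbors that are active or vulnerable.
    awake_cover = set().union(
        *(cells for state, _, cells in neighbors
          if state in ("active", "vulnerable"))
    )
    if not my_cells <= awake_cover:
        return "active"  # nobody else can cover the area

    # Area covered by active neighbors or higher-energy vulnerable neighbors.
    safe_cover = set().union(
        *(cells for state, energy, cells in neighbors
          if state == "active"
          or (state == "vulnerable" and energy > my_energy))
    )
    if my_cells <= safe_cover:
        return "idle"
    return "vulnerable"  # keep waiting for the neighbors to decide
```

The energy comparison breaks ties between vulnerable neighbors, so that two sensors covering the same area do not both go idle.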
Alternatively, a WSN can have a hierarchical structure. Such networks are often
cluster based in which the cluster heads have a more prominent role than the other
sensors. Hierarchical-based scheduling algorithms aim to distribute jobs over
several selected clusters for processing. In Sanli et al. [2005], the authors proposed a
hierarchical-based algorithm, called CTAS, to minimize loss events and energy consumption by exploiting power modes and overlapping sensing areas of sensor nodes.
CTAS is a two-phase scheduling approach for collaboratively executing jobs at both
group and individual levels between sensor nodes. CTAS implements the first-phase
scheduling, at the group level, to schedule the event types and data transmissions
for a group of sensor nodes. Then, CTAS performs second-phase scheduling to schedule
the tasks of the event types assigned by the first-phase scheduling at each sensor node.
When a sensor node of a group is not assigned with a particular event type, the sensor
node shuts down its corresponding sensing unit until the next assignment phase of
first-level scheduling. It is important to note that only the sensing unit is turned off,
while the communication and processing units remain on.
Although some sort of collaborative nature is shown in the designs described before,
especially for the hierarchical-based scheduling algorithms, the relationship between
collaborative application and jobs is not clearly described, and no precedence constraint
of jobs is formally defined. Apart from switching the sensor states, there are several
other methods to save energy, such as reducing communication cost and reducing
control messages. Task scheduling is one possible approach to save energy in a WSN
by executing tasks on the same sensor without changing the states of the sensor, since
the cost of local communication (inter-task communication happens within the same
sensor) is generally considered to be zero. A few works have started to address the task scheduling
problem in a nonhierarchical WSN. According to the network topology, the existing
solutions can be further classified into two subcategories, single-hop nonhierarchical
WSN task scheduling and multihop nonhierarchical WSN task scheduling.
For single-hop nonhierarchical WSNs, the task allocation and scheduling problems
have been addressed and well studied. Yu and Prasanna [2005] developed an Energy-Balanced Task Allocation (EBTA) algorithm to meet the deadline of a real-time application running on homogeneous sensor nodes connected via multiple wireless channels.
They formulated the problem as an integer linear programming problem and presented a
polynomial-time three-phase heuristic solution. However, they did not consider the broadcast nature of wireless communication in their model, and the multiple wireless
channel technique is not widely used in real-world sensor nodes. The EcoMaps algorithm [Yuan et al.
2005] was proposed for energy-constrained applications with no deadline requirements. It aims to map and schedule the tasks jointly to achieve the minimum
schedule length, which meets the requirement of minimizing energy consumption. Yuan
et al. [2006] presented RT-Maps, which can guarantee the real-time application deadline with minimum energy consumption. Their algorithm also considers
utilizing the broadcast nature of wireless communication to conserve energy. Xie and
Qin [2008] presented a task scheduling algorithm called BEATA to solve the energy-delay dilemma that exists in heterogeneous WSNs. BEATA aims at minimizing the
energy consumption while confining schedule lengths through task allocation.
The aforementioned techniques concentrate on information processing in a
single-hop range, but in practical applications, sensors are normally randomly deployed
in an area of interest and form an irregular topology. Therefore, since the single-hop nonhierarchical WSN scheduling algorithms cannot be directly applied to real-world scenarios, proposals to tackle the issue of scheduling in multihop nonhierarchical
WSNs have been attracting researchers' attention recently. In Yuan and Eylem [2007],
the authors proposed a multihop in-network processing algorithm called MTMS. In
their algorithm, the nature of multihop communication is handled by two steps. First,
they extended the representation of tasks from a DAG to a hyper-DAG by replacing
each communication edge with a vertex and connecting this new vertex to the vertices
at which the edge originally starts and ends. The cost of the communication vertex is
equal to the communication load in the DAG. Second, all the sensors are assumed
to connect to a virtual communication controller called C. In a specific time slot, the
algorithm running on the controller C determines whether processing a communication
task in the system will cause interference with other communication tasks that are
being performed simultaneously. If so, the algorithm seeks another time
slot; otherwise, it allocates this time slot to the communication task.
However, their communication scheduling algorithm only considers one-hop
communication collisions, so some kinds of hidden communication collision may still occur.
Furthermore, their model does not suitably address the communication concurrency;
in most cases, the wireless communications in the selected sensors are performed
sequentially.
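The first step of MTMS, turning communication edges into vertices, can be sketched as below. The sketch follows only the description given in this section (one new vertex per edge, carrying the edge's communication load); the function name and the data layout are hypothetical, and the original algorithm may treat broadcast communication differently.

```python
def to_hyper_dag(task_cost, edges):
    """Replace every communication edge with a communication vertex.

    task_cost: {task: computation cost}
    edges:     {(u, v): communication load} for each dependency u -> v
    Returns (vertex_cost, hyper_edges), where hyper_edges are unweighted.
    """
    vertex_cost = dict(task_cost)
    hyper_edges = []
    for (u, v), load in edges.items():
        comm = ("comm", u, v)        # new vertex standing for the edge u -> v
        vertex_cost[comm] = load     # its cost equals the communication load
        hyper_edges.append((u, comm))
        hyper_edges.append((comm, v))
    return vertex_cost, hyper_edges
```

After this transformation, computation and communication are both vertices, so a single scheduler can assign time slots to either kind of activity.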
All existing works regarding task scheduling in WSNs assume that the application runs inside a nonhierarchical WSN or in a single cluster. Therefore, in spite of
the network being hierarchically organized, since there is no cooperation among clusters, from the point of view of scheduling, the algorithms behave as in the case of flat
topologies. In large-scale networks, a flat topology may be inefficient in terms of energy
consumption. A technique that can be adopted to promote the network scalability and
also to achieve higher energy efficiency related to data transmission is the clustering of
the network, generating a hierarchical logical topology. Since the distance between cluster members and the respective cluster head is often smaller than the distance between
these sensors and the sink node, sensors in a cluster save transmission energy [Younis
et al. 2003]. Clustering can also be beneficial for purposes of energy saving because
it favors data fusion procedures. Cluster members collaborate about recent data measurements and determine how much information should be transmitted to the client
application. By averaging data values collected within the cluster, the algorithm can
trade data resolution for transmission power. Also for energy saving, in areas where
there are a redundant number of sensors, a clustering algorithm can be used to select
which nodes better represent data samples for the region and which nodes can be put
in power-saving mode.
For a clustered WSN, the solution of placing all the processing steps into a single
cluster will cause inefficient utilization of system resources. More importantly, different
clusters may be endowed with different functions, and/or some tasks may need to be executed on a particular cluster. For instance, in heterogeneous WSNs, different clusters
may contain nodes endowed with different sensing devices. Or the application may be
interested in monitoring particular geographic areas that are covered by two or more
different clusters. Therefore, a scheduling algorithm that considers a WSN composed
of several clusters, sometimes comprising different numbers and types of sensors, is
required in several application and deployment scenarios. Furthermore, considering
the optimization goals of our work, namely minimizing energy consumption, meeting
time requirements, and assuring load balancing, none of these papers considers all
of them simultaneously; at best, they consider two of them at the same time.
In this article, we assume that the application runs on multiple clusters of a
hierarchical system, and we address these three objectives in our algorithm
simultaneously, thus providing a more comprehensive solution that we believe will be
more energy efficient in the long term.
3. TECHNICAL PRELIMINARIES
In this section, we introduce the models used in our work. The hierarchical WSN system model,
the precedence-constrained collaborative application model, the communication contention model, and the energy
model are presented in Sections 3.1, 3.2, 3.3, and 3.4, respectively.
In the last subsection, we present the problem statement.
3.1. System Model
In this work we assume a heterogeneous and hierarchical WSN. Our hierarchical WSN
system is composed of three types of devices: sink node (also known as base station),
Cluster Head (CH), and sensor nodes. The system is placed into a two-dimensional
region. We place no restriction on the algorithm used for cluster formation and
cluster head election: several existing algorithms, such as the one used in Heinzelman
et al. [2000], can be adopted for this purpose. We only assume that, once the clusters
are established and the cluster head is elected, there is an exchange of messages so
that the CHs always know the geographical position, device types, and residual energy
of the sensors pertaining to their clusters. There is only one sink node in the system
that communicates with cluster heads through wireless connections. The cluster heads
are distributed around the sink node and each cluster head can communicate with
other cluster heads directly (one-hop communication) as well as with the sensors inside its cluster. Each cluster head is located in the middle of its cluster. Moreover, the cluster head
is a device with more resources, both regarding processing and energy, than ordinary
sensor nodes. It does not perform computation tasks, only communication tasks. Each
cluster is composed of a set of static and homogeneous wireless sensor nodes with single
omnidirectional antennas forming a multihop network. Each sensor node can perform
computation and communication tasks simultaneously. We assume that time synchronization [Elson and Romer 2003] is perfectly achieved among sensor nodes. The cluster
is modeled as an undirected graph $G = (V, E)$, where $V = (v_1, v_2, \ldots, v_n)$ represents the
set of sensors with no Dynamic Voltage Scaling (DVS) function, and $E = (e_1, e_2, \ldots, e_m)$
represents the set of all possible communication links for the sensors. Since we assume
the sensor nodes are homogeneous regarding their resources, for any given computational task, all the sensors are identical in their processing time. Similarly, all the
sensors have the same communication range with diameter $\mathrm{diam}(v) \le k$, where $k$ is
the maximum communication range that a sensor can reach. Furthermore, a communication link is assumed to work in half-duplex mode, which means the transmission
is in one direction at a time. In other words, the communication channel does not allow
transmitting and receiving data simultaneously.
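A minimal sketch of how the cluster graph G = (V, E) could be built from sensor positions is given below. It assumes a simple disk rule, namely that a link exists whenever two sensors are at most k apart; this rule, the function name, and the coordinate format are assumptions for illustration and are not specified in the model above.

```python
from itertools import combinations
from math import dist


def build_cluster_graph(positions, k):
    """Build the undirected cluster graph from 2-D sensor positions.

    positions: {sensor_id: (x, y)}
    k:         maximum communication range shared by all sensors
    Returns (V, E), with E a set of frozenset pairs (undirected links).
    """
    V = sorted(positions)
    E = {
        frozenset((u, v))
        for u, v in combinations(V, 2)
        if dist(positions[u], positions[v]) <= k  # assumed disk link rule
    }
    return V, E
```

Storing each link as a `frozenset` makes the pair unordered, which matches the undirected graph of the model.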
3.2. Application Model
A collaborative application with a set of precedence-constrained tasks can be modeled
as a Directed Acyclic Graph (DAG). A DAG $T = (V, E)$ comprises a set of vertices
$V$ representing a number of nonpreemptive tasks and a set of directed and weighted
edges $E$ representing the task dependence and the communication load among tasks.
The computation cost of a task is denoted by the number of CPU clock cycles to execute
Fig. 1. Examples of the two types of interferences: (1) primary interference, where node B sends and receives
data at the same time; (2) primary interference, where node B receives data from A and C at the same time;
(3) secondary interference, the hidden terminal issue.
the task. The time for performing a task on sensor $i$ can be determined by $t_i^{\mathrm{comp}} = N(i)/\mathrm{CPU}(i)$, where $N(i)$ is the computation cost of a task and $\mathrm{CPU}(i)$ is the CPU
speed of sensor $i$. Any given edge $E_{ij}$ represents the dependence between two
vertices $V_i$ and $V_j$, where $V_i$ is the parent task of $V_j$ and $V_j$ is
the child task of $V_i$. Therefore, $V_j$ cannot start executing until it receives all the
data from its parent task. To determine the communication time between two tasks,
the following equation is employed: $t_i^{\mathrm{comm}} = \mathrm{data}(i, j)/\mathrm{bandwidth}(i, j)$, where $\mathrm{data}(i, j)$ is
the amount of data that needs to be transferred from task $i$ to task $j$ and $\mathrm{bandwidth}(i, j)$
is the bandwidth of the communication link between the sensor nodes to which task $i$ and task $j$ were assigned. In a DAG, a task without parent tasks is known as an entry
task and a task without child tasks is known as an exit task. For most applications in
WSNs, the entry task is the task that performs sensing or gathers raw data. Hence, in our
problem formulation, we stipulate that the number of entry tasks is not less than the number
of sensor clusters in the system, which we call the task graph constraint, and we restrict the
entry task assignment in the system. The entry task assignment restriction includes
two aspects: (1) each sensor cluster has at least one entry task assigned at each round;
(2) each sensor is allowed to process at most one entry task at each round.
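The two timing formulas and the task graph constraint of this subsection translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the units (cycles and cycles per second, bits and bits per second) are assumptions, since the model does not fix them.

```python
def computation_time(cycles, cpu_speed):
    """t_comp = N(i) / CPU(i): clock cycles divided by CPU speed."""
    return cycles / cpu_speed


def communication_time(data, bandwidth):
    """t_comm = data(i, j) / bandwidth(i, j) for one dependency edge."""
    return data / bandwidth


def entry_tasks(parents):
    """Tasks without parent tasks; parents maps task -> list of parents."""
    return [task for task, p in parents.items() if not p]


def satisfies_task_graph_constraint(parents, num_clusters):
    """The number of entry tasks must not be less than the number of clusters."""
    return len(entry_tasks(parents)) >= num_clusters
```

For example, a task of 2,000,000 cycles on a 4 MHz processor takes 0.5 s under these assumed units, and a graph with two entry tasks satisfies the constraint for at most two clusters.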
3.3. Interference Model
To model the communication interference in wireless multihop networks, primary and
secondary interferences are defined in Krumke et al. [2001]. Primary interference occurs when a sensor node is simultaneously involved in more than one communication
action. Typical examples are sending and receiving at the same time, or receiving from two different transmitters. This constraint implies that the radios of sensor
nodes must work in half-duplex mode, which means the radio cannot transmit and
receive data simultaneously. In fact, most radios used in real-world deployments
of WSNs are half duplex for reasons of monetary cost (full-duplex radios are much
more expensive). Under such a hardware configuration, primary interference can
be easily determined since no sensor node is capable of sending and receiving
data at the same time. In contrast, secondary interference occurs when a receiver
communicates with a particular transmitter but also unexpectedly receives data from
another transmitter. This behavior is described as cochannel interference produced
by simultaneous transmissions of different nodes at the same frequency. Examples of
primary interference and secondary interference are given in Figure 1.
To determine the wireless communication interference, two kinds of models are generally used: the graph-based protocol model and the Signal-to-Interference-plus-Noise-Ratio (SINR)-based physical model [Durmaz Incel et al. 2011]. Both the graph-based
protocol model and SINR model have their own merits. The SINR model focuses on the
effects of the aggregate interference observed by the receiver, due to all other transmitters, while the graph-based protocol model describes the interference between two
terminals and abstracts several aspects of communication that may complicate the interference modeling. Due to its consequent simplicity, the graph-based protocol model
has been extensively adopted in the design and evaluation of communication protocols,
as discussed in Cardieri [2010], Chafekar et al. [2007], and Moscibroda et al. [2006].
On the other hand, it is widely accepted that the SINR model is more accurate than
the graph-based protocol model. However, this higher accuracy comes at the expense of
increasing the complexity of problems, such as scheduling and topology control, when
the SINR model is used, as reported by Goussevskaia et al. [2007] and Blough et al.
[2009]. Therefore, as the author concluded in Cardieri [2010], the graph-based protocol
model is suitable in design and evaluation of communication protocols, particularly
appropriate for solving problems in the context of topology control and transmission
scheduling, while the SINR model is more appropriate for capacity evaluation and
for studying other physical-layer-related issues. In our scenarios, the scheduling of the
communication tasks falls within the scope of transmission scheduling, for which the graph-based protocol model is the suggested way to determine the communication
interference. Moreover, in practical scenarios the simple SINR model does not work
very well either, mainly because it is inherently geometric. In the real world, antennas are not perfectly isotropic, and even more importantly the environment is often
obstructed by several objects, for example, vegetation, cars, and walls. Although these
issues can be integrated into the simple SINR model to produce improved
SINR models, such models are prone to becoming extremely complicated. Consequently,
the highly complex SINR model is not necessarily the right choice for determining the communication interference, especially when the developers are not experts in communication theory or the key research question is not directly related to hardware issues
[Moscibroda et al. 2006]. The graph-based protocol model, on the other hand, automatically incorporates the issues of imperfect (or even directional) antennas, as well as
terrains with obstructions. Consequently, the graph-based protocol interference model
is widely adopted in works addressing scheduling in multihop wireless networks and
higher-layer protocol design, since the conflict graph can be easily established to characterize the interference in the networks [Jain et al. 2005]. For the aforementioned reasons, we believe that the graph-based protocol model is adequate and
appropriate for addressing the high-level communication task scheduling issue in our
study.
In the graph-based protocol model, each sensor node has a fixed transmission range r
and an interference range R, where R > r. We denote the ratio of the interference
range to the transmission range as $\gamma = R/r$. In practice, $2 \le \gamma \le 4$ [Ma et al. 2009].
For ease of analysis, we select $\gamma = 2$. This indicates that we should
avoid both types of interference; thus, a sensor can only transfer data to its destination
at a specific time slot when none of its 2-hop neighbors (in the topology graph) is
transmitting data. This strategy for avoiding communication interference
within the WSN system is adopted in our work for the algorithm design and the
simulations carried out to evaluate our proposal.
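The 2-hop rule described above can be sketched as a simple check on the topology graph. This is an illustrative sketch only; the adjacency-dictionary representation and the function names are assumptions, not part of the article.

```python
def two_hop_neighbors(adj, node):
    """Nodes within two hops of `node` in the topology graph (node itself excluded)."""
    one_hop = set(adj[node])
    result = set(one_hop)
    for neighbor in one_hop:
        result.update(adj[neighbor])
    result.discard(node)
    return result

def can_transmit(adj, node, transmitting):
    """True if no 2-hop neighbor of `node` is transmitting in the current time slot."""
    return not (two_hop_neighbors(adj, node) & set(transmitting))

# Hypothetical line topology a - b - c - d
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(can_transmit(adj, "a", {"c"}))  # c is two hops away from a
print(can_transmit(adj, "a", {"d"}))  # d is three hops away from a
```

In a slot-based schedule, a transmission would be placed in the earliest slot for which `can_transmit` holds.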
3.4. Energy Model
The energy model used in our article is adopted from Al-Obaidy and Ayesh [2008],
Baokang et al. [2008], and Yuan and Eylem [2007] and is summarized briefly below. Let $E_{comp}$
be the energy consumed by a task running on a sensor. We denote the energy
consumption of executing N clock cycles with CPU speed f as
\[ E_{comp}(V_{dd}, f) = N C V_{dd}^{2} + V_{dd} \left( I_0 e^{\frac{V_{dd}}{n V_T}} \right) \frac{N}{f}, \tag{1} \]
\[ f \simeq K (V_{dd} - c), \tag{2} \]
where $V_T$ denotes the thermal voltage and $C$, $I_0$, $n$, $K$, and $c$ are processor-dependent
parameters.
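As a sketch of how Eqs. (1) and (2) can be evaluated, the snippet below assumes that Eq. (1) is the sum of a switching term $N C V_{dd}^2$ and a leakage term $V_{dd} I_0 e^{V_{dd}/(n V_T)} N/f$, and that Eq. (2) relates frequency and supply voltage linearly. The function names and the default thermal voltage of 26 mV are assumptions for illustration; the processor-dependent parameters must be taken from the cited hardware references.

```python
import math

def frequency(v_dd, K, c):
    """Eq. (2): clock frequency supported at supply voltage v_dd."""
    return K * (v_dd - c)

def comp_energy(n_cycles, v_dd, f, C, I0, n, V_T=0.026):
    """Eq. (1): switching energy plus leakage energy for n_cycles clock cycles.

    V_T defaults to 26 mV, a commonly used room-temperature value (assumption).
    """
    switching = n_cycles * C * v_dd ** 2
    leakage = v_dd * I0 * math.exp(v_dd / (n * V_T)) * (n_cycles / f)
    return switching + leakage
```

Because the leakage term scales with N/f, running the same number of cycles at a lower frequency increases the leakage share of the total.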
Considering the fact that a sensor consumes different energy when it sends or receives the same amount of data over the same distance within its communication
range, we distinguish them as Etx (l, d) and Erx (l), where l represents the size of the
transmission data and d represents the distance between two communication nodes.
The equations are listed as
\[ E_{tx}(l, d) = E_{elec}\, l + \varepsilon_{amp}\, l\, d^{\alpha}, \tag{3} \]
\[ E_{rx}(l) = E_{elec}\, l, \tag{4} \]
where $E_{elec}$ and $\varepsilon_{amp}$ are hardware-related parameters, which denote radio energy dissipation and transmission amplifier energy dissipation, respectively. Eqs. (3) and (4) are
known as the first-order radio model [Heinzelman et al. 2000], which has been a well-known
and generally accepted formula for estimating the energy consumption of communication
in the field of low-energy radios (e.g., sensor radios) over the last decade. The value of $\alpha$ in
Eq. (3) can vary from 2 to 4; in our work we adopted the value of 2. In addition,
the values of hardware-related parameters used in Eqs. (1) to (4) can be found in Wang
and Chandrakasan [2002] and Shih et al. [2001].
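The first-order radio model of Eqs. (3) and (4) is straightforward to compute. The sketch below assumes the path-loss exponent is passed as `alpha` (default 2, the value adopted in the article); the function names and the numeric values in the usage lines are illustrative assumptions, not the parameter values used in the article.

```python
def tx_energy(l_bits, d, e_elec, eps_amp, alpha=2):
    """Eq. (3): energy to transmit l_bits over distance d."""
    return e_elec * l_bits + eps_amp * l_bits * d ** alpha

def rx_energy(l_bits, e_elec):
    """Eq. (4): energy to receive l_bits."""
    return e_elec * l_bits

# Illustrative values only: 50 nJ/bit electronics, 100 pJ/bit/m^2 amplifier
e_tx = tx_energy(1000, 10, 50e-9, 100e-12)
e_rx = rx_energy(1000, 50e-9)
print(e_tx, e_rx)
```

Sending always costs at least as much as receiving the same data, and the gap grows with $d^{\alpha}$, which is why inter-cluster links are more expensive than intra-cluster ones.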
It is important to note that the formulas for calculating computation and communication energy consumption that we have introduced in this section only consider the
energy expenditure directly associated with task execution. The other possible energy
consumptions for a sensor node, such as radio state transition, are not explicitly taken
into consideration. The main reason for making such an assumption is that our proposed scheduling scheme focuses on how to save energy by allocating tasks onto sensor
nodes under certain given constraints so as to extend the system lifetime. In other
words, our proposed algorithm is only concerned with the energy consumption of sensor nodes in the execution period of the application, not preexecution or postexecution.
The generated scheduling scheme determines the duty cycle and sleep cycle of sensor
nodes in each application execution period, where sensor nodes switch to sleep mode
for reducing energy consumption when no communication and computation tasks are
scheduled for them. Ideally, this scheme eliminates the energy that a radio would consume in
other possible states (e.g., idle listening) if a schedule-based
MAC-level protocol (e.g., TDMA) is employed in the system. This is a consequence of the
fact that the schedule-based MAC-level protocol has the great advantage of avoiding all
energy waste due to collision, idle listening, and overhearing since it is inherently collision free and it can simply follow the generated scheduling scheme from our proposed
algorithm to notify each radio in the sensor nodes when the radio should be active
and, more importantly, when not. With this important characteristic, the idle listening
can be ruled out since sensor nodes know beforehand when to expect incoming data
[Langendoen and Halkes 2005]. Following this reasoning, if a TDMA-based protocol is running on the target WSNs, there is no need to consider the energy consumption
spent in idle listening mode as a part of the total energy consumption for performing communication tasks. In our study, unless specifically stated, the schedule-based
MAC-level protocol is assumed to be the default case, thus the energy consumption of
communication tasks is only computed from the first-order radio model.
3.5. Problem Statement
The goal of the task assignment and scheduling problem is to find a scheme of
task allocations and their execution sequences on a given set of processors that
optimizes the designed objective function. Let P represent the mapping function of a
task assignment, which means that a computation task $T_i$ is assigned to sensor i. After
the task allocation, a communication link is considered between nodes whenever correlated tasks are not allocated to the same node. Let us suppose that $E^{P}_{comp}(T_i)$ represents
the energy consumption of processing a computation task $T_i$, and $E^{P}_{comm}(T_i)$ represents
the communication energy consumption of a task $T_i$, including receiving data from
its predecessor tasks and transmitting the intermediate result of $T_i$ to its successor
tasks. Thus, $E^{P}_{comm}(T_i) = \sum_{j=1}^{n} E_{tx}(l, d) + \sum_{k=1}^{m} E_{rx}(l)$, where n is the number of
successor tasks of task $T_i$ (indexed by j) and m is the number of predecessor tasks of task
$T_i$ (indexed by k). The goal of our algorithm is to find a scheme that prolongs the system lifetime
by managing the energy consumption (including reducing and balancing energy consumption) subject to the time constraint of the application. The designed objectives are
given next.
Find a scheme
\[ P^{0} = \arg\min\, energy(P) \tag{5} \]
to minimize
\[ E^{P}_{total} = \sum_{i=1}^{n} E^{P}_{comp}(T_i) + \sum_{i=1}^{n} E^{P}_{comm}(T_i) \tag{6} \]
subject to
\[ Length(P) \le ADL, \tag{7} \]
where energy(P) and Length(P) are respectively the overall energy consumption and the
schedule length of P, and ADL is the deadline of the application. In general, the task
allocation and scheduling problem has been proven to be NP-complete [Zomaya
1995]. As a consequence, heuristic algorithms are employed to find good, though not necessarily optimal, solutions in
polynomial time.
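To make the objective of Eqs. (5) to (7) concrete, the sketch below enumerates every mapping of a tiny instance and keeps the feasible one with the lowest energy. It is an exhaustive baseline written for illustration, not the TPTS heuristic; `energy` and `length` are hypothetical callbacks supplied by the caller, and the toy cost functions are invented for the example.

```python
from itertools import product

def best_mapping(tasks, sensors, energy, length, adl):
    """Exhaustive arg-min of energy(P) subject to length(P) <= adl.

    Enumerates len(sensors) ** len(tasks) mappings, so it is only
    usable for very small instances.
    """
    best, best_energy = None, float("inf")
    for choice in product(sensors, repeat=len(tasks)):
        P = dict(zip(tasks, choice))
        if length(P) > adl:
            continue  # violates the deadline constraint of Eq. (7)
        e = energy(P)
        if e < best_energy:
            best, best_energy = P, e
    return best, best_energy

# Toy instance (all numbers are made up for illustration)
COMP_ENERGY = {"s1": 1.0, "s2": 2.0}

def toy_energy(P):
    comp = sum(COMP_ENERGY[s] for s in P.values())
    comm = 3.0 if P["t1"] != P["t2"] else 0.0  # no link cost on the same node
    return comp + comm

def toy_length(P):
    return 8.0 if P["t1"] == P["t2"] else 5.0  # parallel execution is faster
```

With a loose deadline the cheapest mapping co-locates both tasks; tightening the deadline forces them onto different sensors at a higher energy cost. The exponential size of the search space is what motivates a heuristic.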
4. THE THREE-PHASE TASK SCHEDULING ALGORITHM
The proposed algorithm, TPTS, aims to achieve the best trade-off between energy
saving and load balancing while meeting the deadline constraint of the application.
To the best of our knowledge, this is the first work that simultaneously addresses
these three different goals, trying to encompass all the major problems involved in
performing distributed tasks in WSNs. Our solution includes three major components,
namely: (i) the deadline distribution algorithm, (ii) the energy-load tunable graph
partitioning algorithm, and (iii) the multi-objective contention-aware list scheduling
algorithm. The first two algorithms run on the sink node and their goal is to generate
subtask graphs with relative deadlines for the clusters. The multi-objective contention-aware list scheduling algorithm runs on the cluster heads, which are referred to as
local schedulers in our work, and its goal is to generate a scheduling scheme for that
particular cluster. After running the scheduling algorithm, the cluster head assigns
each task of the allocated subtask graph to a proper sensor to meet the application's
deadline with the minimum possible energy consumption while balancing the load
among the sensors within the cluster.
4.1. The Deadline Distribution Algorithm
We consider a time-constrained application, which is restricted to commence at a specified time and to be finished within a given time span called the end-to-end deadline.
In our case, the distributed application is running on multiple clusters in the WSN
system. Performing inter-task communication among clusters results in additional
communication overhead in the system. Such inter-cluster communication leads to not
only more energy consumption on sensors, but also to a potential increase in the communication contention within the system. A good strategy to guarantee the application
can be finished in time is to distribute the end-to-end deadline over tasks before the
application is partitioned into subtask graphs and allocated to different clusters. As a
consequence, the global scheduling problem is converted into local scheduling problems,
for each of which a local scheduler finds its best scheme. If each local scheduler can
guarantee its own subdeadline, the entire application execution will be guaranteed to
complete within the end-to-end deadline. Moreover, the individual task deadline can
provide more information to the local scheduler so that it can make fine-tuned and
more efficient decisions at the scheduling phase.
The conventional deadline distribution techniques [Saksena and Hong 1996;
Abdelzaher and Shin 1995] can be appropriately performed when the task-processor
assignment is known in advance. On the other hand, task assignment techniques
require information about individual task deadlines for generating a scheduling
scheme. As a consequence, a circular dependency between the deadline distribution
and task-processor assignment occurs. In this article, we adopt and modify the
deadline distribution algorithm proposed in Jonsson and Shin [1997] to distribute
the end-to-end deadline to each task in the task graph prior to the task-processor
assignment phase. The main idea of this deadline distribution algorithm is to put
all the tasks, including computation and communication tasks, into a set and find
out a critical path for the tasks in the set at each loop until the set is empty.
Since the task-processor assignment is not known before the deadline distribution,
the real communication cost cannot be estimated accurately. Two communication
cost estimation strategies are involved in the original design, one called CCNE
(Communication Cost NonExisting) strategy, which assumes that there will never
be inter-processor communication between tasks, and the other one called CCAA
(Communication Cost Always Assumed), which assumes that there always will be
inter-processor communication between tasks. A breadth-first traversal is employed
to determine the critical path for the tasks in the set, that is, the path with the minimum value of the
evaluation metric R. Although a task graph can contain more than one critical path,
once the critical path has been determined, the end-to-end deadline is distributed over
the tasks belonging to the critical path. The deadline distribution is governed by the
constraint that the ready time of a task must be equal to the deadline of its predecessor
in the critical path. Thus, all tasks in the path will be assigned their ready time
and end-to-end deadline. After this step, the ready time and deadline for those tasks
connected to the tasks in the critical path can be calculated. The ready time for each
task not in the critical path is set to the latest deadline of any predecessor task in the
critical path. Similarly, the deadline for each task not in the critical path is set to the
earliest ready time of any successor task in the critical path. Then, these critical-path
tasks are removed from the task graph. The preceding steps are continually performed
until all the tasks have been assigned their ready time and deadline. In our algorithm,
without knowing the task-processor assignment and inter-processor communication
cost, both computation and communication tasks are taken into account when finding
a critical path, which is known as the CCAA strategy in the original algorithm.
However, a different metric of R, called Rcomm , is adopted in our algorithm to determine
the critical path. Furthermore, the value of the distributed deadline should not be less
than the total actual execution time of the tasks in the critical path. The time difference
between the actual execution time and the deadline is called slack [Jonsson and Shin
1997]. It also represents the maximum amount of time that the execution of the task
can be delayed without missing its subdeadline. The pseudocode is given in Figure 2.
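The rule for tasks outside the critical path (ready time from the latest predecessor deadline on the path, deadline from the earliest successor ready time on the path) can be sketched as below. The data layout and the fallback values `start` and `end`, used when a task has no neighbor on the path, are assumptions for illustration.

```python
def off_path_window(task, preds, succs, on_path, start=0.0, end=None):
    """Ready time and deadline of a task that is not on the critical path.

    preds, succs: dict task -> list of predecessor / successor tasks
    on_path:      dict task -> (ready_time, deadline) for critical-path tasks
    start, end:   fallbacks when no predecessor / successor is on the path
                  (assumption, not specified in the text)
    """
    ready = max(
        (on_path[p][1] for p in preds.get(task, ()) if p in on_path),
        default=start,
    )
    deadline = min(
        (on_path[s][0] for s in succs.get(task, ()) if s in on_path),
        default=end,
    )
    return ready, deadline
```

After these windows are fixed, the critical-path tasks are removed and the procedure repeats on the remaining tasks.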
Several definitions of R are discussed in Jonsson and Shin [1997] for finding the
critical path, including: (i) allocate the slack to the sum of actual execution times of
all tasks in a path, (ii) allocate the slack to the proportion of task execution time,
(iii) allocate the slack to the sum of estimated execution time of all tasks in a path.
Fig. 2. The deadline distribution algorithm.
The common feature of these three definitions is the trend to allocate the slack onto
computation tasks in the critical path. However, none of these approaches is suitable
for our case for two reasons. First, since the sensors in our system are not equipped with
DVS function, allocating the slack to computation tasks contributes nothing to energy
saving. Second, the communication in the system will cause contention. An inter-task communication may take longer to complete due to the late start caused
by wireless communication interference. Allocating additional time to communication
tasks will increase the possibility that an application can be finished in time. Therefore,
we define a new approach, which only distributes the available path slack to the sum
of communication task execution time in the critical path. The ratio of slack that each
communication task in the critical path can be allocated is computed as

\[ R_{comm} = \frac{D - \sum_{t_i} c_i}{\sum_{t_{comm_j}} c_j}, \tag{8} \]
where D denotes the end-to-end deadline of a critical path, $\sum_{t_i} c_i$ denotes the total cost
of computation and communication tasks in a critical path, and $\sum_{t_{comm_j}} c_j$ denotes the
total communication cost in a critical path. Therefore, the deadline of a communication
task $t_{comm_j}$ can be computed as $ReadyTime(t_{comm_j}) + c_j (1 + R_{comm})$. The deadline of
a computation task simply equals its ready time (the earliest time at which the task can start processing) plus its execution cost. This approach gives more time to the communication
tasks, so that they can still meet their deadlines when they have a
late start due to communication contention in the system.
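The slack distribution of Eq. (8) can be sketched for a single critical path as follows. The snippet assumes that $R_{comm}$ is the path slack (deadline minus total cost) divided by the total communication cost, that the path starts at time 0, and that each task's ready time equals the deadline of its predecessor on the path. The tuple layout and names are illustrative.

```python
def distribute_deadline(path, D):
    """Distribute the end-to-end deadline D over one critical path.

    path: list of (name, kind, cost) in execution order, with kind
          either "comp" or "comm" (hypothetical layout).
    Returns (r_comm, windows), windows mapping name -> (ready, deadline).
    """
    total = sum(cost for _, _, cost in path)
    comm_total = sum(cost for _, kind, cost in path if kind == "comm")
    slack = D - total
    if slack < 0:
        raise ValueError("deadline is shorter than the total path cost")
    r_comm = slack / comm_total if comm_total > 0 else 0.0

    windows, ready = {}, 0.0
    for name, kind, cost in path:
        # only communication tasks receive a share of the slack
        budget = cost * (1 + r_comm) if kind == "comm" else cost
        windows[name] = (ready, ready + budget)
        ready += budget  # next ready time = this task's deadline
    return r_comm, windows

path = [("t1", "comp", 2), ("c12", "comm", 1), ("t2", "comp", 3),
        ("c23", "comm", 1), ("t3", "comp", 1)]
r, w = distribute_deadline(path, 10)
print(r, w["c12"], w["t3"])
```

In this example the path cost is 8 and the slack is 2, so each unit of communication cost receives one extra unit of time, and the last task's deadline coincides with D.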
4.2. The Energy-Load Tunable Graph Partitioning Algorithm
The distribution of tasks among cooperating clusters is a key factor to balance the
workload in the system. This step attempts to find a solution to divide the task graph
into k disjoint partitions and distribute them over sensor clusters to balance the
workload of computational tasks by considering the residual energy of clusters, while
minimizing inter-cluster communication cost to reduce overall energy consumption
and shorten the schedule length. This is the well-known graph partitioning problem
[Zomaya 1995]. Since the complexity of the graph partitioning problem is normally
NP-complete [Garey et al. 1976], several heuristic methods have been developed to
obtain suboptimal results in reasonable time. Most proposed solutions are designed
to satisfy only a single constraint and to minimize only a single objective. In our case,
minimizing inter-cluster communication (referred to as energy for short) and balancing workload
among sensor clusters (referred to as load for short) are two different objectives. In general, when
two computational tasks are grouped into the same partition, the communication cost
between them is considered to be 0. Besides, the distance between sensor clusters is
greater than the distances among nodes within a cluster. Therefore, more energy is
consumed when the same amount of data is transmitted between clusters than
within a cluster. The energy consumption can thus be minimized by
grouping as many connected tasks as possible into the same task group and allocating
such a group to the same sensor cluster. However, extending the system lifetime
implies not to overuse the energy of any cluster. This requires the computational
workload to be distributed evenly among clusters according to their remaining energy.
Grouping tasks into the same partition can be achieved by collapsing vertices and
edges of the task graph through a sequence of steps. This process is known as the
coarsening phase [Zomaya 1995]. Meanwhile, the operation preserves the properties of
the graph that are essential for finding a desired partition. There are many techniques
that can be used in the coarsening phase for grouping the vertices of the graph into disjoint partitions, and collapsing the vertices of each cluster into a single vertex. Heavy
Edge First Merge (HEFM) [Zomaya 1995] is the most popular algorithm for coarsening
due to its simplicity and relatively good performance. For a given task graph, HEFM
selects the heaviest weight edge from the task graph and merges the vertices connected
to that edge. The algorithm runs iteratively until the expected number of clusters is
reached. However, this greedy coarsening process tends to generate a vertex whose weight is extremely
large compared to the weights of other vertices. Moreover, the entry-task
assignment restriction adopted in our work can be violated through the coarsening
process.
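A minimal sketch of plain HEFM coarsening, as described above, is given below. It treats the task graph as undirected for merging purposes and ignores the workload bound and the entry-task restriction, which is exactly the weakness the text points out. The data layout (dictionaries keyed by vertex name and by edge tuple) is an assumption for illustration.

```python
def hefm(vertex_weight, edge_weight, k):
    """Heavy Edge First Merge: merge the endpoints of the heaviest edge
    until k coarse vertices remain.

    vertex_weight: dict vertex -> computation weight
    edge_weight:   dict (u, v) -> communication weight
    Returns (groups, weights, cut), where cut is the remaining
    inter-partition communication cost.
    """
    groups = {v: {v} for v in vertex_weight}
    weight = dict(vertex_weight)
    edges = dict(edge_weight)
    while len(groups) > k and edges:
        (u, v), _ = max(edges.items(), key=lambda item: item[1])
        groups[u] |= groups.pop(v)      # collapse v into u
        weight[u] += weight.pop(v)
        merged = {}
        for (a, b), w in edges.items():
            a = u if a == v else a
            b = u if b == v else b
            if a == b:
                continue                # internal edge: cost becomes 0
            key = (a, b) if a <= b else (b, a)
            merged[key] = merged.get(key, 0) + w
        edges = merged
    return list(groups.values()), weight, sum(edges.values())

vw = {"a": 1, "b": 1, "c": 1, "d": 1}
ew = {("a", "b"): 5, ("b", "c"): 1, ("c", "d"): 4, ("a", "c"): 2}
print(hefm(vw, ew, 2))
```

Parallel edges created by a merge are summed, so the weight of a coarse edge is the total communication between the two groups.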
Unlike a single-objective graph partitioning algorithm, finding an optimal solution
for multi-objective graph partitioning is ambiguous, since there is no single overall optimal solution for the multi-objective graph partitioning, although there is an optimal
solution for each one of the individual objectives. Furthermore, a feasible solution that
provides good performance for one objective may result in worse performance for the
other objective. Before developing an algorithm that finds a good
solution for multi-objective graph partitioning, we introduce three related concepts: Pareto-optimal
points, Pareto optimality, and the Pareto frontier. In optimization theory, the
solutions of interest are those that are not dominated by any other solution, regardless of
whether they have optimal values for any of the objectives. These are called Pareto-optimal points [Makowski 1994]. A solution is Pareto-optimal [Makowski 1994] if there
is no feasible solution for which one can improve the value of any objective without
worsening the value of at least one other objective. The set of all Pareto-optimal points
is called the Pareto frontier [Makowski 1994]. Please note that multiple-objective optimization problems generally have many Pareto-optimal solutions. In the context of the
multi-objective partitioning problem, the user should specify the area along the Pareto
frontier in which she is interested, and performing such actions can be achieved by
controlling the trade-offs among the objectives. This is particularly difficult for objectives that are dissimilar in nature since they cannot be easily combined. Normalization
is a possible approach to combining dissimilar objectives. After that, the proposed algorithm computes the solution based on the user-specified preference values.
With these given values, the algorithm will allow one objective to move away from its
optimal value by some amount only if the other objective moves toward its optimal
value by more than that amount. Overall, the proposed algorithm should satisfy the
following criteria: (i) appropriately handle similar or dissimilar objectives by using
normalization, (ii) allow users to control the trade-offs among the objectives, and (iii) produce results that are predictable and intuitive given the
user's inputs. To satisfy criterion (i) in our multi-objective graph partitioning problem,
each produced result needs to be normalized by the results generated by the selected
approaches for each individual objective, where one is energy related and the other
is load related. Let C1 and C2 be the best results regarding the partitioning of the
given graph with respect to a single objective, energy and load balance, respectively. To
reduce energy consumption of running a DAG on multiclusters, it is important to minimize the inter-cluster communication cost between partitions. This is due to the fact
that the communication cost within the same partition is considered as 0, as well as
the fact that no energy-saving technique is adopted to reduce the energy consumption
of computation tasks. $C_1$ denotes the minimum sum of inter-cluster communication
cost between the designated number of partitions. This value can be obtained simply
by applying HEFM to the graph, because this greedy search method always sorts the
edges in decreasing order and merges the tasks connected to the heaviest edge.
When the designated number of partitions is reached, the remaining edges
are those with the smallest values in the graph. The ideal load balance occurs whenever the ratio of the workload allocated to each cluster is the same as the ratio of
remaining energy of sensor clusters. Under this circumstance, the load imbalance is
0. However, this situation is hard to achieve since the task workload often does not
perfectly match the remaining energy of sensor clusters. To achieve a practical best
load balance according to the remaining energy of each cluster, the graph partition
problem can be modeled as a multibin packing problem [Lemaire et al. 2006], which
aims at putting m objects of different sizes (tasks) into n bins with different heights
(clusters) with a specific objective (e.g., minimize the sum of height difference between
bins). The multibin problem can be easily narrowed down to a maximum cardinality
bin packing problem [Labbe et al. 2003] which tends to maximize the number of objects
packed into the n bins without exceeding bin capacities and splitting objects. Since the
bin packing problem is NP-hard [Coffman et al. 1997], it is unlikely that there exists a
polynomial-time algorithm to solve it optimally. Therefore, we use C2 to represent the
minimum sum of load imbalance between clusters, and the value of C2 is generated
by the best-fit rule [Lemaire et al. 2005], which is a heuristic solution for solving the
multibin packing problem. Let $C_1^0$ and $C_2^0$ be the inter-cluster communication cost and
load imbalance of a partitioning solution generated by our proposed graph partitioning
algorithm, respectively. We compare $C_1^0$ and $C_2^0$ to $C_1$ and $C_2$ separately, obtaining two ratios.
For each objective, a smaller ratio indicates a better solution for
that objective. To evaluate the overall performance, we add these two ratios up, which gives
\[ P = \frac{C_1^0}{C_1} + \frac{C_2^0}{C_2}, \tag{9} \]
where P denotes the quality of a solution generated by the multi-objective graph partitioning algorithm. A smaller value of P represents a better solution.
Eq. (9) can only generate a suitable graph partition when the application assigns
equal importance to the two objectives (energy and load). However, this is not a
general solution for all WSN applications, since they can have different requirements
and priorities for these two objectives. In order to satisfy criteria (ii) and
(iii), we introduce a user-defined parameter to tune the importance of each objective.
Fig. 3. The energy-load tunable graph partitioning algorithm.
Therefore, the previous equation will become

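\[ P = \lambda \frac{C_1^0}{C_1} + (1 - \lambda) \frac{C_2^0}{C_2}, \tag{10} \]
where $\lambda$, with $0 \le \lambda \le 1$, denotes the user-defined parameter.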
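The effect of the user-defined weight in Eq. (10) can be illustrated with a short sketch. It assumes Eq. (10) is a convex combination of the two normalized ratios; the weight is called `lam` here, and the function name and candidate values are made up for the example.

```python
def partition_quality(c1_0, c2_0, c1, c2, lam=0.5):
    """Weighted quality of a partition in the sense of Eq. (10); lower is better.

    c1_0, c2_0: inter-cluster communication cost and load imbalance of the
                candidate partition
    c1, c2:     reference values for each single objective (must be nonzero)
    lam:        user-defined weight in [0, 1]; larger values favor energy
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * (c1_0 / c1) + (1.0 - lam) * (c2_0 / c2)

# Two hypothetical candidates: A is good on energy, B is good on load
candidates = {"A": (10.0, 4.0), "B": (20.0, 2.0)}
C1, C2 = 10.0, 2.0
for lam in (0.8, 0.2):
    best = min(candidates,
               key=lambda name: partition_quality(*candidates[name], C1, C2, lam))
    print(lam, best)
```

With `lam = 0.5` the ranking of candidates is the same as under Eq. (9), since both ratios are then weighted equally. A reference load imbalance of exactly 0 would need separate handling, because the ratio is undefined in that case.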
Taking all the aforesaid factors into account, we propose an algorithm with finetuned control of the trade-offs among the objectives and the restrictions, called the
energy-load graph partitioning algorithm. The pseudocode is given in Figure 3.
The algorithm is composed of three steps, namely: (i) local heaviest-edge-first merge,
(ii) trade-off-based vertices merge, and (iii) narrow down. Based on the value of the
residual energy of each cluster and the user preference tolerance ranging from 0 to
1, the maximum workload of a cluster is computed. The user preference tolerance is
a parameter to set how much load imbalance can be tolerated inside the system. The
larger the tolerance, the less important load balancing is. Accordingly,
the maximum workload is used as an upper bound to prevent the heaviest-edge-first
merge algorithm from generating a coarse vertex with a very large weight. Besides, to ensure
that the vertices we operate on at each coarsening phase contain at most one entry task,
we use the local heaviest weight edge instead of the global heaviest weight edge. In step 1, for each
entry task, we select the heaviest weight edge connected to it and put this edge into a
list. Then, we select the heaviest weight edge from the list. If the vertices connected to
the selected edge contain only one entry task after the merger, the action is actually
performed and the selected edge is removed from the list; otherwise, this edge is simply
removed from the list without merge. The procedure continues until the maximum
workload constraint for clusters no longer holds; then either the algorithm moves into
step 2 or, if no more vertices can be merged, the algorithm terminates. In step 2,
all the remaining unprocessed edges are put into a new list, and the lightest weight
edge is selected from this list. For all the coarse vertices containing an entry task,
we evaluate the quality of a potential solution by merging the vertex connected to
the lightest weight edge with every coarse vertex. The solution providing the smallest
value of P is selected and the actual merge is performed. In this step, we select
the lightest weight edge from the unprocessed edge list in order to reduce the inter-cluster
communication cost incurred by the vertex merging. Let us assume that after
step 1, several remaining vertices are still waiting to
be merged with m coarse vertices. The merge decision in step 2 is determined by
Eq. (10). This means that any remaining vertex has only a 1/m chance of being merged with its
original parent node without causing extra inter-cluster communication cost. As we
know, communication is a major source of energy consumption in a WSN, and
inter-cluster communication consumes more energy than intra-cluster
communication. Consequently, selecting the lightest weight edge aims mainly
at reducing the possible inter-cluster communication cost. As mentioned earlier, the
comparable standard C1 for the minimum inter-cluster communication is generated by
using the HEFM algorithm. The comparable standard C2 for achieving the best load
balance is modeled as a multibin packing problem, and the best-fit rule [Lemaire et al.
2005] is used to obtain the result. If a task graph contains the same number of
entry tasks as the number of sensor clusters in the hierarchical WSN, the algorithm
is completed and terminates here; otherwise, the algorithm proceeds to step 3. At this
point, the number of coarse vertices is still greater than the number of sensor
clusters. The purpose of this step is to keep merging the remaining coarse vertices
subject to the stated objectives. Based on Eq. (10), the algorithm evaluates every
possible pair of remaining vertices, finds the best solution, and collapses
the two corresponding vertices. The procedure continues until the remaining number of vertices
is eventually equal to the number of sensor clusters.
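Step 3 (narrow down) can be sketched as a loop that tries every pair of coarse vertices, scores the resulting partition, and keeps the best merge. The snippet is a simplified illustration: the scoring function is passed in by the caller, and the load-imbalance measure used in the example (difference between the heaviest and lightest group) is an assumption, since the article relates load to the residual energy of each cluster.

```python
from itertools import combinations

def cut_cost(groups, edges):
    """Sum of edge weights whose endpoints lie in different groups."""
    owner = {t: i for i, g in enumerate(groups) for t in g}
    return sum(w for (u, v), w in edges.items() if owner[u] != owner[v])

def load_imbalance(groups, task_weight):
    """Simplified imbalance: heaviest group load minus lightest (assumption)."""
    loads = [sum(task_weight[t] for t in g) for g in groups]
    return max(loads) - min(loads)

def narrow_down(groups, k, score):
    """Merge the best-scoring pair of groups until only k groups remain."""
    groups = [frozenset(g) for g in groups]
    while len(groups) > k:
        best = None
        for a, b in combinations(groups, 2):
            candidate = [g for g in groups if g not in (a, b)] + [a | b]
            p = score(candidate)
            if best is None or p < best[0]:
                best = (p, candidate)
        groups = best[1]
    return groups

# Toy data; reference values C1 = C2 = 1 and weight 0.5 are placeholders
edges = {("a", "b"): 5, ("b", "c"): 1, ("a", "c"): 1}
tw = {"a": 1, "b": 1, "c": 2}
score = lambda g: 0.5 * cut_cost(g, edges) / 1.0 + 0.5 * load_imbalance(g, tw) / 1.0
print(narrow_down([{"a"}, {"b"}, {"c"}], 2, score))
```

Each round evaluates a quadratic number of pairs, which stays cheap because only the few coarse vertices left after steps 1 and 2 are involved.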
4.3. The Multiobjective Contention-Aware List Scheduling Algorithm
After the previously described phases 1 and 2, the partitioned task graph with subdeadlines will be assigned to the cluster heads. At this phase, the local scheduler
located at the cluster head generates an appropriate schedule to meet the subdeadline
of the allocated tasks in each cluster, while managing the cluster energy consumption.
Please note that each cluster head is only responsible for generating the scheduling
scheme of the allocated task graph partition, not the entire DAG. Consequently, the
global scheduling scheme for the DAG is generated in a distributed way. List scheduling
[Zomaya 1995] is a fundamental heuristic skeleton for task scheduling. It encompasses
two important criteria, the priority scheme for the tasks and the choice criterion for
the processor, to develop a scheduling algorithm. The priority scheme is classified into
two types, static and dynamic. Static priority means the schedule order of the tasks
is established before the actual scheduling process, while dynamic priority means the
schedule order of the tasks can be changed in the progress of the scheduling. In our
algorithm, the static priority is employed since we use each task deadline as its priority.
The processor selection is another hindrance when implementing a list scheduling algorithm. The difficulty in this step is because the list scheduling algorithm is required
to assign a task to a processor and attribute a start time to a task simultaneously.
Several heuristic algorithms for selecting a processor [Sinnen 2007; El-Rewini et al.
Input: Partitioned task graph, cluster topology
Fig. 4. The multi-objective list scheduling algorithm.
1994] have been presented. In general, they can be divided into two alternatives: (i) first
processor allocation, then assignment of start time and (ii) first assignment of start
time and then processor allocation. The second alternative is extremely difficult to
implement in a system that contains a finite number of processors, since tasks with
overlapping execution times have to be executed on different processors. Furthermore,
without knowledge of the processor allocation, it is impossible to compute the communication cost. Whenever two adjacent tasks are allocated to the same processor, the
communication cost is equal to 0. Otherwise, the communication cost is equal to the
value labeled on the DAG. Therefore, in our algorithm, the first alternative is adopted.
The pseudocode for our multiobjective contention-aware list scheduling algorithm
is given in Figure 4. Besides the partition of the task graph, it also takes the cluster
topology as an input parameter, which is modeled as an undirected graph represented
by an adjacency matrix.
The priority scheme for the tasks is implemented in our algorithm by sorting the
subdeadline of each task in nondecreasing order. A smaller value of subdeadline for a
task indicates it has higher priority to be processed.
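The static priority scheme amounts to a single sort. The sketch below is illustrative only; the function name `priority_order` and the tie-break on task id are assumptions that do not come from the article.

```python
def priority_order(subdeadlines):
    """Return task ids in nondecreasing order of subdeadline.

    subdeadlines: dict mapping task id -> subdeadline.
    Ties are broken by task id (an assumption made for determinism).
    """
    return sorted(subdeadlines, key=lambda task: (subdeadlines[task], task))


# The task with the smallest subdeadline is scheduled first.
order = priority_order({"n1": 12.0, "n2": 5.0, "n3": 9.0, "n4": 5.0})
```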
Since in our case only the sensor nodes are able to perform computation tasks, the
processor selection is the process of sensor selection. The sensor selection is determined
by the following principle. First, the application deadline is guaranteed to be met by
finishing each task before its subdeadline. In our algorithm, we evaluate the finish
time of every sensor-task combination by applying Eq. (11).
$$t_f(n_i, s_u) = t_s(n_i, s_u) + c(n_i), \qquad (11)$$
where $t_f(n_i, s_u)$ denotes the finish time of task $n_i$ when running on sensor $s_u$, $t_s(n_i, s_u)$
denotes the start time of task $n_i$ on sensor $s_u$, and $c(n_i)$ denotes the time for performing
task $n_i$ on sensor $s_u$. The start time $t_s(n_i, s_u)$ depends on: (i) the maximum finish time
of the parent tasks $n_j$ of task $n_i$, and (ii) the communication time for transferring data from
$n_j$ to $n_i$. If tasks $n_j$ and $n_i$ are allocated to the same sensor, then the communication
time is equal to 0. Otherwise, the communication time is equal to hopcounts $\times$ time per
hop. The hopcounts is derived from the routing path between the locations where $n_j$ and
$n_i$ are allocated. We employ a Breadth-First Search (BFS) mechanism as the routing
algorithm to find the shortest path between two sensors. In the routing process, the
data is transmitted by a store-and-forward strategy, which means the data is stored
at every sensor on the path until the entire message has arrived and is subsequently
forwarded to the destination. The communication time of one hop is computed by
$t_i^{comm} = data(i, j)/bandwidth(i, j)$, as introduced in the application model. Since the
sensors are homogeneous, $c(n_i)$ is the same for all the sensors. The finish time of
a task-sensor combination is thus obtained.
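As an illustration of Eq. (11), the following sketch computes a hop count by BFS over an adjacency matrix and derives the finish time of a task-sensor combination. It is a simplified reading under stated assumptions: all function and parameter names are hypothetical, a single bandwidth value is used for every link, and neither sensor availability nor channel contention is modeled.

```python
from collections import deque


def hop_count(adj, src, dst):
    """Number of hops on a shortest path, found by BFS over an adjacency matrix."""
    if src == dst:
        return 0
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(adj[u]):
            if linked and v not in dist:
                dist[v] = dist[u] + 1
                if v == dst:
                    return dist[v]
                queue.append(v)
    raise ValueError("no route between the two sensors")


def finish_time(task, sensor, parents, placed, adj, exec_time, data, bandwidth):
    """Eq. (11): finish time = start time + execution time.

    placed maps an already scheduled parent task to (sensor, finish time).
    The start time is the latest arrival of any parent's data; with
    store-and-forward, each hop costs data / bandwidth.
    """
    start = 0.0
    for parent in parents.get(task, []):
        parent_sensor, parent_finish = placed[parent]
        hops = hop_count(adj, parent_sensor, sensor)  # 0 if same sensor
        comm = hops * data[(parent, task)] / bandwidth
        start = max(start, parent_finish + comm)
    return start + exec_time[task]


# Three sensors in a line: 0 - 1 - 2.
line = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

Placing a child task on the same sensor as its parent removes the communication term entirely, which is why the estimate differs per candidate sensor even though the execution time does not.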
After performing such evaluation for all the sensors inside the cluster, those sensors
that cannot finish the job in time (meeting the considered subdeadline) are eliminated.
For the remaining sensors, we estimate the energy consumption by taking the factor
of load balance into consideration, computed by
$$E_{total}^{est}(n_i, s_u) = a \times (E_{comp}(n_i, s_u) + E_{comm}(n_i, s_u)), \qquad (12)$$
where $E_{total}^{est}$ is the estimated total energy consumption if task $n_i$ is allocated to sensor $s_u$,
and $a$ is the amplification coefficient used to take both the load balancing and
the energy consumption factors into account in the process of processor selection. To
determine the amplification coefficient, we design a linear (monotonically increasing)
function to compute its value. The function is given next.
$$a = \text{Level Num} \times \text{Value of the level} \qquad (13)$$
In this function, the initial energy of a sensor is divided into N levels and each level
is associated with a value. The level of a sensor is computed by dividing its residual
energy by its initial energy. Once the level is determined, the value of the level is also
known, and the amplification coefficient is obtained. The more levels are defined,
the more importance is given to load balancing. The actual energy consumption of computation
and communication can be computed by Eqs. (1), (3), and (4), respectively. Finally, we
select as the processor the sensor whose sensor-task combination yields the lowest
estimated energy consumption.
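The selection rule can be sketched as follows. The mapping from residual energy to a level is an assumption made purely for illustration: here the level number grows from 1 to N as the battery drains and every level carries the same hypothetical value, so the coefficient rises linearly with energy already spent. The article does not spell out this mapping, and all names below are hypothetical.

```python
def amplification(residual, initial, num_levels=5, level_value=1.0):
    """Hypothetical reading of Eq. (13): level number times value of the level.

    The level number is 1 for a nearly full battery and num_levels for a
    nearly empty one (an assumption, not taken from the article).
    """
    used_fraction = 1.0 - residual / initial
    level = min(num_levels, int(used_fraction * num_levels) + 1)
    return level * level_value


def select_sensor(candidates, subdeadline, initial_energy, num_levels=5):
    """Pick the sensor with the lowest amplified energy estimate, Eq. (12).

    candidates: list of (sensor_id, finish_time, e_comp, e_comm, residual).
    Sensors that cannot finish by the subdeadline are eliminated first.
    Returns None when no sensor can meet the subdeadline.
    """
    best_id, best_cost = None, float("inf")
    for sensor_id, finish, e_comp, e_comm, residual in candidates:
        if finish > subdeadline:
            continue
        a = amplification(residual, initial_energy, num_levels)
        cost = a * (e_comp + e_comm)
        if cost < best_cost:
            best_id, best_cost = sensor_id, cost
    return best_id
```

With this reading, a nearly drained sensor can lose to a fresher one even when its raw energy cost for the task is lower, which is the load-balancing effect the coefficient is meant to introduce.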
4.4. Computational Complexity and Comparative Analysis
For the sake of analysis, we assume that the DAG has n tasks (including m entry tasks)
with e edges, and the sensor network has c clusters, where each cluster contains s
sensors forming a k-hop topology. By using these assumptions, we can derive that the
average degree of computation tasks is e/n.
In the deadline distribution algorithm (Figure 2), the While iteration at line 2 executes O(log n) steps. Finding a critical path (line 3) takes O(n + e) steps. Assigning the deadline
and ready time to the nodes related to critical path tasks takes O(e/n) steps. Thus,
phase 1 needs O(log n ((n + e) + e/n)) steps.
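The O(n + e) bound for finding a critical path corresponds to a single pass over the DAG in topological order. The sketch below is a generic longest-path computation, not the procedure of Figure 2; it assumes that both task execution times and edge communication times count toward the path length, and all names are hypothetical.

```python
from collections import deque


def critical_path_length(exec_time, edges):
    """Length of the longest path in a DAG, visiting each task and edge once.

    exec_time: dict task -> execution time.
    edges: dict (parent, child) -> communication time.
    """
    successors = {task: [] for task in exec_time}
    indegree = {task: 0 for task in exec_time}
    for (u, v), weight in edges.items():
        successors[u].append((v, weight))
        indegree[v] += 1

    # longest[t] = length of the longest path ending at t, including t itself.
    longest = dict(exec_time)
    queue = deque(task for task in exec_time if indegree[task] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, weight in successors[u]:
            longest[v] = max(longest[v], longest[u] + weight + exec_time[v])
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != len(exec_time):
        raise ValueError("task graph contains a cycle")
    return max(longest.values())
```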
In the energy-load tunable graph partitioning algorithm (Figure 3), the While iteration in line 4 executes in O(log n) steps. Selecting the heaviest edge takes O(me/n)
steps. Thus, step 1 in the energy-load tunable graph partitioning algorithm uses a total of O(log n (me/n)) steps. The While iteration in line 17 executes in O(log e) steps, and
finding the best trade-off takes O(logn-m) steps. The last step takes log e O(m - c) steps.
Moreover, the While iteration in line 2 runs in O(log(n - c)) steps. Thus, phase 2 takes
O(log(n - c) (log e (me/n + O(m - c)))) steps.
In the multi-objective list scheduling algorithm, we assume that $n_i$ tasks and $e_i$ edges
are allocated to the cluster. The time complexity of sorting the earliest finish time is
O($n_i \log n_i$) since not all the tasks are allocated to the same cluster. Determining the
most appropriate node, that is, the one that offers the minimal energy consumption for a task
while meeting its deadline, takes O($k s e_i/n_i$). Thus, the time complexity of phase 3
is O($c(n_i \log n_i + k s e_i/n_i)$).
Taking all three phases into account, the overall complexity is O(log n ((n + e) + e/n) +
log(n - c) (log e (me/n + O(m - c))) + $c(n_i \log n_i + k s e_i/n_i)$).
To give an intuitive sense of the performance of the TPTS algorithm, in this section we
compare the time complexity of our proposal with that of other existing solutions. To the best
of our knowledge, there is no existing work addressing the same issue presented in
this article. Two selected algorithms, MTMS [Yuan and Eylem 2007] and BEATA [Xie
and Qin 2008], designed for generating a scheduling scheme for nonhierarchical WSNs,
are used for comparison purposes. The MTMS (Multihop Task Mapping and Scheduling)
algorithm provides the in-network computation capacity required by arbitrary real-time applications in multihop nonhierarchical WSNs, and also aims to guarantee
application deadlines with minimum energy consumption. MTMS is a multiple-phase
algorithm, including a hyper-DAG extension, a communication scheduling algorithm
with the TSSE (TSSE stands for the modified Min-Min algorithm), and a DVS algorithm.
The overall complexity of MTMS is $O(ensk + n + ns^2)$, where the TSSE contributes $O(ensk)$
and the DVS algorithm contributes $O(n + ns^2)$ to the overall complexity of the algorithm,
respectively. BEATA (Balanced Energy-Aware Task Algorithm) aims at minimizing
the energy consumption of single-hop nonhierarchical heterogeneous WSNs while
confining schedule lengths through task allocations. BEATA is a one-phase approach
without considering the communication interference among sensors. The overall time
complexity of BEATA is O(ns log s +nkq)+ O(n+e), where k is a user-defined parameter
ranging from 1 to , and q is the maximum in-degree (the number of edges directed
into a vertex) of a task graph which also ranges from 1 to . Since the objectives
each algorithm tends to achieve are different, no comparison is absolutely fair,
and such a comparison could be meaningless. However, if we consider the special case
in which the entire DAG is allocated to a single sensor cluster, the performance
comparison among the three algorithms is more realistic. Under this circumstance, no task
partitioning needs to be executed in the TPTS algorithm, and thus the energy-load tunable
graph partitioning algorithm contributes nothing to the overall complexity of the TPTS
algorithm. Moreover, since all the tasks are allocated to the same cluster, $c = 1$, $n_i = n$,
and $e_i = e$. Hence, the overall time complexity of TPTS becomes O(log n ((n + e) + e/n) +
(n log n + kse/n)). These three algorithms have similar time complexity when all the
tasks are allocated to the same sensor cluster. However, the TPTS algorithm incurs
additional time complexity when the tasks must be allocated to multiple
clusters, in which case its time complexity is worse than that of the two selected
algorithms.
5. SIMULATION RESULTS
In this section, the performance of the proposed three-phase task scheduling algorithm
is evaluated in a simulated environment. A discrete-event simulator was written in
the Java programming language and all the simulations were run on a 2 GHz CPU machine. In
the simulations, we assess the system performance according to four metrics: schedule
length, system lifetime, energy consumption, and load balancing. Since no similar
works (trading off three objectives in a multicluster system) have been presented
in the WSN field so far, we adopted some benchmarks to quantitatively evaluate the
performance of our proposed algorithm. The schedule length (also known as makespan)
is defined as the latest task completion time in the task graph. Two benchmarks are
introduced for comparison: the length of the critical path is used as the lower bound and the
value hops $\times$ length of the critical path is used as the upper bound. In our article, the system
lifetime refers to the time at which the first sensor is drained of its energy [Dietrich and Dressler
2009]. In order to facilitate the analysis of the results, we present the system lifetime as
how many rounds of a DAG have been performed on the system. A bigger number means
a longer system lifetime, and it also indicates that the death of the first sensor
occurs later. In TPTS, the residual energy of each cluster is used as the input parameter
to partition the DAG in phase 2 when an application is to be executed again. Thus,
the task graph partitioning algorithm is self-adaptive to the energy change among
clusters and generates different task partitions according to such change. We call this
behavior of the algorithm the dynamic case. If the task graph partitioning algorithm
uses a preset ratio to partition the DAG at each round, the energy change among
clusters is ignored and the task partitions remain the same. We call this behavior
the static case. The static case is used as a benchmark to evaluate the performance
of the dynamic case, more precisely, in terms of the system lifetime. The
energy consumption includes the computation and communication energy expenditure
of all sensors in a simulation round. To demonstrate how well our solution performs
in terms of energy savings, we compare TPTS with the hypothetical case of minimum
energy consumption, described as follows. When a DAG runs on our system, the energy
consumption for executing 1 unit of computation tasks is the same. The possible way to
minimize energy consumption is thus to reduce communication energy. An extreme situation
is that no intra-cluster communication occurs during the system lifetime. Thus, we set up a system
in which each cluster contains only one sensor to simulate this extreme situation,
and we use it as the benchmark for the energy consumption metric.
The last performance metric, load balance, describes whether the clusters
consume energy evenly. It is measured by the average residual energy of a
cluster compared to the average residual energy of the entire system. If no cluster
has energy significantly different from that of the others, load
balance among clusters is achieved, and the risk of shortening the system lifetime by
overusing a cluster is minimized.
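Two of these metrics are simple enough to sketch directly. The code below is an illustration under assumptions: the function names are hypothetical, load balance is expressed as the ratio of a cluster's average residual energy to the system-wide average, and the lifetime helper assumes a constant per-round cost for each sensor, which only holds when the task partitions do not change between rounds.

```python
def load_balance(residual_by_cluster):
    """Ratio of each cluster's average residual energy to the system average.

    residual_by_cluster: dict cluster id -> list of sensor residual energies.
    A ratio close to 1.0 for every cluster indicates an even load.
    """
    all_values = [e for sensors in residual_by_cluster.values() for e in sensors]
    system_avg = sum(all_values) / len(all_values)
    return {
        cluster: (sum(sensors) / len(sensors)) / system_avg
        for cluster, sensors in residual_by_cluster.items()
    }


def lifetime_in_rounds(initial_energy, cost_per_round):
    """Rounds completed before the first sensor is drained.

    Assumes every sensor spends the same energy in each round.
    Sensors with zero cost never drain and are ignored.
    """
    return min(
        int(initial_energy[s] // cost_per_round[s])
        for s in initial_energy
        if cost_per_round[s] > 0
    )
```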
To further evaluate the performance of the proposed algorithm, simulations are run
on arbitrary synthetic applications with randomly generated DAGs. We also evaluated
the impact of the following aspects on the system performance:
- effect of task graph,
- effect of cluster size,
- effect of algorithm tunable parameters.
5.1. Simulation Setup
Before introducing our experimental results, we present the parameters used in our
simulator. The parameters of sensors in the simulated WSN are chosen to resemble
real-world sensors like the Mica family [Hill and Culler 2002] endowed with an
Intel StrongARM 1000 processor. According to Shih et al. [2001] and Wang and
Chandrakasan [2002], the energy-related parameters in Eqs. (1) to (4) are set as
follows: $E_{elec}$ = 50 nJ/b, $\varepsilon_{amp}$ = 10 pJ/b/m$^2$, $V_T$ = 26 mV, $C$ = 0.67 nF, $I_0$ = 1.196 mA, $n$ =
21.26, $K$ = 239.28 MHz/V, $V_{dd}$ = 1 V, and $c$ = 0.5 V. The initial energy for each sensor is
0.0001 J. The cluster head is assumed to be placed in the middle of the cluster. The hopcount
is considered as the distance from a sensor to its cluster head. The schedule-based
MAC-level protocol (TDMA) is used as the default case for the simulations.
All synthetic DAGs used in the following are generated by TGFF [Dick et al. 1998],
which is a well-known pseudorandom task-graph generator widely used in scheduling
and allocation research. Unless specifically stated, the average workloads of computation and communication tasks are both set to 10 with 50% variation.
In the following, all the presented data are averaged over at least 300 runs; consequently, a 95% confidence interval with a 5% (or better) precision is achieved.
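One common way to check such a precision target is to compute the half-width of the confidence interval relative to the sample mean. The sketch below uses the normal approximation, which is reasonable for 300 runs; this is an assumption about the procedure, since the article does not state how the interval was computed, and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def relative_precision(samples, confidence=0.95):
    """Half-width of the confidence interval divided by the sample mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return half_width / abs(mean(samples))


# Synthetic placeholder data (not simulation results): 300 values near 100.
samples = [100 + ((i * 37) % 21 - 10) for i in range(300)]
meets_target = relative_precision(samples) <= 0.05
```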
5.2. Effect of Task Graph
To evaluate the effect of task graphs, simulations are run on a sensor network with
2 clusters. Each cluster contains 5 sensors forming a multihop network. The
task graphs are generated based on three common parameters: the number of entry
tasks, the in-degree of a task, and the out-degree of a task. The minimum number of
entry tasks is set to the number of clusters. The in-degree of a task and the out-degree
of a task are both bounded to 3. During simulation, the entry tasks are assigned to
the sensors which have the highest residual energy in each round. Two parameters
are used to examine the impact of different task graphs: the total number of tasks
and Communication-to-Computation Ratio (CCR). Literally, the total number of tasks
denotes how many tasks a DAG is composed of. CCR is a factor indicating the ratio of
average execution time of the communication tasks to that of the computation tasks.
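The CCR of a task graph follows directly from this definition. The sketch below is illustrative; the function names are hypothetical, and the rescaling helper is one simple way to reach a target CCR while keeping the computation workload fixed, not necessarily how the simulator does it.

```python
def ccr(comp_times, comm_times):
    """Average communication time divided by average computation time."""
    avg_comm = sum(comm_times) / len(comm_times)
    avg_comp = sum(comp_times) / len(comp_times)
    return avg_comm / avg_comp


def scale_comm_to_ccr(comp_times, comm_times, target):
    """Rescale communication times so that the DAG reaches the target CCR."""
    factor = target / ccr(comp_times, comm_times)
    return [t * factor for t in comm_times]
```

For example, computation times of 8, 10, and 12 with communication times of 5 and 15 give average values of 10 and 10, so the CCR is 1.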
5.2.1. Effect of the Number of Tasks. To investigate the effect of the number of tasks in
task graphs, different sizes of DAGs are randomly generated. Unlike parallel computing, the scale of task graph is not clearly defined in WSNs. In our simulation, four sets
of DAGs with 10, 40, 70, and 100 tasks represent small to large scale. Similar definitions are also adopted in Yu and Prasanna [2005] and Yuan and Eylem [2007]. For
fair comparison, these DAGs are run on the same randomly created 2-hop clusters.
Although the clusters share the same hopcounts, they have different internal topologies.
In the simulations, the CCR is set to 1.
As shown in Figure 5(a), the application makespan is proportional to the number
of tasks. The actual schedule length is close to the length of the critical path when
the number of tasks is small. Due to the limit on the number of sensors in each
cluster, the application makespan increases significantly after the number of tasks
reaches 40. We can observe from the same figure that, for the lower bound, the length
of the critical path barely increases with the number of tasks. This reveals that the
additional tasks are mostly located on noncritical paths. With limited sensors,
these tasks cannot be processed immediately and the delay is cumulative. Thus, the
makespan curve moves toward the upper bound. The system energy consumption
per round is also affected by the increase in the number of tasks. The more tasks
a DAG contains, the more computation and communication energy the system
consumes. As a result, the system lifetime is inversely proportional to the number of
tasks. Compared with the benchmark, the dynamic case has a longer system lifetime
because it proactively adjusts to the system changes. Figure 5(d) shows no apparent
load imbalance between sensor clusters except when the number of tasks is 10. Within
the system lifetime, there is on average a 10% load imbalance (difference in the energy
consumption) between the two clusters. At the small scale, the DAG offers
very few options for the partitioning algorithm to generate new task partitions.
Consequently, load imbalance emerges. Overall, the simulation results reveal that
TPTS can automatically manage the impact of an increasing number of
tasks. In addition, no cluster is overused in a way that shortens the system lifetime.
5.2.2. Effect of CCR. CCR indicates the importance of a communication task in a DAG,
which strongly affects the scheduling behavior, especially for the multi-hop scenarios.
We studied the impact of CCR on the algorithm performance by varying the average
value of communication tasks compared to a fixed average computation workload.
The total number of tasks is set to 30 and the average computation load is set to 10.
Simulations are run on the same sensor network configuration used in the previous
section.
According to the simulation result in Figure 6(a), the application makespan is also
proportional to CCR, but its effect is not as significant as the number of tasks. The
Fig. 5. Effect of the number of tasks: (a) makespan; (b) system lifetime; (c) energy consumption; (d) intercluster load balance.
makespan curve is close to the lower bound when CCR changes. On the one hand,
the total number of tasks does not change, so no extra computation tasks need
to be processed. TPTS can fully utilize the available sensors and process the computation
tasks simultaneously. On the other hand, the increase in communication cost leads
TPTS to allocate tasks to the same sensor to reduce the intra-cluster communication energy. However, the increase in CCR dramatically affects the
system lifetime since the inter-cluster communication cost becomes more and more
expensive. Even with the same amount of communication, the increase in average communication cost increases the communication workload. Intuitively, this result
demonstrates that communication tasks consume more energy than computation tasks in WSNs, which is well known. This result is also shown in Figure 6(c),
where the increase in CCR causes more system energy consumption in each round.
No matter how CCR is varied, TPTS balances the load between clusters at all times.
5.3. Effect of Cluster Size
In this section, the performance impact of cluster size is investigated. The cluster size
has two aspects: one concerns changes inside a cluster (for instance, regarding
the topology, considering the dynamic nature of WSNs) and the other concerns changes in the
number of clusters. We investigate how TPTS performs when these conditions
change.
Fig. 6. Effect of CCR: (a) makespan; (b) system lifetime; (c) energy consumption; (d) inter-cluster load
balance.
The impact of change inside a cluster focuses mainly on changes of the topology
and the number of sensors. The simulations use a randomly generated
30-task DAG scheduled on 1-hop, 2-hop, and 3-hop random clusters, respectively. The
number of sensors in each cluster is calculated as 4 $\times$ hops. In the simulations, the CCR
is set to 1.
In Figure 7(a), the makespan slightly increases when the hop count increases. Although more sensors provide the possibility of parallel task execution, the increased
hops induce extra intra-cluster communication time when sensors need to exchange
data with others. Consequently, the length of inter-cluster communication is also affected since the communication between a sensor and the cluster head is intra-cluster communication as well. Another reason for the increased makespan is that the increased
task parallelism could generate potential communication interference. For these reasons, the system energy consumption also increases. Moreover, the increase in the number
of sensors does not extend the system lifetime on a corresponding scale. This
indicates that the system lifetime is affected not only by the number of sensors, but also by the cluster topology. However, TPTS traces the system change and
dynamically partitions the DAG according to the residual cluster energy, which significantly prolongs the system lifetime as the hopcount increases compared
to the static case. Our proposed algorithm balances the load between clusters properly
regardless of the variation of the size of the sensor cluster.
Fig. 7. Effect of cluster size, change inside a cluster: (a) makespan; (b) system lifetime; (c) energy consumption; (d) inter-cluster load balance.
The purpose of varying the number of clusters is to investigate how TPTS
performs in a multicluster system. Randomly generated 50-task DAGs are
scheduled on multicluster WSNs (from 2 to 5 clusters). The number of sensors in each
cluster is 5. In the simulations, the CCR is set to 1.
From Figure 8(a) and Figure 8(c), we can observe that TPTS is not significantly
affected by the change in the number of clusters. The performance is still close to
the benchmarks. The system lifetime is prolonged in proportion to the change in the
number of clusters. Compared to the static case, the system lifetime of the dynamic
case is on average 50% longer. No apparent load imbalance is shown as the number of
clusters is changed.
5.4. Effect of the Algorithm's Tunable Parameters
In this section, we investigate how the system performance is affected by the tunable
parameters designed in our algorithm, including the application deadline in phase 1, and
the tolerance value and user-defined parameter in phase 2 for the graph partition. The
application deadline is used not only to determine the subdeadline for each task in
phase 1, but also to decide the task allocation in phase 3. The tolerance value
controls how much load imbalance of the local heaviest-edge-first merge
method can be tolerated in the merging process. The user-defined parameter is
used to tune the relevance of an objective (energy or load) in the energy-load tunable
Fig. 8. Effect of the cluster size, change the number of clusters: (a) makespan; (b) system lifetime; (c) energy
consumption; (d) inter-cluster load balance.
graph partitioning algorithm. In the following simulations, all parameters are varied
from 0.1 to 0.9 to evaluate their effect on the system performance.
5.4.1. Effect of Application Deadline. We investigate the performance impact of different
application deadlines by using a randomly generated 30-task DAG and a randomly
generated WSN with 2 clusters, each cluster having 2 hops (between each sensor and
the CH). The number of sensors in each cluster is 5. In the simulations, the CCR is set
to 1. In this section, we only collect the data of makespan and energy consumption.
As shown in Figure 9(a), the makespan of TPTS keeps increasing up to the scenario in
which the deadline is set to 4 times the critical path. For the given DAG, the minimum
schedule length TPTS can reach is 1.33 times the length of the critical path.
The worst case is around 1.8 times. However, the energy consumption shows the opposite
trend to the schedule length: the shorter the schedule length, the more energy TPTS consumes.
As mentioned earlier, the extra energy is mainly consumed by exchanging data between
neighboring sensors to enable parallel processing. When the deadline is set to no less
than 4 times the length of the critical path, the energy consumption of the system is
constant. Within the clusters, the computation tasks tend to be allocated to the same
sensor because the residual time is large enough for serial processing, unless
that sensor has been overused.
Fig. 9. Effect of application deadlines: (a) makespan; (b) energy consumption.
5.4.2. Effect of Tolerance Value. In this section, the randomly generated task graph is
used with the same parameters as in the previous section. The same applies to the
WSN setting.
As shown in Figure 10(a), the makespan changes slightly before the tolerance
value reaches 0.5. After this point, the makespan is constant for the tested DAGs.
This behavior occurs because, once the tolerance value is set large enough, the DAGs
are mainly partitioned by the local heaviest-edge-first algorithm. Although
the clusters consume different amounts of energy in a round, TPTS traces the
energy difference and allocates the reverse part of the task partition to the clusters
in the next round. In other words, task partitions are distributed to the clusters in
a round-robin way. The system lifetime and the energy consumption show a similar
trend in the figures. As in previous simulations, TPTS maintains good load balance
between clusters.
5.4.3. Effect of User-defined Parameter. To further evaluate our proposed algorithm,
we turn off the function of local heaviest-edge-first merge in phase 2, and only use
the user-defined parameter to manage the graph partitioning. The simulations use the
same conditions as described for the previous two sections to randomly generate the
task graph and system settings. In the following figures, the user-defined parameter
represents the importance a given user/application attributes to energy consumption
in the WSN. More precisely, it guides the strategy adopted by the algorithm regarding
the effort in reducing the inter-cluster communication cost.
From Figure 11(a), we observe that the variation of the user-defined parameter only
slightly affects the makespan. Throughout the previous analysis, we have seen that the
makespan is mainly affected by the sensor capacity, as well as the application deadline
setting. Although the energy consumption is not changed significantly, the dynamic
case shows better performance than the static one, especially when the user-defined
parameter becomes larger. The larger the user-defined parameter is in the algorithm,
the less inter-cluster communication cost will be generated. This will potentially result
in an imbalance among the task partitions allocated to the clusters. Figure 11(b) demonstrates
that TPTS can nicely handle this situation. Load imbalance occurs after the user-defined parameter reaches 0.6. The algorithm opts for generating low inter-cluster
communication cost task partitions, while the load balance between clusters is only
managed in a round-robin way.
Fig. 10. Effect of varying the tolerance value: (a) makespan; (b) system lifetime; (c) energy consumption;
(d) inter-cluster load balance.
5.4.4. Performance Comparison with MTMS. In order to further study the performance
of TPTS, we compare our solution with a previously proposed algorithm known as
MTMS [Yuan and Eylem 2007]. Since MTMS is designed for multihop nonhierarchical
WSNs whereas TPTS is designed for hierarchical WSNs, and the two have different objectives, we
restrict our simulations to ones running on a multihop nonhierarchical WSN to obtain
a relatively fair performance comparison. The presented results are the average of
500 simulation runs of random DAGs with the total number of tasks set to 30, the
average computation load set to 10, and CCR set to 1. The performance metrics we
evaluate are schedule length and application energy consumption.
As shown in Figure 12(a), TPTS outperforms MTMS, with shorter makespan in the
simulations. The superior performance of TPTS comes mainly from the fact that the
sensor node which provides the best result for Eq. (12) is selected as the processor for
the specific task, as well as the fact that no DVS technique is employed to reduce the
energy consumption of computation tasks by fully utilizing the time slack within the
deadline of the application. However, the employment of the DVS technique produces
the major contribution to the energy consumption when MTMS and TPTS are running
under the same simulated scenario. As shown in Figure 12(b), TPTS and MTMS have
similar energy consumption when the deadline is set to a small value. This occurs
since there is not much time slack for DVS to reduce the energy consumption when
the computation tasks are executed. When the deadline value is increased, more time
slack is available for DVS to further reduce the energy consumption of computation
Fig. 11. Effect of varying the user-defined parameter: (a) makespan; (b) system lifetime; (c) energy consumption;
(d) inter-cluster load balance.
Fig. 12. Performance comparison with MTMS: (a) makespan; (b) energy consumption.
tasks before the application deadline. However, for the energy consumption metric,
TPTS does not have a significant difference in performance compared to MTMS. This
situation is caused by the fact that communication tasks are the major energy consumer
in the WSN application.
5.4.5. Performance Comparison on Different Types of MAC Protocols. As we mentioned earlier, the schedule-based MAC-level protocol is used as the default case for supporting
and cooperating with TPTS to eliminate the possible energy consumption of a radio in
its idle listening mode. However, if a contention-based protocol (e.g., CSMA) is adopted,
there will be no designed mechanism to utilize the communication schedule generated
from our algorithm, thus the sensor nodes must be prepared to handle an incoming
message at any moment. Without further assistance from other nodes (e.g., sender,
cluster head, or sink node), the receiver has no option other than to stay in idle listening mode to monitor the channel continuously and check for any task assignment. In
this circumstance, the energy spent in the idle listening mode should be included in
the total energy consumption for performing communication tasks.
Although many approaches have been proposed to reduce the inherent idle listening
in sensor nodes by enforcing the radio periodically switching on for a short time in
each duty cycle, it is still very difficult to derive a generalized formula to indicate how
much time the sensor nodes will stay in the idle listening mode. Instead of adopting
any particular existing contention-based MAC protocol (e.g., S-MAC [Ye et al. 2002],
T-MAC [Dam and Langendoen 2003]), we assume that the radio for all the sensor nodes
is either in communicating (including sending and receiving) mode or idle listening
mode during their lifetime, since the main purpose of including the idle listening mode
in our energy model is not to evaluate the performance of our proposed algorithm
when run on top of a specific MAC protocol. We adopted the equation used in Li and
Lazarou [2004] to extend our communication-related energy consumption. We express
the energy consumed by the radio during each idle listening period, $E_I(l)$, as a fixed
ratio of the energy consumed in receiving mode, $E_{rx}(l)$; this ratio ranges from 50% to 100%. We use 95% in
our model based on the measurement from Stemm and Katz [1997]. According to the
aforementioned changes, the total energy consumption of sensor nodes is changed to
$$E^{P}_{total} = \sum_{i=1}^{n} E^{P}_{comp}(T_i) + \sum_{i=1}^{n} E^{P}_{comm}(T_i) + \sum_{j=1}^{m} 0.95\, E_I(l), \qquad (14)$$
where j indexes the sensor nodes that do not need to communicate with other nodes
and thus remain in idle listening mode, consuming a constant amount of energy
per time unit during the application execution period. We consider this the worst-case
scenario for TPTS and compare it with the best-case scenario (TPTS running on
top of a schedule-based MAC-level protocol) to reveal the performance difference
between them. The presented results are the average of 500 simulation runs of random
DAGs with the total number of tasks set to 30, the average computation load set to 10,
and CCR set to 1. The performance metrics we evaluate are system lifetime and the
average energy consumption for performing one application.
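The extended energy model of Equation (14) can be illustrated with a minimal Python sketch. It is not the simulator used in the experiments. It assumes that each idle node contributes 0.95 times the receive energy E_rx(l) = E_elec l, and that `idle_bits` stands in for l in the idle term. The function names, the value of `e_elec`, and all input figures are hypothetical and chosen only for illustration.

```python
IDLE_TO_RX_RATIO = 0.95  # ratio adopted in the text, after Stemm and Katz [1997]


def rx_energy(bits, e_elec):
    """Receive energy E_rx(l) = E_elec * l, in joules."""
    return e_elec * bits


def total_energy(comp, comm, idle_nodes, idle_bits, e_elec,
                 ratio=IDLE_TO_RX_RATIO):
    """Sum of computation, communication, and idle-listening energy (joules).

    Assumption: every idle node contributes ratio * E_rx(idle_bits).
    """
    idle_term = idle_nodes * ratio * rx_energy(idle_bits, e_elec)
    return sum(comp) + sum(comm) + idle_term


# Hypothetical per-task energies (joules) for a two-task application.
comp = [1e-3, 2e-3]
comm = [3e-3, 4e-3]

# Worst case: two nodes stay in idle listening (contention-based MAC).
worst = total_energy(comp, comm, idle_nodes=2, idle_bits=10_000, e_elec=50e-9)
# Best case: no idle listening (schedule-based MAC).
best = total_energy(comp, comm, idle_nodes=0, idle_bits=10_000, e_elec=50e-9)

print(f"contention-based: {worst:.5f} J, schedule-based: {best:.5f} J")
```

Setting `idle_nodes` to zero recovers the original two-term model, so the same function covers both the best-case and the worst-case scenario.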
As shown in Figure 13(a) and Figure 13(b), TPTS performs better on the schedule-based
MAC protocol than on the contention-based MAC
protocol, since a significant amount of energy (up to 70%) is wasted by the radio being
permanently in idle listening mode. However, this energy waste can be significantly
reduced by adopting an energy-efficient contention-based MAC-level protocol: the less
idle listening time a contention-based MAC-level protocol requires, the better
TPTS should perform. It is important to emphasize that our algorithm is a
higher-level algorithm, and it is agnostic to the underlying MAC-level protocol used
in the network. Naturally, the WSN design and the choice of the low-level protocols
impact the overall performance of the system. Therefore, in order to achieve the best
energy efficiency, all protocols of the network stack should be optimized for energy, in
a holistic way.
Fig. 13. Performance comparison on different types of MAC protocols: (a) system lifetime; (b) energy
consumption.
6. CONCLUSIONS AND FUTURE DIRECTIONS
In this article, we addressed the problem of scheduling a periodically running application on a multicluster system, where each cluster has a multihop topology composed
of a number of homogeneous sensor nodes. The design objective of task distribution is
to map and schedule the tasks of an application considering energy consumption and
load balance to extend the system lifetime, while completing the application before its
deadline. A three-phase polynomial-time heuristic task scheduling algorithm,
called TPTS, is developed to accomplish this goal. TPTS provides several parameters that
allow users to tune the performance of the algorithm based on different requirements
and application scenarios. Furthermore, TPTS employs a feedback control mechanism
to guarantee that no cluster is overused within the system lifetime. The
experimental results show that the time and energy performance of TPTS is close to
that of the benchmarks in most cases, while load balance is satisfied.
There are several interesting future directions for this work. First, we will extend
our algorithm to a scenario where multiple applications arrive at the same time
and are performed simultaneously in the WSN, with potentially conflicting
requirements. We will also consider WSNs composed of heterogeneous sensors
with differing capacities. With tasks from different applications, we need to adjust
the scheduling scheme to guarantee that all the applications can be finished within their
deadlines. In our current version, the clusters and sensors are assumed homogeneous
for all the tasks. Introducing heterogeneity requires us to reconsider how to find
the best trade-off among time, energy, and load balance. Second, energy-saving
techniques can be employed to study how to extend system lifetime. DVS is one of the
commonly used energy-saving techniques in scheduling research. However, in
a WSN, a communication task often consumes more energy than a computation
task. Moreover, our scheduling scheme runs on multiple clusters, so the system energy
will be mainly consumed by the communication tasks. DVS may therefore not be the best
option for this case. We are particularly interested in how to combine communication
energy-saving techniques, such as modulation scaling [Schurgers et al. 2001], topology
control [Wang 2008], and duty cycling [Vigorito et al. 2007; Injong et al. 2008], with
our scheme to save energy and satisfy other requirements. Third, different system
lifetime definitions can be employed to study how a WSN system behaves. Currently,
we use the strictest definition, in which the system dies when the first sensor dies.
However, in real systems, sensors are massively deployed to provide redundancy.
Normally, these systems can still be functional after several sensors have depleted
their energy. Moreover, some sensors could suffer hardware failure without running out
of energy. Thus, we need to adjust the algorithm to adapt to topology changes, while
producing an appropriate trade-off among time, energy, and load balancing.
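The difference between lifetime definitions can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not part of TPTS: every sensor spends a fixed amount of energy per application round, the schedule does not change after a sensor dies, and energy is expressed in integer units. The function name and all figures are hypothetical.

```python
def rounds_until_kth_death(initial_energy, per_round_cost, k=1):
    """Application rounds completed before the k-th sensor is depleted.

    k = 1 corresponds to the strictest definition (the system dies when
    the first sensor dies); larger k tolerates k - 1 depleted sensors.
    """
    if len(initial_energy) != len(per_round_cost):
        raise ValueError("one cost per sensor is required")
    if not 1 <= k <= len(initial_energy):
        raise ValueError("k must be between 1 and the number of sensors")
    if any(cost <= 0 for cost in per_round_cost):
        raise ValueError("per-round costs must be positive")
    # Full rounds each sensor can sustain on its own, in ascending order.
    rounds = sorted(e // c for e, c in zip(initial_energy, per_round_cost))
    return rounds[k - 1]


# Hypothetical cluster of four sensors with equal batteries (integer units)
# and unequal per-round workloads.
energy = [1000, 1000, 1000, 1000]
cost = [10, 25, 20, 40]

strict = rounds_until_kth_death(energy, cost, k=1)
relaxed = rounds_until_kth_death(energy, cost, k=3)
print(strict, relaxed)
```

With unequal workloads the relaxed definition reports a longer lifetime than the strict one, which is why the chosen definition affects how much weight load balancing deserves.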
REFERENCES
ABDELZAHER, T. F. AND SHIN, K. G. 1995. Optimal combined task and message scheduling in distributed real-time systems. In Proceedings of the 16th IEEE Real-Time Systems Symposium. IEEE Computer Society.
AKYILDIZ, I. F., SU, W., SANKARASUBRAMANIAM, Y., AND CAYIRCI, E. 2002. Wireless sensor networks: A survey.
Comput. Netw. 38, 393–422.
AL-OBAIDY, M. AND AYESH, A. 2008. Optimizing autonomous mobile sensors network using pso algorithms. In
Proceedings of the International Conference on Computer Engineering and Systems (ICCES'08). 199–203.
ALSALIH, W., AKL, S., AND HASSANEIN, H. 2005. Energy-aware task scheduling: Towards enabling mobile
computing over manets. In Proceedings of the 19th IEEE International Parallel and Distributed
Processing Symposium (IPDPS'05). Vol. 13, IEEE Computer Society.
BAKSHI, A. AND PRASANNA, V. K. 2004. Algorithm design and synthesis for wireless sensor networks. In
Proceedings of the International Conference on Parallel Processing (ICPP'04). Vol. 421, 423–430.
BAOKANG, Z., MENG, W., ZILI, S., JIANNONG, C., CHAN, K. C. C., AND JINSHU, S. 2008. Topology aware task
allocation and scheduling for real-time data fusion applications in networked embedded sensor systems.
In Proceedings of the 14th IEEE International Conference on Embedded and Real-Time Computing
Systems and Applications (RTCSA'08). 293–302.
BERMAN, P., CALINESCU, G., SHAH, C., AND ZELIKOVSKY, A. 2005. Efficient energy management in sensor
networks. In Ad Hoc and Sensor Networks.
BISWAS, S., GUPTA, S., YU, F., AND WU, T. 2010. A networked mobile sensor test-bed for collaborative
multi-target tracking applications. Wirel. Netw. 16, 1329–1344.
BLOUGH, D. M., CANALI, C., RESTA, G., AND SANTI, P. 2009. On the impact of far-away interference on
evaluations of wireless multihop networks. In Proceedings of the 12th ACM International Conference on
Modeling, Analysis and Simulation of Wireless and Mobile Systems. ACM Press, New York, 90–95.
CARDIERI, P. 2010. Modeling interference in wireless ad hoc networks. IEEE Comm. Surv. Tutorials 12,
551–572.
CHAFEKAR, D., KUMAR, V. S. A., MARATHE, M. V., PARTHASARATHY, S., AND SRINIVASAN, A. 2007. Cross-layer latency
minimization in wireless networks with sinr constraints. In Proceedings of the 8th ACM International
Symposium on Mobile Ad Hoc Networking and Computing. ACM Press, New York, 110–119.
COFFMAN JR., E. G., GAREY, M. R., AND JOHNSON, D. S. 1997. Approximation algorithms for bin packing: A
survey. In Approximation Algorithms for NP-Hard Problems. PWS Publishing, 46–93.
DAM, T. V. AND LANGENDOEN, K. 2003. An adaptive energy-efficient mac protocol for wireless sensor networks.
In Proceedings of the 1st International Conference on Embedded Networked Sensor Systems. ACM Press,
New York, 171–180.
DICK, R. P., RHODES, D. L., AND WOLF, W. 1998. TGFF: Task graphs for free. In Proceedings of the
6th International Workshop on Hardware/Software Codesign. IEEE Computer Society, 97–101.
DIETRICH, I. AND DRESSLER, F. 2009. On the lifetime of wireless sensor networks. ACM Trans. Sensor Netw. 5,
1, 1–39.
EL-REWINI, H., LEWIS, T. G., AND ALI, H. H. 1994. Task Scheduling in Parallel and Distributed Systems.
Prentice-Hall.
ELSON, J. AND RÖMER, K. 2003. Wireless sensor networks: A new regime for time synchronization. SIGCOMM
Comput. Comm. Rev. 33, 149–154.
GAREY, M. R., JOHNSON, D. S., AND STOCKMEYER, L. 1976. Some simplified np-complete graph problems. Theor.
Comput. Sci. 1, 237–267.
GOUSSEVSKAIA, O., OSWALD, Y. A., AND WATTENHOFER, R. 2007. Complexity in geometric sinr. In Proceedings
of the 8th ACM International Symposium on Mobile Ad Hoc Networking and Computing. ACM Press,
New York, 100–109.
HEIDEMANN, J., SILVA, F., INTANAGONWIWAT, C., GOVINDAN, R., ESTRIN, D., AND GANESAN, D. 2001. Building
efficient wireless sensor networks with low-level naming. In Proceedings of the 18th ACM Symposium
on Operating Systems Principles. ACM Press, New York, 146–159.
HEINZELMAN, W. R., CHANDRAKASAN, A., AND BALAKRISHNAN, H. 2000. Energy-efficient communication protocol
for wireless microsensor networks. In Proceedings of the 33rd Annual Hawaii International Conference
on System Sciences. Vol. 12.
HILL, J. L. AND CULLER, D. E. 2002. Mica: A wireless platform for deeply embedded networks. IEEE Micro
22, 12–24.
INCEL, O. D., GHOSH, A., KRISHNAMACHARI, B., AND CHINTALAPUDI, K. 2011. Fast data collection in tree-based
wireless sensor networks. IEEE Trans. Mobile Comput. 11, 1, 86–99.
INJONG, R., WARRIER, A., AIA, M., JEONGKI, M., AND SICHITIU, M. L. 2008. Z-MAC: A hybrid mac for wireless
sensor networks. IEEE/ACM Trans. Netw. 16, 511–524.
JAIN, K., PADHYE, J., PADMANABHAN, V. N., AND QIU, L. 2005. Impact of interference on multi-hop wireless
network performance. Wirel. Netw. 11, 471–487.
JONSSON, J. AND SHIN, K. G. 1997. Deadline assignment in distributed hard real-time systems with relaxed
locality constraints. In Proceedings of the 17th International Conference on Distributed Computing
Systems (ICDCS'97). IEEE Computer Society, 432.
KARIMI, H., KARGAHI, M., AND AZDANI, N. 2009. Energy-efficient cluster-based scheme for handling node failure
in real-time sensor networks. In Proceedings of the 8th IEEE International Conference on Dependable,
Autonomic and Secure Computing. IEEE Computer Society, 143–148.
KRUMKE, S., MARATHE, M., AND RAVI, S. 2001. Models and approximation algorithms for channel assignment
in radio networks. Wirel. Netw. 7, 575–584.
KUMAR, S., LAI, T. H., AND BALOGH, J. 2004. On k-coverage in a mostly sleeping sensor network. In Proceedings
of the 10th Annual International Conference on Mobile Computing and Networking. ACM Press, New
York, 144–158.
LABBÉ, M., LAPORTE, G., AND MARTELLO, S. 2003. Upper bounds and algorithms for the maximum cardinality
bin packing problem. Euro. J. Oper. Res. 149, 490–498.
LANGENDOEN, K. AND HALKES, G. 2005. Energy-efficient medium access control. In Embedded Systems
Handbook. 34.29–34.31.
LEMAIRE, P., FINKE, G., AND BRAUNER, N. 2005. The best-fit rule for multibin packing: An extension of Graham's
list algorithms. In Multidisciplinary Scheduling: Theory and Applications, G. Kendall, E. K. Burke,
S. Petrovic, and M. Gendreau, Eds., Springer, 269–286. http://link.springer.com/content/pdf/bfm%3A9780-387-27744-8%2F1.pdf.
LEMAIRE, P., FINKE, G., AND BRAUNER, N. 2006. Models and complexity of multibin packing problems. J. Math.
Model. Algor. 5, 353–370.
LI, J. AND LAZAROU, G. 2004. Modeling the Energy Consumption of MAC Schemes in Wireless Cluster-Based
Sensor Networks. ACTA Press.
MA, J., LOU, W., WU, Y., LI, X.-Y., AND CHEN, G. 2009. Energy efficient tdma sleep scheduling in wireless sensor
networks. In Proceedings of the 28th IEEE Conference on Computer Communications (INFOCOM'09).
IEEE, 630–638.
MAKOWSKI, M. 1994. Methodology and a modular tool for multiple criteria analysis of lp models. Working paper
WP-94-102, International Institute for Applied Systems Analysis, Laxenburg. http://www.iiasa.ac.at/.
MOSCIBRODA, T. H., WATTENHOFER, R., AND WEBER, Y. 2006. Protocol design beyond graph-based models. In
Proceedings of the 5th Workshop on Hot Topics in Networks (HotNets).
POORNACHANDRAN, R., AHMAD, H., AND CAM, H. 2005. Energy-efficient task scheduling for wireless sensor
nodes with multiple sensing units. In Proceedings of the Conference on Performance, Computing and
Communications.
SAKSENA, M. AND HONG, S. 1996. An engineering approach to decomposing end-to-end delays on a distributed
real-time system. In Proceedings of the 4th International Workshop on Parallel and Distributed
Real-Time Systems. IEEE Computer Society, 244.
SANLI, H. O., POORNACHANDRAN, R., AND HASAN, C. 2005. Collaborative two-level task scheduling for wireless
sensor nodes with multiple sensing units. In Proceedings of the 2nd Annual IEEE Communications
Society Conference on Sensor and Ad Hoc Communications and Networks (SECON'05). 350–361.
SCHURGERS, C., RAGHUNATHAN, V., AND SRIVASTAVA, M. B. 2001. Modulation scaling for real-time energy aware
packet scheduling. In Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM'01).
Vol. 3656, IEEE, 3653–3657.
SHIH, E., CHO, S.-H., ICKES, N., MIN, R., SINHA, A., WANG, A., AND CHANDRAKASAN, A. 2001. Physical layer driven
protocol and algorithm design for energy-efficient wireless sensor networks. In Proceedings of the 7th
Annual International Conference on Mobile Computing and Networking. ACM Press, New York, 272–287.
SINGH, M. AND PRASANNA, V. K. 2003. A hierarchical model for distributed collaborative computation
in wireless sensor networks. In Proceedings of the 17th International Symposium on Parallel and
Distributed Processing. IEEE Computer Society, 162–166.
SINNEN, O. 2007. Task Scheduling for Parallel Systems. Wiley-Blackwell.
STEMM, M. AND KATZ, R. H. 1997. Measuring and reducing energy consumption of network interfaces in
hand-held devices. IEICE Trans. Comm. 80, 1125–1131.
VIGORITO, C. M., GANESAN, D., AND BARTO, A. G. 2007. Adaptive control of duty cycling in energy-harvesting
wireless sensor networks. In Proceedings of the 4th Annual IEEE Communications Society Conference
on Sensor, Mesh and Ad Hoc Communications and Networks (SECON'07). 21–30.
WANG, A. AND CHANDRAKASAN, A. 2002. Energy-efficient dsps for wireless sensor networks. IEEE Signal
Process. Mag. 19, 68–78.
WANG, Y. 2008. Topology control for wireless sensor networks. In Wireless Sensor Networks and Applications,
Y. Li, M. T. Thai, and W. Wu, Eds., Signals and Communications Technology Series, Springer, 113–147.
XIE, T. AND QIN, X. 2008. An energy-delay tunable task allocation strategy for collaborative applications in
networked embedded systems. IEEE Trans. Comput. 57, 329–343.
YE, W., HEIDEMANN, J., AND ESTRIN, D. 2002. An energy-efficient mac protocol for wireless sensor networks. In
Proceedings of the 21st Annual Joint Conference of the IEEE Computer and Communications Societies
(INFOCOM'02). Vol. 1563, IEEE, 1567–1576.
YOUNIS, M., YOUSSEF, M., AND ARISHA, K. 2002. Energy-aware routing in cluster-based sensor networks.
In Proceedings of the 10th IEEE International Symposium on Modeling, Analysis, and Simulation of
Computer and Telecommunications Systems. IEEE Computer Society, 129.
YOUNIS, M., YOUSSEF, M., AND ARISHA, K. 2003. Energy-aware management for cluster-based sensor networks.
Comput. Netw. 43, 649–668.
YU, Y. AND PRASANNA, V. K. 2005. Energy-balanced task allocation for collaborative processing in wireless
sensor networks. Mob. Netw. Appl. 10, 115–131.
YUAN, T. AND EYLEM, E. 2007. Cross-layer collaborative in-network processing in multihop wireless sensor
networks. IEEE Trans. Mobile Comput. 6, 297–310.
YUAN, T., BOANGOAT, J., EKICI, E., AND OZGUNER, F. 2006. Real-time task mapping and scheduling for
collaborative in-network processing in dvs-enabled wireless sensor networks. In Proceedings of the
20th International Parallel and Distributed Processing Symposium (IPDPS).
YUAN, T., EKICI, E., AND OZGUNER, F. 2005. Energy-constrained task mapping and scheduling in wireless
sensor networks. In Proceedings of the IEEE International Conference on Mobile Ad Hoc and Sensor
Systems Conference.
ZOMAYA, A. 1995. Parallel and Distributed Computing Handbook. McGraw-Hill Professional.
Received March 2011; revised April 2012; accepted April 2012
