
An Efficient Algorithm for Task Allocation in a Homogeneous Computer Communication Network

Sunil Sharma1, K. R. Singh2 and P. K. Yadav3


1 Ph.D. Scholar, Jodhpur National University, E-mail: sunilsahaj@gmail.com
2 Meerut College, Meerut (U.P.)
3 CBRI, Roorkee

ABSTRACT

Computer Communication Networks (CCN) are of current interest due to advances in microprocessor technology and computer networking. A CCN consists of multiple computing nodes that communicate with each other by a message-passing mechanism. Advances in communication and information technologies have driven the development of CCNs. Task allocation is an essential step in evaluating the performance of a CCN: a task should be allocated to a processor in such a way that extensive Inter-Task Communication (ITC) is avoided. The present paper deals with the problem of allocating m tasks to n homogeneous processors (where m >> n).

1. INTRODUCTION

The rapid development of microprocessor technology and the availability of interconnection networks have made computer communication systems feasible and popular. With such a system, it is possible to utilize a remote computing facility or a database that does not exist on the local computer. It also facilitates parallel execution of programs, which increases computational speed. Computer communication systems find wide application in industrial, scientific and commercial environments. Many factors have a considerable influence on their performance. Besides hardware aspects such as the speed of processors, memories and interconnection networks, software design aspects also have a bearing on system performance. These include the method of splitting the application software into modules or tasks (task partitioning) and the method used to allocate these modules or tasks to processors (task allocation). The amount of parallelism inherent in the application problem and the size of the tasks executed on each processor also influence the working of the system. The major questions to be investigated are the task partitioning and task allocation strategies, which influence distributed software properties such as inter-task communication and the potential for parallelism. Task partitioning is an earlier design step than task allocation, and its effectiveness depends considerably on allocation. If these steps are not properly implemented, increasing the number of processors in the system may actually decrease its total throughput [1]. Meanwhile, the traditional notions of best-effort and real-time processing have fractured into a spectrum of processing classes with different timeliness requirements [2-5]. Many systems are hard real-time, where

missing a deadline is catastrophic [6-8], whereas in soft real-time systems an occasional violation of deadline constraints may not render an execution useless or have calamitous consequences, but it decreases utilization [9]. Dar-Tzen Peng et al. [10] derived optimal task assignments that minimize the sum of task execution and communication costs using the branch-and-bound method, and evaluated the computational complexity of this method by simulation. Tzu-Chiang Chiang et al. [11] reported that traditional optimization methods give exact answers but often tend to get stuck on local optima. Dynamic task scheduling that considers load balancing is an NP-hard problem, as discussed by Hans-Ulrich Heiss et al. [12] and Yadav et al. [21]. Consequently, another approach is needed when the traditional methods cannot be applied. Modern heuristics are general-purpose optimization algorithms, as discussed by Page et al. [13]; their efficiency and applicability are not tied to any specific problem domain. Multiprocessor scheduling methods can be divided into list heuristics and meta-heuristics; the meta-heuristics include simulated annealing and genetic algorithms (GA), as reported in [14-20]. Particle Swarm Optimization (PSO) yields faster convergence than GA because of the balance between exploration and exploitation in the search space, as reported by Chunming Yang et al. [22] and Van Den Bergh et al. [23]. Recently, Yadav et al. [24] discussed a task allocation model for reliability and cost optimization in distributed computing systems. Assigning m tasks (modules) to n processors results in n^m possible allocations, so deriving an optimal task allocation model in terms of performance-oriented time functions and system constraints is exponentially complex. In reality, deriving an optimal task allocation model becomes even more complicated if multiple copies of tasks and data are considered; one reason for considering module replication is to meet fault-tolerance requirements. The performance-oriented cost functions are the Execution Cost (EC) and the Inter-Task Communication Cost (ITCC). The constraints include limited memory capacity, load balancing, the processing capacity of the processors, and the dependence of certain tasks on certain processors. To make the best use of the available computational power, the tasks must be executed on the processors in such a way that the overall busy time of the system is minimized. Keeping these facts in view, in this paper we develop an algorithm for the optimal utilization of the available processors. The major thrust of research on evaluating performance through task scheduling is centered on providing solutions that scale to large CCNs. We present a mechanism for the systematic utilization of homogeneous processors that takes into consideration the per-bit Processor Service Rate PSR(,), the Inter-Task Communication Cost ITCC(,) and the Task Size TS(,).

2. ASSIGNMENT PROBLEM

The specific problem being addressed is as follows. Given an application program that consists of m communicating tasks, T = {t1, t2, ..., tm}, and a homogeneous distributed system with n processors, P = {p1, p2, ..., pn}, where it is assumed that m >> n, an allocation of tasks to processors is defined by a function Aalloc from the set T of tasks to the set P of processors:

Aalloc: T → P, where Aalloc(i) = j if task ti is assigned to processor pj, 1 ≤ i ≤ m, 1 ≤ j ≤ n.

For an assignment, the task set TSj of a processor is the set of tasks allocated to that processor [25]:

TSj = {i : Aalloc(i) = j}, j = 1, 2, ..., n
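For concreteness, the allocation map and the derived task sets can be represented as in the following minimal sketch. It is written in Python purely for illustration (the paper itself reports a MATLAB implementation); all names are ours, and 0-based indices replace the paper's 1-based ones.

```python
def task_sets(alloc, n):
    """Derive TSj = {i : Aalloc(i) = j} from an allocation list alloc,
    where alloc[i] = j means task t_i runs on processor p_j."""
    ts = {j: set() for j in range(n)}
    for i, j in enumerate(alloc):
        ts[j].add(i)
    return ts

# Example: m = 5 tasks on n = 2 processors.
print(task_sets([0, 1, 0, 1, 1], 2))   # {0: {0, 2}, 1: {1, 3, 4}}
```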

The following cost functions have been used while designing the mathematical model.

2.1. Task Size (TS): A task is a sequential program that performs some predefined action and possibly communicates with other tasks in the system. The task size tsi (1 ≤ i ≤ m) of each task depends on the length of the task and is generally counted in bytes.

2.2. Processor Execution Rate (PER): The service rate erj (1 ≤ j ≤ n) of each processor is the time (in seconds per byte) the processor takes to service a task, so that a task is executed in time proportional to its size.

2.3. Execution Cost (EC): The EC eij (1 ≤ i ≤ m, 1 ≤ j ≤ n) of each task ti depends on the capability of the processor pj to which it is assigned and on the work to be performed by the task. The execution time of each processor for an allocation is calculated using the following equation:

ECj(Aalloc) = Σ_{i ∈ TSj} eij    (1)

where TSj = {i : Aalloc(i) = j}, j = 1, 2, ..., n.

2.4.

Inter-Task Communication Cost (ITCC): The ITCC cik of the interacting tasks ti and tk is incurred due to the data units exchanged between them during execution. The inter-task communication time of each processor for a given allocation is calculated as:

ITCCj(Aalloc) = Σ_{i ∈ TSj} Σ_{k ∉ TSj} cik    (2)

2.5. Optimal Processing Cost (OPC): The OPC is a function of the heaviest aggregate computation to be performed by any one processor. The OPC for a given assignment Aalloc is defined conservatively (assuming that computation cannot be overlapped with communication) as shown below:

OPC(Aalloc) = max_{1 ≤ j ≤ n} [ ECj(Aalloc) + ITCCj(Aalloc) ]    (3)
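The three cost functions translate directly into code. The sketch below is one possible Python reading of equations (1)-(3); the names are illustrative (etm is the m × n execution time matrix of Section 3.1, itctm the m × m communication matrix) and indices are 0-based.

```python
def processor_costs(alloc, etm, itctm, n):
    """Per-processor execution cost ECj (eq. 1) and inter-task
    communication cost ITCCj (eq. 2) for an allocation alloc[i] = j."""
    m = len(alloc)
    ec, itcc = [0.0] * n, [0.0] * n
    for i in range(m):
        j = alloc[i]
        ec[j] += etm[i][j]                 # eq. (1): sum e_ij over i in TSj
        for k in range(m):
            if alloc[k] != j:
                itcc[j] += itctm[i][k]     # eq. (2): traffic crossing processor j
    return ec, itcc

def opc(alloc, etm, itctm, n):
    """Eq. (3): the busy time of the most heavily loaded processor."""
    ec, itcc = processor_costs(alloc, etm, itctm, n)
    return max(e + c for e, c in zip(ec, itcc))
```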

2.6. Assumptions

Several assumptions have been made to keep the algorithm reasonable in size:
(i) The program is a collection of m tasks, to be executed on a set of n processors that have the same processing capabilities.
(ii) The size (in bytes) of each task and the service rate of each processor are known. The size of each task is given in a column matrix TSM = [tsi], where tsi represents the size of the i-th task; the service rate of each processor is given in a row matrix PSRM = [erj], where erj denotes the service rate of the j-th processor.
(iii) Each task in the task set may communicate with zero or more other tasks in the set. Two communicating tasks incur an inter-processor communication cost when they are assigned to two distinct processors. This cost is given in a symmetric matrix, the inter-task communication time matrix ITCTM = [cij] of order m, where cij represents the communication cost between tasks ti and tj and cij = cji. Whenever a group of tasks is assigned to the same processor, the ITCT between them is zero.
(iv) The communication system of the processors is collision- and contention-free: no messages are lost, and all messages are delivered in a finite amount of time.
(v) A processor can simultaneously execute a task and communicate with another processor; the overhead incurred by this is negligible, so for all practical purposes it is taken as zero.

3. Mathematical Model

The allocation of program tasks is carried out so that each task is assigned to a processor whose capabilities are most appropriate for it, and the inter-processor communication cost is minimized. The present model passes through the following steps.

3.1. Calculation of execution cost: Using the task sizes and service rates, the execution time eij of each task ti on each processor is calculated and stored in the execution time matrix ETM = [eij] of order m × n, where

eij = tsi * erj    (1 ≤ i ≤ m, 1 ≤ j ≤ n)
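As a sketch (continuing the hypothetical Python helpers above), ETM is simply the outer product of the task-size column and the service-rate row:

```python
def execution_time_matrix(tsm, psrm):
    """Build ETM = [e_ij] with e_ij = ts_i * er_j (sizes in bytes,
    rates in seconds per byte, so each entry is a time)."""
    return [[ts * er for er in psrm] for ts in tsm]
```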

3.2. Cluster making and assignment of the clusters to the processors: Initially, each program task forms a distinct cluster Ci = {ti}. Tasks assigned to processors are stored in a linear array Tass = {}, and tasks not yet assigned are stored in a linear array Tnon-ass = {}; initially Tass = {} is empty and Tnon-ass = {} contains all m tasks. Tasks are clustered on the basis of their communication requirements: highly communicating tasks are clustered together to reduce communication delays. The number of task clusters should equal the number of processors, so that a one-to-one mapping results, and the clusters remain fixed throughout execution. Since there are n processors in the DPS, we form n task clusters. Cluster making and assignment proceed through the following steps.

3.2 (a) Initial assignment of n tasks: In this step we assign n tasks to the n processors as follows (a sketch of these steps is given after the list):
(i) Using ITCTM = [cij], calculate TCi = Σ_{j=1..m} cij for i = 1 to m.
(ii) Arrange the tasks in ascending order of TCi and store them in a linear array TCL = {}.
(iii) Select the first n tasks from TCL = {}.
(iv) Assign these n tasks, one to each of the n processors, choosing for each task the processor at which its execution time eij is minimum.
(v) Modify Tass = {} by adding these tasks to Tass = {}.
(vi) Modify Tnon-ass = {} by removing these tasks from Tnon-ass = {}.
(vii) Modify ETM = [eij] and ITCTM = [cij] by deleting these tasks, obtaining METM = [eij] and MITCTM = [cij].
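A possible rendering of steps (i)-(vii) follows. It assumes, as in the worked example of Section 4, that each of the first n tasks goes to a distinct processor, with ties in eij broken by processor index (the processors being homogeneous); this tie-breaking is our assumption, not prescribed by the paper. On the Section 4 data it reproduces Tass = {t2, t6, t7} with t6 → p1, t7 → p2, t2 → p3.

```python
def initial_assignment(etm, itctm, n):
    """Steps (i)-(vii): seed each processor with one lightly-communicating task."""
    m = len(itctm)
    tc = sorted(range(m), key=lambda i: sum(itctm[i]))   # (i)-(ii): TC_i ascending
    alloc, taken = {}, set()
    for i in tc[:n]:                                     # (iii): first n tasks
        # (iv): cheapest still-free processor (assumption: one task per processor)
        j = min((p for p in range(n) if p not in taken), key=lambda p: etm[i][p])
        alloc[i] = j
        taken.add(j)
    t_ass = set(alloc)                                   # (v): Tass
    t_non_ass = set(range(m)) - t_ass                    # (vi): Tnon-ass
    # (vii): METM / MITCTM are the rows and columns restricted to t_non_ass
    return alloc, t_ass, t_non_ass
```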

3.2 (b) Cluster making of the remaining (m − n) tasks and their assignment: To balance the load on the processors, a restriction is placed on each cluster: a cluster may contain at most (m − n)/n tasks, and the average load on the system may be up to Σ_{i=1..m} eij / n with a 20% variation. The upper-diagonal values of MITCTM(,) are stored in descending order in a three-column array MAXCT(,,), whose first column holds the first task of a pair (say the r-th task), whose second column holds the second task (say the s-th task), and whose third column holds their communication time crs.

Initially each task is treated as a cluster, Ci = {ti} for i = 1 to m, and these clusters are stored in a linear array CLS = {Ci, 1 ≤ i ≤ m}. Select the first task pair, say (tr, ts) with tr ∈ Cr and ts ∈ Cs, from MAXCT(,,). If the combined number of tasks in clusters Cr and Cs is less than or equal to (m − n)/n and their combined load satisfies the bound above, fuse cluster Cs into cluster Cr; otherwise select the next task pair from MAXCT(,,). Modify CLS = {} by replacing the cluster Cr with Cr ← Cr ∪ Cs = {tr, ts} and deleting the cluster Cs. Modify MAXCT(,,) by deleting this task pair (tr, ts). Modify METM(,) and MITCTM(,) as follows (a sketch of this fusion loop is given after the list):
a. Modify METM(,) by adding the s-th row into the r-th row.
b. Reduce the communication time between tr and ts to zero.
c. Add the communication time csj to crj for all j.
d. Delete task ts from METM(,) and MITCTM(,).
The above procedure is repeated until the number of task clusters equals the number of processors.
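The fusion loop can be sketched as below. For brevity this hypothetical version enforces only the (m − n)/n size cap, omitting the ±20% load check, and the MAXCT deletions and METM/MITCTM row merges are implicit in the cluster sets it returns. On the worked example of Section 4 it reproduces the three clusters reported there.

```python
def fuse_clusters(mitctm, m, n):
    """Greedily fuse the heaviest-communicating task pairs until n clusters
    remain; mitctm is a dict of dicts over the (m - n) unassigned task ids."""
    cap = (m - n) // n                         # maximum tasks per cluster
    clusters = {t: {t} for t in mitctm}        # each task starts as a cluster
    key_of = {t: t for t in mitctm}            # task id -> key of its cluster
    # MAXCT(,,): upper-diagonal pairs in descending order of c_rs
    maxct = sorted(((mitctm[r][s], r, s) for r in mitctm for s in mitctm if r < s),
                   key=lambda p: -p[0])
    for _, r, s in maxct:
        if len(clusters) == n:
            break
        kr, ks = key_of[r], key_of[s]
        if kr != ks and len(clusters[kr]) + len(clusters[ks]) <= cap:
            clusters[kr] |= clusters.pop(ks)   # fuse Cs into Cr
            for t in clusters[kr]:
                key_of[t] = kr
    return list(clusters.values())
```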

3.3. Assignment of the clusters: After forming the n task clusters, we obtain a modified METM(,) of order n whose rows correspond to the clusters, and each cluster is assigned to a processor.

4. Results & Discussions

To justify the application and usefulness of the present method, an example DPS is considered, consisting of a set of n = 3 processors P = {p1, p2, p3} connected by an arbitrary network and a set of m = 9 executable tasks T = {t1, t2, t3, t4, t5, t6, t7, t8, t9}, each of which may be a portion of executable code or a data file. The size of each task and the processors' service rates are given in the matrices TSM(,) and PSRM(,) respectively; the inter-task communication times are given in ITCTM(,) of order m.

Input of the model:
Number of processors in the system (n) = 3
Number of tasks to be executed (m) = 9

PSRM(,):
            p1       p2       p3
        0.0789   0.0789   0.0789

ITCC(,):
        t1   t2   t3   t4   t5   t6   t7   t8   t9
t1       0    8   10    4    0    3    4    0    0
t2       8    0    7    0    0    0    0    3    0
t3      10    7    0    1    0    0    0    0    0
t4       4    0    1    0    6    0    0    8    0
t5       0    0    0    6    0    0    0   12    0
t6       3    0    0    0    0    0    0    0   12
t7       4    0    0    0    0    0    0    3   10
t8       0    3    0    8   12    0    3    0    5
t9       0    0    0    0    0   12   10    5    0

TSM(,):
t1   34444
t2   76899
t3   21000
t4   23010
t5   22211
t6   44422
t7   24532
t8   56123
t9   66542

Multiplying the TSM(,) matrix by the PSRM(,) matrix, we get the execution cost matrix TSM(,) * PSRM(,) = ECM(,):

           p1        p2        p3
t1    2717.63   2717.63   2717.63
t2    6067.33   6067.33   6067.33
t3    1656.90   1656.90   1656.90
t4    1815.49   1815.49   1815.49
t5    1752.45   1752.45   1752.45
t6    3504.90   3504.90   3504.90
t7    1035.58   1035.58   1035.58
t8    4428.11   4428.11   4428.11
t9    5250.16   5250.16   5250.16
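As an illustrative check, the first row of ECM(,) follows from the first entry of TSM(,) using the hypothetical execution_time_matrix sketch from Section 3.1:

```python
tsm = [34444, 76899, 21000, 23010, 22211, 44422, 24532, 56123, 66542]
psrm = [0.0789, 0.0789, 0.0789]
ecm = execution_time_matrix(tsm, psrm)
print(round(ecm[0][0], 2))   # 2717.63, the t1 row of ECM(,) above
```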

Applying step 3.2 (a), we get the following output:
Tass = {t2, t6, t7}
Tnon-ass = {t1, t3, t4, t5, t8, t9}

Task t2 is assigned to processor p3, task t6 to processor p1 and task t7 to processor p2.

MECM(,) (execution costs of the unassigned tasks):

           p1        p2        p3
t1    2717.63   2717.63   2717.63
t3    1656.90   1656.90   1656.90
t4    1815.49   1815.49   1815.49
t5    1752.45   1752.45   1752.45
t8    4428.11   4428.11   4428.11
t9    5250.16   5250.16   5250.16

MITCC(,) (communication times of the unassigned tasks):

        t1   t3   t4   t5   t8   t9
t1       0   10    4    0    0    0
t3      10    0    1    0    0    0
t4       4    1    0    6    8    0
t5       0    0    6    0   12    0
t8       0    0    8   12    0    5
t9       0    0    0    0    5    0

Applying step 3.2 (b), we get the following output. The maximum number of tasks in a cluster is (m − n)/n = (9 − 3)/3 = 2, and the clusters obtained are:
Cluster1 = {t5, t8}
Cluster2 = {t1, t3}
Cluster3 = {t9, t4}
Since the processors are assumed homogeneous, the clusters are assigned to the processors as shown in the following table.

Cluster                 Processor
Cluster1 = {t5, t8}     p1
Cluster2 = {t1, t3}     p2
Cluster3 = {t9, t4}     p3
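To tie the pieces together, the EC and ITCC rows of the results table below can be reproduced from the printed ECM(,) and ITCC(,) values with the processor_costs sketch from Section 2. The 0-based allocation vector is our encoding of t6, t5, t8 → p1; t7, t1, t3 → p2; t2, t9, t4 → p3:

```python
ec_col = [2717.63, 6067.33, 1656.90, 1815.49, 1752.45,
          3504.90, 1035.58, 4428.11, 5250.16]       # one column of ECM(,)
ecm = [[e, e, e] for e in ec_col]                   # identical processors
itctm = [[0, 8, 10, 4, 0, 3, 4, 0, 0],
         [8, 0, 7, 0, 0, 0, 0, 3, 0],
         [10, 7, 0, 1, 0, 0, 0, 0, 0],
         [4, 0, 1, 0, 6, 0, 0, 8, 0],
         [0, 0, 0, 6, 0, 0, 0, 12, 0],
         [3, 0, 0, 0, 0, 0, 0, 0, 12],
         [4, 0, 0, 0, 0, 0, 0, 3, 10],
         [0, 3, 0, 8, 12, 0, 3, 0, 5],
         [0, 0, 0, 0, 0, 12, 10, 5, 0]]
alloc = [1, 2, 1, 2, 0, 0, 1, 0, 2]                 # t1..t9 -> p2,p3,p2,p3,p1,p1,p2,p1,p3
ec, itcc = processor_costs(alloc, ecm, itctm, 3)
# ec   -> [9685.46, 5410.11, 13132.98]   (EC row of the table, up to rounding)
# itcc -> [40.0, 36.0, 64.0]             (ITCC row of the table)
```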

The results of the present algorithm are shown in the following table.

                    EC      ITCC        TSC
Processor-1    9685.45     40.00    9725.45
Processor-2    5410.11     36.00    5446.11
Processor-3   13132.98     64.00   13196.98
Average        9409.51     46.67    9456.18

(TSC = EC + ITCC.)

The above result shows that the response time of the system is 13196.98 units. Figure 1 shows that the throughput and the service rates of the processors are ideally linked; from the figure it is concluded that the two are directly proportional.

Figure 1: Service rate and throughput of the processors

The model discussed in this paper has been coded in MATLAB and tested on an HP dual-core workstation with random sets of input data, with satisfactory results. More than 50 problems have been tested, and the results have been compared with those obtained by the algorithm of Singh et al. [26]. Figure 2 shows the comparison of the results computed by the two algorithms.

Figure 2: Results comparison


From the figure it is concluded that the present algorithm gives better results in 9% of the cases, similar results (within a 10% variation) in 78% of the cases, and worse results in 11% of the cases, whereas the algorithm reported in [26] gives better results in only 2% of the cases and worse results in 20%. It is concluded that the algorithm reported in this paper gives better results in most cases.

REFERENCES

1. Chu, E. W., Lee, D., and Iffla, B., (1978), "A distributed processing system for naval data communication networks," Proc. AFIPS Nat. Comput. Conference, Vol. 47, pp. 783-793.
2. Deng, Z., Liu, J. W., and Sun, S., (1993), "Dynamic scheduling of hard real-time applications in open system environment," Tech. Rep., University of Illinois at Urbana-Champaign.
3. Buttazzo, G., and Stankovic, J. A., (1993), "RED: robust earliest deadline scheduling," in Proc. 3rd Intl. Workshop on Responsive Computing Systems, Lincoln, pp. 100-111.
4. Petters, S. M., (2000), "Bounding the execution time of real-time tasks on modern processors," in Proc. 7th Intl. Conf. Real-Time Computing Systems and Applications, Cheju Island, pp. 498-502.
5. Zhu, J., Lewis, T. G., Jackson, W., and Wilson, R. L., (1995), "Scheduling in hard real-time applications," IEEE Softw., Vol. 12, pp. 54-63.
6. Taewoong, K., Heonshik, S., and Naehyuck, C., (1998), "Scheduling algorithm for hard real-time communication in demand priority network," in Proc. 10th Euromicro Workshop on Real-Time Systems, Berlin, Germany, pp. 45-52.
7. Liu, C. L., and Layland, J. W., (1973), "Scheduling algorithms for multi-programming in a hard-real-time environment," J. ACM, Vol. 20, pp. 46-61.
8. Babbar, D., and Krueger, P., (1994), "On-line hard real-time scheduling of parallel tasks on partitionable multiprocessors," in Proc. Intl. Conf. Parallel Processing, pp. 29-38.
9. Lifeng, W., and Haibin, Y., (2003), "Research on a soft real-time scheduling algorithm based on hybrid adaptive control architecture," in Proc. American Control Conf., Lisbon, Portugal, pp. 4022-4027.

10. Dar-Tzen Peng, Kang G. Shin, and Tarek F. Abdelzaher, (1997), "Assignment and scheduling communicating periodic tasks in distributed real-time systems," IEEE Transactions on Software Engineering, Vol. 23, No. 12, pp. 745-758.
11. Tzu-Chiang Chiang, Po-Yin Chang, and Yueh-Min Huang, (2006), "Multi-processor tasks with resource and timing constraints using particle swarm optimization," IJCSNS International Journal of Computer Science and Network Security, Vol. 6, No. 4, pp. 71-77.
12. Hans-Ulrich Heiss and Michael Schmitz, (1995), "Decentralized dynamic load balancing: the particles approach," Information Sciences, Vol. 84, No. 2, pp. 115-128.
13. Abdelmageed Elsadek, A., and Earl Wells, B., (1999), "A heuristic model for task allocation in heterogeneous distributed computing systems," The International Journal of Computers and Their Applications, Vol. 6, No. 1, pp. 1-36.
14. Page, A. J., and Naughton, T. J., (2004), "Framework for task scheduling in heterogeneous distributed computing using genetic algorithms," in 15th Artificial Intelligence and Cognitive Science Conference, Ireland, pp. 137-146.
15. Page, A. J., and Naughton, T. J., (2005), "Dynamic task scheduling using genetic algorithms for heterogeneous distributed computing," Proceedings of the 19th IEEE/ACM International Parallel and Distributed Processing Symposium, Denver, USA, pp. 1530-2075.
16. Annie S. Wu, Han Yu, Shiyuan Jin, Kuo-Chi Lin, and Guy Schiavone, (2004), "An incremental genetic algorithm approach to multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, Vol. 15, No. 9, pp. 824-834.
17. Zomaya, A. Y., and Teh, Y. H., (2001), "Observations on using genetic algorithms for dynamic load-balancing," IEEE Transactions on Parallel and Distributed Systems, Vol. 12, No. 9, pp. 899-911.
18. Edwin S. H. Hou, Nirwan Ansari, and Hong Ren, (1994), "A genetic algorithm for multiprocessor scheduling," IEEE Transactions on Parallel and Distributed Systems, Vol. 5, No. 2, pp. 113-120.
19. Manimaran, G., and Siva Ram Murthy, C., (1998), "A fault-tolerant dynamic scheduling algorithm for multiprocessor real-time systems and its analysis," IEEE Transactions on Parallel and Distributed Systems, Vol. 9, No. 11, pp. 1137-1152.
20. Ruey-Maw Chen and Yueh-Min Huang, (2001), "Multiprocessor task assignment with fuzzy Hopfield neural network clustering techniques," Journal of Neural Computing and Applications, Vol. 10, No. 1, pp. 12-21.
21. Yadav, P. K., Singh, M. P., and Harendra Kumar, (2008), "Scheduling algorithm: tasks scheduling algorithm for multiple processors with dynamic reassignment," Journal of Computer Systems, Networks, and Communications, Vol. 2008, pp. 1-9.
22. Chunming Yang and Simon, D., (2005), "A new particle swarm optimization technique," Proceedings of the International Conference on Systems Engineering, pp. 164-169.
23. Van Den Bergh, F., and Engelbrecht, A. P., (2006), "A study of particle swarm optimization particle trajectories," Information Sciences, pp. 937-971.
24. Yadav, P. K., Singh, M. P., and Sharma Kuldeep, (2011), "Task allocation model for reliability and cost optimization in distributed computing system," International Journal of Modelling, Simulation and Scientific Computing (IJMSSC), Vol. 2, No. 2, pp. 1-19.
25. Elsadek, A. A., and Wells, B. E., (1999), "A heuristic model for task allocation in heterogeneous distributed computing system," The International Journal of Computers and Their Applications, Vol. 6, No. 1, pp. 0-35.
26. Singh, M. P., Kumar Harendra, and Yadav, P. K., (2009), "Task allocation model for optimal utilization of processors' capacity in distributed system," International Journal of Mathematical Sciences and Engineering Applications, Vol. 3, No. IV, pp. 289-304.
