ARTICLE INFO
Article history:
Received 5 October 2012
Received in revised form 20 January 2013
Accepted 12 February 2013
Available online 4 March 2013
Keywords:
Meshless methods
Element free Galerkin (EFG)
Preprocessing
Stiffness matrix assembly
Parallel computing
GPU acceleration
ABSTRACT
Meshless methods have a number of virtues in problems concerning crack growth and propagation, large displacements, strain localization and complex geometries, among others. Despite the fact that they do not rely on a mesh, meshless methods require a preliminary step for the identification of the correlation between nodes and Gauss points before building the stiffness matrix. This is implicitly performed with the mesh generation in FEM but must be explicitly done in EFG methods and can be time-consuming. Furthermore, the resulting matrices are more densely populated and the computational cost for the formulation and solution of the problem is much higher than in conventional FEM. This is mainly attributed to the vast increase in interactions between nodes and integration points due to their extended domains of influence. For these reasons, computing the stiffness matrix in EFG meshless methods is a very computationally demanding task which needs special attention in order to be affordable in real-world applications. In this paper, we address the pre-processing phase, dealing with the problem of defining the necessary correlations between nodes and Gauss points and between interacting nodes, as well as the computation of the stiffness matrix. A novel approach is proposed for the formulation of the stiffness matrix which exhibits several computational merits, one of which is its amenability to parallelization, allowing the utilization of graphics processing units (GPUs) to accelerate computations.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
In meshless methods (MMs) there is no need to construct a mesh, as in the finite element method (FEM), which is often in conflict with the real physical compatibility condition that a continuum possesses [1]. Moreover, stresses obtained using FEM are discontinuous and less accurate, while a considerable loss of accuracy is observed when dealing with large deformation problems because of element distortion. Furthermore, due to the underlying structure of the classical mesh-based methods, they are not well suited for treating problems with discontinuities that do not align with element edges. MMs were developed with the objective of eliminating part of the above mentioned difficulties [2]. With MMs, manpower time is kept to a minimum due to the absence of a mesh and mesh-related phenomena. Complex geometries are handled easily with the use of scattered nodes.
One of the first and most prominent meshless methods is the element free Galerkin (EFG) method introduced by Belytschko et al. [3]. EFG requires only nodal data; no element connectivity is needed to construct the shape functions. However, a global background cell structure is necessary for the numerical integration.
Corresponding author.
E-mail addresses: alex@karatarakis.com (A. Karatarakis), pmetsis@gmail.com
(P. Metsis), mpapadra@central.ntua.gr (M. Papadrakakis).
0045-7825/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.cma.2013.02.011
A. Karatarakis et al. / Comput. Methods Appl. Mech. Engrg. 258 (2013) 63-80
u^h(x, t) = \sum_{i \in S} U_i(x) u_i(t)    (1)

where U_i are the shape functions, u_i are the nodal values at particle i located at position x_i, and S is the set of nodes i for which U_i(x) ≠ 0. The shape functions in Eq. (1) are only approximants and not interpolants, since u_i ≠ u(x_i).
The shape functions U_i are obtained from the weight coefficients w_i, which are functions of a distance parameter r = ‖x_i − x‖/d_i, where d_i defines the domain of influence (doi) of node i. The domain of influence is crucial to solution accuracy, stability and computational cost, as it defines the degree of continuity between the nodes and the bandwidth of the system matrices.
The approximation u^h is expressed as a polynomial of length m with non-constant coefficients. The local approximation around a point \bar{x}, evaluated at a point x, is given by

u^h_L(x, \bar{x}) = p^T(x) a(\bar{x})    (2)

where p(x) is a complete polynomial of length m and a(\bar{x}) contains non-constant coefficients that depend on \bar{x}:

a(\bar{x}) = [a_0(\bar{x})  a_1(\bar{x})  a_2(\bar{x})  ...  a_m(\bar{x})]^T    (3)

p^T(x) = [1  x  y],  m = 3    (4)

p^T(x) = [1  x  y  x^2  y^2  xy],  m = 6    (5)

The coefficients a(\bar{x}) minimize the weighted least-squares functional

J(\bar{x}) = \sum_{i=1}^{n} w(\bar{x} − x_i) [u^h_L(x_i, \bar{x}) − u_i]^2 = \sum_{i=1}^{n} w(\bar{x} − x_i) [p^T(x_i) a(\bar{x}) − u_i]^2    (6)

whose minimization with respect to a(\bar{x}) leads to the linear system

A(\bar{x}) a(\bar{x}) = W(\bar{x}) u    (7)

where

A(\bar{x}) = \sum_{i=1}^{n} w(\bar{x} − x_i) p(x_i) p^T(x_i)    (8)

W(\bar{x}) = [w(\bar{x} − x_1) p(x_1)  ...  w(\bar{x} − x_n) p(x_n)]    (9)

Solving for a(\bar{x}) and substituting into Eq. (2) gives

u^h(x) = p^T(x) A(x)^{-1} W(x) u    (10)

which together with Eq. (1) leads to the derivation of the shape function U_i associated with node i at point x:

U_i(x) = p^T(x) A(x)^{-1} W_i(x)    (11)

where W_i(x) = w(x − x_i) p(x_i) is the ith column of W(x).
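As a concrete illustration of Eqs. (1)-(11), the sketch below evaluates MLS shape functions at a point with a linear 2D basis. The quartic spline weight function and all names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def mls_shape_functions(x, nodes, d):
    """Evaluate U_i(x) = p(x)^T A(x)^-1 w_i p(x_i) for the linear basis
    p = [1, x, y] (m = 3). `d` is the domain-of-influence radius."""
    x = np.asarray(x, dtype=float)
    p = lambda pt: np.array([1.0, pt[0], pt[1]])

    def weight(r):
        # Assumed quartic spline weight; vanishes outside the domain of influence.
        return 1.0 - 6.0*r**2 + 8.0*r**3 - 3.0*r**4 if r < 1.0 else 0.0

    w = np.array([weight(np.linalg.norm(x - xi) / d) for xi in nodes])
    A = sum(wi * np.outer(p(xi), p(xi)) for wi, xi in zip(w, nodes))
    pA = np.linalg.solve(A, p(x))   # A^-1 p(x), avoiding an explicit inverse
    return np.array([wi * pA @ p(xi) for wi, xi in zip(w, nodes)])

# MLS with a linear basis forms a partition of unity and reproduces
# linear fields exactly, which gives a quick correctness check:
nodes = [np.array([float(i), float(j)]) for i in range(4) for j in range(4)]
phi = mls_shape_functions((1.3, 1.7), nodes, d=2.5)
```

Note that `np.linalg.solve` replaces the explicit inversion here purely for numerical convenience; the paper's own treatment of A^{-1} is discussed in Section 3.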
The Galerkin weak form of the above formulation gives the discrete algebraic equation

K u = f    (12)

with

K_{ij} = \int_\Omega B_i^T E B_j \, d\Omega    (13)

f_i = \int_{\Gamma_t} U_i t \, d\Gamma + \int_\Omega U_i b \, d\Omega    (14)

where the matrix B_i is given in 2D problems by

B_i = [ U_{i,x}  0
        0        U_{i,y}
        U_{i,y}  U_{i,x} ]    (15)

and in 3D problems by

B_j = [ U_{j,x}  0        0
        0        U_{j,y}  0
        0        0        U_{j,z}
        U_{j,y}  U_{j,x}  0
        0        U_{j,z}  U_{j,y}
        U_{j,z}  0        U_{j,x} ]    (16)

The integrals are evaluated numerically over the Gauss points of the background cells,

\int_\Omega f(x) \, d\Omega = \sum_n f(\xi_n) w_n \det J_\xi(\xi_n)    (17)

where ξ are the local coordinates and det J_ξ(ξ) is the determinant of the Jacobian. The stiffness matrix is accordingly assembled as a sum of Gauss point contributions,

K = \sum_G B_G^T E B_G = \sum_G Q_G    (18)
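A minimal sketch of the background-cell quadrature of Eq. (17), assuming a rectangular 2D domain split into uniform cells with a 2x2 Gauss rule per cell (both the cell layout and the rule are illustrative choices, not the paper's setup):

```python
import numpy as np

# 2-point Gauss rule on [-1, 1]: points +/- 1/sqrt(3), weights 1.
gauss_1d = [(-1.0 / np.sqrt(3.0), 1.0), (1.0 / np.sqrt(3.0), 1.0)]

def integrate_background_cells(f, x0, x1, y0, y1, nx, ny):
    """Sum f(xi) * w * det J over the Gauss points of an nx-by-ny cell grid."""
    hx, hy = (x1 - x0) / nx, (y1 - y0) / ny
    det_J = (hx / 2.0) * (hy / 2.0)   # Jacobian of the map [-1,1]^2 -> cell
    total = 0.0
    for cx in range(nx):
        for cy in range(ny):
            xc, yc = x0 + (cx + 0.5) * hx, y0 + (cy + 0.5) * hy  # cell center
            for xi, wx in gauss_1d:
                for eta, wy in gauss_1d:
                    total += f(xc + xi * hx / 2.0, yc + eta * hy / 2.0) * wx * wy * det_J
    return total

# The 2-point rule is exact up to cubics, so x^2 * y over [0,2]x[0,1]
# is integrated exactly (the analytic value is 4/3).
val = integrate_background_cells(lambda x, y: x**2 * y, 0.0, 2.0, 0.0, 1.0, 4, 4)
```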
Fig. 1. (a) EFG; (b) FEM, for the same number of nodes and Gauss points.
Table 1
Computing time required for all node-Gauss point correlations.

Example | Nodes   | Gauss points | Global serial (s) | Regioned serial (s) | Regioned parallel (s)
2D-1    | 25,921  | 102,400      | 23                | 1.3                 | 0.5
2D-2    | 75,625  | 300,304      | 300               | 3.4                 | 1.0
2D-3    | 126,025 | 501,264      | 836               | 5.4                 | 1.4
3D-1    | 9,221   | 64,000       | 7                 | 3.7                 | 0.9
3D-2    | 19,683  | 140,608      | 45                | 7.8                 | 1.7
3D-3    | 35,937  | 262,144      | 157               | 15.7                | 3.3
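The regioned search of Table 1 can be sketched as follows: Gauss points are bucketed into square regions, and each node inspects only the buckets overlapping its domain of influence instead of every Gauss point. The bucketing scheme and names are illustrative assumptions:

```python
from collections import defaultdict
import math

def correlations_regioned(nodes, gauss_pts, doi, region):
    """Return, for each node, the Gauss points inside its domain of influence,
    found via a regioned (bucketed) search rather than a global scan."""
    buckets = defaultdict(list)
    for g, (gx, gy) in enumerate(gauss_pts):
        buckets[(int(gx // region), int(gy // region))].append(g)
    influenced = {i: [] for i in range(len(nodes))}
    for i, (nx, ny) in enumerate(nodes):
        lo_x, hi_x = int((nx - doi) // region), int((nx + doi) // region)
        lo_y, hi_y = int((ny - doi) // region), int((ny + doi) // region)
        for bx in range(lo_x, hi_x + 1):
            for by in range(lo_y, hi_y + 1):
                for g in buckets.get((bx, by), []):
                    gx, gy = gauss_pts[g]
                    if math.hypot(gx - nx, gy - ny) <= doi:
                        influenced[i].append(g)
    return influenced

gauss_pts = [(0.25 * i, 0.25 * j) for i in range(9) for j in range(9)]
nodes = [(1.0, 1.0), (0.1, 1.9)]
found = correlations_regioned(nodes, gauss_pts, doi=0.6, region=0.5)
```

Each node touches only a constant number of buckets, which is what turns the global O(nodes x Gauss points) scan into the far cheaper regioned search of Table 1.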
Table 2
Influences per node and per Gauss point for EFG and FEM.

                      | 2D EFG (doi = 2.5) | 2D FEM (QUAD4) | 3D EFG (doi = 2.5) | 3D FEM (HEXA8)
Gauss points per node | 100                | 16             | 1000               | 64
Nodes per Gauss point | 25                 | 4              | 125                | 8
Table 3
Total number of node-Gauss point correlations in EFG and FEM.

Example | Nodes   | Gauss points | EFG        | FEM       | Ratio
2D-1    | 25,921  | 102,400      | 2,534,464  | 409,600   | 6.2
2D-2    | 75,625  | 300,304      | 7,463,824  | 1,201,216 | 6.2
2D-3    | 126,025 | 501,264      | 12,475,024 | 2,005,056 | 6.2
3D-1    | 9,221   | 64,000       | 7,077,888  | 512,000   | 13.8
3D-2    | 19,683  | 140,608      | 16,003,008 | 1,124,864 | 14.2
3D-3    | 35,937  | 262,144      | 30,371,328 | 2,097,152 | 14.5
For each Gauss point, the moment matrix and its derivatives are assembled from its influenced nodes:

A = \sum_i w_i p_i p_i^T,  \forall i \in Infl.Nodes    (19)

which in 2D, for the linear basis, takes the explicit form

A = [ \sum_i w_i      \sum_i w_i x_i      \sum_i w_i y_i
      \sum_i w_i x_i  \sum_i w_i x_i^2    \sum_i w_i x_i y_i
      \sum_i w_i y_i  \sum_i w_i x_i y_i  \sum_i w_i y_i^2 ]    (20)

A_x = \sum_i w_{i,x} p_i p_i^T,  A_y = \sum_i w_{i,y} p_i p_i^T,  A_z = \sum_i w_{i,z} p_i p_i^T    (21)

The following row vectors are then computed once per Gauss point:

p_A^T = p_G^T A^{-1},  p_{Ax}^T = −p_A^T A_x A^{-1},  p_{Ay}^T = −p_A^T A_y A^{-1},  p_{Az}^T = −p_A^T A_z A^{-1}    (22)

(the signs follow from (A^{-1})_{,x} = −A^{-1} A_x A^{-1}). These matrix-vector products can be reused in several calculations for every influenced node of a particular Gauss point. For a large moment matrix A, the direct computation of its inverse is burdensome, so an LU factorization is typically performed [2]. In this implementation, an explicit algorithm is used for the inversion of the moment matrix in order to minimize the calculations.
For each influenced node i, the following three groups of calculations are then performed:

U_i = w_i p_A^T p_i

U_{i,x} = U^1_{i,x} + U^2_{i,x} + U^3_{i,x}, with U^1_{i,x} = w_{i,x} p_A^T p_i,  U^2_{i,x} = w_i \{0\ 1\ 0\ 0\} A^{-1} p_i,  U^3_{i,x} = w_i p_{Ax}^T p_i

and analogously for y (with \{0\ 0\ 1\ 0\}) and z (with \{0\ 0\ 0\ 1\}),  \forall i \in Infl.Nodes    (23)

The contribution of each Gauss point to the stiffness matrix is

Q_G = B_G^T E B_G    (24)
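The reuse of p_A^T and its derivative counterparts in Eqs. (19)-(23) can be sketched as follows. The quartic spline weight, its gradient, and all names are illustrative assumptions, and the minus sign in p_Ax^T follows from differentiating the inverse:

```python
import numpy as np

def weight_and_grad(x, xi, d):
    """Assumed quartic spline weight w(r), r = |x - xi| / d, and its x-y gradient."""
    diff = np.asarray(x, dtype=float) - xi
    r = np.linalg.norm(diff) / d
    if r == 0.0:
        return 1.0, np.zeros(2)
    if r >= 1.0:
        return 0.0, np.zeros(2)
    w = 1.0 - 6.0*r**2 + 8.0*r**3 - 3.0*r**4
    dw_dr = -12.0*r + 24.0*r**2 - 12.0*r**3
    return w, dw_dr * diff / (r * d * d)   # chain rule: dr/dx = diff / (|diff| d)

def shape_and_x_derivative(x, nodes, d):
    """U_i and U_i,x at x via A, A_x and the reusable vectors of Eq. (22) (2D, linear basis)."""
    x = np.asarray(x, dtype=float)
    p = lambda pt: np.array([1.0, pt[0], pt[1]])
    data = [weight_and_grad(x, xi, d) for xi in nodes]
    A  = sum(w * np.outer(p(xi), p(xi)) for (w, _), xi in zip(data, nodes))
    Ax = sum(g[0] * np.outer(p(xi), p(xi)) for (_, g), xi in zip(data, nodes))
    Ainv = np.linalg.inv(A)
    pA  = p(x) @ Ainv                 # p_A^T
    pAx = -(pA @ Ax) @ Ainv           # p_Ax^T, since d(A^-1) = -A^-1 dA A^-1
    px_Ainv = np.array([0.0, 1.0, 0.0]) @ Ainv    # {0 1 0} A^-1 for the linear basis
    phi  = np.array([w * (pA @ p(xi)) for (w, _), xi in zip(data, nodes)])
    dphi = np.array([g[0] * (pA @ p(xi)) + w * (px_Ainv @ p(xi)) + w * (pAx @ p(xi))
                     for (w, g), xi in zip(data, nodes)])
    return phi, dphi

nodes = [np.array([float(i), float(j)]) for i in range(4) for j in range(4)]
phi, dphi = shape_and_x_derivative((1.3, 1.7), nodes, d=2.5)
```

Because MLS reproduces linear fields exactly, the derivatives must satisfy sum(U_i,x) = 0 and sum(U_i,x * x_i) = 1, which gives a direct correctness check on the three-term split of Eq. (23).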
Fig. 2. Domain of influence of node in (a) EFG; (b) FEM, for the same number of nodes and Gauss points.
associated indexing to access the entries of K dominate the total effort for the formulation of the global stiffness matrix [35].
The computations of Eq. (24) can be broken into smaller operations for each combination of influenced nodes i, j belonging to the domain of influence of the Gauss point:

Q_{ij} (3×3) = B_i^T (3×6) E (6×6) B_j (6×3)    (25)

For an isotropic material in 3D elasticity, Q_{ij} takes the form:

Q_{ij} = [ U_{i,x}U_{j,x}M + U_{i,y}U_{j,y}μ + U_{i,z}U_{j,z}μ    U_{i,x}U_{j,y}λ + U_{i,y}U_{j,x}μ                         U_{i,x}U_{j,z}λ + U_{i,z}U_{j,x}μ
           U_{i,y}U_{j,x}λ + U_{i,x}U_{j,y}μ                      U_{i,y}U_{j,y}M + U_{i,x}U_{j,x}μ + U_{i,z}U_{j,z}μ      U_{i,y}U_{j,z}λ + U_{i,z}U_{j,y}μ
           U_{i,z}U_{j,x}λ + U_{i,x}U_{j,z}μ                      U_{i,z}U_{j,y}λ + U_{i,y}U_{j,z}μ                        U_{i,z}U_{j,z}M + U_{i,y}U_{j,y}μ + U_{i,x}U_{j,x}μ ]    (26)

E and B_i/B_j are never formed. Instead, three values for E (the two Lamé parameters λ, μ and the P-wave modulus M = 2μ + λ) and three values for B_i (specifically U_{i,x}, U_{i,y}, U_{i,z}) are stored. Since some of the multiplications are repeated, the calculations in Eq. (26) can be efficiently performed with 30 multiplications and 12 additions.
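A sketch of the direct evaluation of Eq. (26) from the stored values (λ, μ, M) and the three derivatives per node, verified against the full B_i^T E B_j product that the method deliberately avoids forming. Names are illustrative:

```python
import numpy as np

def Qij_direct(di, dj, lam, mu):
    """Eq. (26): 3x3 block from the Lame parameters and the derivative triples."""
    M = lam + 2.0 * mu            # P-wave modulus
    ix, iy, iz = di
    jx, jy, jz = dj
    return np.array([
        [ix*jx*M + iy*jy*mu + iz*jz*mu, ix*jy*lam + iy*jx*mu,          ix*jz*lam + iz*jx*mu],
        [iy*jx*lam + ix*jy*mu,          iy*jy*M + ix*jx*mu + iz*jz*mu, iy*jz*lam + iz*jy*mu],
        [iz*jx*lam + ix*jz*mu,          iz*jy*lam + iy*jz*mu,          iz*jz*M + iy*jy*mu + ix*jx*mu],
    ])

def Qij_reference(di, dj, lam, mu):
    """Full B_i^T (3x6) E (6x6) B_j (6x3) product, for verification only."""
    def B(d):
        dx, dy, dz = d
        return np.array([[dx, 0, 0], [0, dy, 0], [0, 0, dz],
                         [dy, dx, 0], [0, dz, dy], [dz, 0, dx]])
    E = np.array([[lam + 2*mu, lam, lam, 0, 0, 0],
                  [lam, lam + 2*mu, lam, 0, 0, 0],
                  [lam, lam, lam + 2*mu, 0, 0, 0],
                  [0, 0, 0, mu, 0, 0],
                  [0, 0, 0, 0, mu, 0],
                  [0, 0, 0, 0, 0, mu]])
    return B(di).T @ E @ B(dj)

di = np.array([0.3, -0.7, 1.1])
dj = np.array([0.5, 0.2, -0.4])
Q = Qij_direct(di, dj, lam=1.2, mu=0.8)
Qref = Qij_reference(di, dj, 1.2, 0.8)
```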
3.3.3. Summation of Gauss point contributions
Contrary to FEM, where the stiffness matrices are built on the
element level by integrating over the element Gauss points before
Table 4
Computing time for the formulation of the stiffness matrix in the CPU implementations of the Gauss point-wise approach.

Example | dof     | Gauss points | Conventional (s) | Proposed GP (s) | Ratio
2D-1    | 51,842  | 102,400      | 107              | 12              | 9
2D-2    | 152,250 | 300,304      | 313              | 34              | 9
2D-3    | 252,050 | 501,264      | 502              | 53              | 9
3D-1    | 27,783  | 64,000       | 2,374            | 241             | 10
3D-2    | 59,049  | 140,608      | 6,328            | 616             | 10
3D-3    | 107,811 | 262,144      | 13,302           | 1,165           | 11
Table 5
Comparison of the proposed Gauss point-wise method for the formulation of the stiffness matrix when using sparse and skyline format.

Example | dof     | Gauss points | Skyline (s) | Sparse (s) | Ratio
2D-1    | 51,842  | 102,400      | 12          | 7          | 1.6
2D-2    | 152,250 | 300,304      | 34          | 20         | 1.7
2D-3    | 252,050 | 501,264      | 53          | 31         | 1.7
3D-1    | 27,783  | 64,000       | 241         | 68         | 3.5
3D-2    | 59,049  | 140,608      | 616         | 174        | 3.5
3D-3    | 107,811 | 262,144      | 1,165       | 329        | 3.5

Table 6
Number of stored stiffness elements when using skyline and sparse format.

Example | dof     | Gauss points | Skyline       | Sparse     | Ratio
2D-1    | 51,842  | 102,400      | 66,221,715    | 4,110,003  | 16
2D-2    | 152,250 | 300,304      | 331,150,875   | 12,129,675 | 27
2D-3    | 252,050 | 501,264      | 713,161,275   | 20,287,275 | 35
3D-1    | 27,783  | 64,000       | 136,041,444   | 21,734,532 | 6
3D-2    | 59,049  | 140,608      | 486,852,444   | 49,932,576 | 10
3D-3    | 107,811 | 262,144      | 1,343,011,428 | 95,696,604 | 14
(a)
(b)
Fig. 5. Interacting nodes: (a) EFG; (b) FEM.
points and each Gauss point has a list of influenced nodes. Therefore, each node looks for interacting nodes in the lists of influenced nodes of its Gauss points. Fig. 6 shows node A which is influenced
Table 7
Computing time required for a naive identification of interacting nodes and their shared Gauss points.

Example | Nodes   | All combinations | Interacting | Time (s)
2D-1    | 25,921  | 335,962,081      | 1,033,981   | 771
2D-2    | 75,625  | 2,859,608,125    | 3,051,325   | 6,908
2D-3    | 126,025 | 7,941,213,325    | 5,103,325   | 23,380
3D-1    | 9,221   | 42,518,031       | 2,418,035   | 608
3D-2    | 19,683  | 193,720,086      | 5,554,625   | 3,021
3D-3    | 35,937  | 645,751,953      | 10,644,935  | 16,290
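Instead of testing every node combination as in the naive approach of Table 7, interacting pairs can be read off the influence lists directly: two nodes interact iff at least one Gauss point is influenced by both. A minimal sketch, with illustrative data layout:

```python
from itertools import combinations

def interacting_pairs(influenced_nodes_per_gp):
    """Collect all unordered node pairs that share at least one Gauss point.
    `influenced_nodes_per_gp[g]` lists the nodes influenced by Gauss point g."""
    pairs = set()
    for gp_nodes in influenced_nodes_per_gp:
        pairs.update(combinations(sorted(gp_nodes), 2))
    return pairs

# Three Gauss points; nodes 0 and 3 never share a Gauss point, so they do not interact.
example = [[0, 1, 2], [1, 2], [2, 3]]
pairs = interacting_pairs(example)
```

The work is proportional to the influence-list sizes rather than to the square of the node count, which is what avoids the billions of combinations in Table 7.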
Fig. 7. Identifying interacting node pairs by considering Gauss points near the border of the domain of influence.
Table 8
Computing time for the identification of interacting nodes.

Example | Serial (s) | Parallel (s)
2D-1    | 1.5        | 0.2
2D-2    | 4.5        | 0.7
2D-3    | 9.8        | 1.6
3D-1    | 20.1       | 2.8
3D-2    | 42.6       | 5.6
3D-3    | 85.6       | 11.2

Table 9
Computing time for the identification of interacting nodes by only inspecting Gauss points near the border.

Example | Serial (s) | Parallel (s)
2D-1    | 0.2        | <0.1
2D-2    | 0.5        | <0.1
2D-3    | 0.8        | <0.1
3D-1    | 0.5        | <0.1
3D-2    | 0.9        | 0.2
3D-3    | 1.6        | 0.3

Table 10
Computing time to identify the shared Gauss points of an interacting node pair.

Example | Serial (s) | Parallel (s)
2D-1    | 2.1        | 0.4
2D-2    | 6.1        | 1.2
2D-3    | 8.8        | 1.5
3D-1    | 46.6       | 7.4
3D-2    | 135.6      | 18.8
3D-3    | 315.7      | 45.8
to a vast reduction of the required computing time compared to the naive approach (Table 7), as can be seen in Table 10.
For further improvement, regioning (Fig. 8) can be utilized; the results are shown in Table 11. The Gauss regions may be the same as those in the initialization phase (Section 7) or can be different. Shared Gauss points are only searched for within regions shared by both nodes of a pair. In both intersection identifications, with and without regions, each node pair can identify its shared Gauss points independently of other node pairs, so parallelism offers very good accelerations, as shown in Tables 10 and 11.
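The per-pair intersection with regioning can be sketched as follows; the data layout (per-node, per-region sets of influenced Gauss points) is an illustrative assumption:

```python
def shared_gauss_points(node_a, node_b, gps_by_node_region):
    """Intersect the influence lists of two nodes, but only within regions
    that both nodes actually see; unshared regions are skipped entirely."""
    regions_a = gps_by_node_region[node_a]
    regions_b = gps_by_node_region[node_b]
    shared = set()
    for region in regions_a.keys() & regions_b.keys():   # skip unshared regions
        shared |= regions_a[region] & regions_b[region]
    return shared

# Node 'A' and node 'B' overlap only in region (0, 0), where they share Gauss point 2.
gps = {
    'A': {(0, 0): {1, 2}, (0, 1): {3, 4}},
    'B': {(0, 0): {2, 5}, (1, 0): {7}},
}
shared = shared_gauss_points('A', 'B', gps)
```

Because each pair's intersection touches no shared mutable state, the loop over pairs parallelizes trivially, matching the accelerations reported in Tables 10 and 11.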
In the 2D examples considered, each region has 16 Gauss points and the results are slightly worse with regioning because skipping 16 Gauss points per skipped region was not enough to compensate for the added overhead. A higher number of Gauss points per region eventually makes regioning worthwhile in the 2D examples. In the
Fig. 8. Region-wise search for interacting nodes. Only the shaded regions are inspected for shared Gauss points.
Table 11
Computing time to identify the shared Gauss points of an interacting node pair with regioning.

Example | Serial (s) | Parallel (s)
2D-1    | 2.4        | 0.6
2D-2    | 6.8        | 1.6
2D-3    | 11.0       | 2.8
3D-1    | 24.9       | 4.8
3D-2    | 57.9       | 10.7
3D-3    | 118.0      | 22.4

Table 12
Number of interacting node pairs in EFG and FEM.

Example | Nodes   | EFG        | FEM     | Ratio
2D-1    | 25,921  | 1,033,981  | 128,641 | 8.0
2D-2    | 75,625  | 3,051,325  | 376,477 | 8.1
2D-3    | 126,025 | 5,103,325  | 627,997 | 8.1
3D-1    | 9,221   | 2,418,035  | 118,121 | 20.5
3D-2    | 19,683  | 5,554,625  | 256,361 | 21.7
3D-3    | 35,937  | 10,644,935 | 474,305 | 22.4
3D examples, the extra dimension and the fact that each region has 64 Gauss points make regioning more important. Regioning benefits become greater as the number of Gauss points per region increases.
4.2. Comparison to FEM for equal number of nodes and Gauss points
NZ = 4 NP + n  (2D)

NZ = 9 NP + 3 n  (3D)    (27)

where NP is the number of interacting node pairs and n is the number of nodes.
Table 13
Total Gauss point contributions for EFG and FEM.

Example | Gauss points | EFG           | FEM       | Ratio
2D-1    | 102,400      | 32,725,544    | 1,024,000 | 32.0
2D-2    | 300,304      | 96,647,624    | 3,003,040 | 32.2
2D-3    | 501,264      | 161,681,224   | 5,012,640 | 32.3
3D-1    | 64,000       | 408,317,728   | 2,304,000 | 177.2
3D-2    | 140,608      | 942,981,088   | 5,061,888 | 186.3
3D-3    | 262,144      | 1,813,006,048 | 9,437,184 | 192.1

K_{ij} = \sum_G Q_{ij} = \sum_G B_i^T E B_j    (28)
The calculation of the Q_ij matrices is performed as described in Section 10. The matrices B_i, B_j contain the shape function derivative values calculated in the first phase, and each pre-calculated shape function derivative is used a large number of times.
Both phases are amenable to parallelization, the first with respect to Gauss points and the second with respect to interacting node pairs, and involve no race conditions or need for synchronization, which makes the interacting node pairs approach an ideal method for massively parallel systems.
4.4. Sparse matrix format for the interacting node pairs approach
The final values of each K_ij submatrix are calculated and written once in the corresponding positions of the global stiffness matrix, instead of being gradually updated as in the Gauss point-wise approach. Apart from the reduced number of accesses to the matrix, this method does not require lookups, which allows the use of a simpler and more efficient sparse matrix format, like the coordinate list (COO) format [38]. A simple implementation with three arrays, one for row indexes, one for column indexes and one for the value of each non-zero matrix coefficient, is sufficient and is easily applied both on the CPU and the GPU, while also requiring less memory than a format that allows lookups. Note that the node pair-wise method has no indexing time due to its nature, in contrast to the Gauss point-wise approach as described in Section 11. This is why the computing times shown for the interacting node pairs approach in Table 14 are lower in the CPU implementations presented in Section 18.
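The COO write-once assembly described above can be sketched as follows; the helper names and the 2-dof-per-node layout are illustrative assumptions:

```python
import numpy as np

def assemble_coo(pairs, shared_gps, Qij, ndof_per_node=2):
    """For each interacting node pair (i, j), sum its Q_ij contributions over
    the shared Gauss points and write the final block once into three COO
    arrays (row index, column index, value) - no lookups, no incremental updates."""
    rows, cols, vals = [], [], []
    for (i, j) in pairs:
        Kij = sum(Qij(g, i, j) for g in shared_gps[(i, j)])
        for a in range(ndof_per_node):
            for b in range(ndof_per_node):
                rows.append(i * ndof_per_node + a)
                cols.append(j * ndof_per_node + b)
                vals.append(Kij[a, b])
    return np.array(rows), np.array(cols), np.array(vals)

# Toy contribution: every shared Gauss point adds the 2x2 identity block.
def unit_block(g, i, j):
    return np.eye(2)

rows, cols, vals = assemble_coo([(0, 1)], {(0, 1): [10, 11]}, unit_block)
```

Each pair writes to its own slice of the output arrays, so the loop over pairs can run in parallel with no synchronization, exactly the property exploited on the GPU.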
4.5. Parallelization features of the interacting node pairs approach
The interacting node pairs approach has certain advantages
compared to the Gauss point-wise approach. The most important
Table 14
Computing time for the formulation of the stiffness matrix in the serial CPU implementations of the Gauss point-wise (GP) and node pair-wise (NP) approaches.

Example | dof     | Gauss points | Conventional GP (s) | Proposed GP (s) | Proposed NP (s)
2D-1    | 51,842  | 102,400      | 107                 | 12              | 11
2D-2    | 152,250 | 300,304      | 313                 | 34              | 28
2D-3    | 252,050 | 501,264      | 502                 | 53              | 47
3D-1    | 27,783  | 64,000       | 2,374               | 241             | 134
3D-2    | 59,049  | 140,608      | 6,328               | 616             | 328
3D-3    | 107,811 | 262,144      | 13,302              | 1,165           | 645
Fig. 9. Schematic representation of the contribution of 3 Gauss points to the global stiffness matrix.
gathers all contributions from the Gauss points and writes to a specific memory location accessed by no other thread. Thus, this method requires no synchronization or atomic operations. Another important benefit of this approach is the reduced indexing cost of the stiffness matrix elements: in the Gauss point-wise method each stiffness matrix element is updated a large number of times, while in the proposed interacting node pair approach the final value is calculated and written only once.
5. GPU programming
Fig. 10. Scatter parallelism required for the Gauss point-wise approach.
Fig. 11. Gather parallelism implemented in the interacting node pairs approach.
Fig. 12. GPU processing flow paradigm: (1) data transfer to GPU memory, (2) CPU instructions to GPU, (3) GPU parallel processing, (4) result transfer to main memory.
first transferred from the host memory to the device global memory. Also, output data from the device needs to be placed there before being passed over to the host. Global memory is large in size and off-chip. Constant memory also provides interaction with the host, but the device is only allowed to read from it and not write to it. It is small, but provides fast access for data needed by all threads.
There are also other types of memories which cannot be accessed by the host. Data in these memories can be accessed in a highly efficient manner. The memories differ depending on which threads have access to them. Registers [CUDA] or private memories [OpenCL] are thread-bound, meaning that each thread can only access its own registers. Registers are typically used for holding variables that need to be accessed frequently but that do not need to be shared with other threads. Shared memories [CUDA] or local
Fig. 17. Phase 1 concurrency level for the calculation of shape function values in
the GPU.
Table 15
Computing time for the formulation of the stiffness matrix in the GPU implementation of the interacting node-pair approach.

Example | dof     | Gauss points | Kernel 1 (s) | Kernel 2 (s) | Total (s)
2D-1    | 51,842  | 102,400      | 0.05         | 0.19         | 0.2
2D-2    | 152,250 | 300,304      | 0.13         | 0.56         | 0.7
2D-3    | 252,050 | 501,264      | 0.21         | 0.89         | 1.1
3D-1    | 27,783  | 64,000       | 0.17         | 2.41         | 2.6
3D-2    | 59,049  | 140,608      | 0.32         | 6.17         | 6.5
3D-3    | 107,811 | 262,144      | 0.62         | 12.31        | 12.9
The proposed methods were tested for the same 2D and 3D elasticity problems already used for testing throughout this paper. The geometric domains of these problems maximize the number of correlations and consequently the computational cost for the given number of nodes. The examples are run on the following hardware. CPU: a Core i7-980X with six physical cores (12 logical cores) at 3.33 GHz and 12 MB cache. GPU: a GeForce GTX 680 with 1536 CUDA cores and 2 GB GDDR5 memory.
The performance of the Gauss point-wise (GP) and node pair-wise (NP) approaches on the CPU is given in Table 14. The proposed Gauss point-wise approach is compared with the conventional one without the previously described improvements. The performance of the GPU implementation of the node pair-wise method is shown in Table 15. Speedup ratios of the GPU implementation compared to the CPU implementations are given in Table 16. The total elapsed time for the initialization phase and formulation of the stiffness matrix in the conventional way is shown in Table 17. By applying all techniques proposed in this paper and utilizing one GPU, we achieve the results of Table 18, which also shows the speedup compared to the conventional implementation.
The identification of node pairs is performed on the CPU and the formulation of the stiffness matrix on the GPU. Therefore, it is possible to have the CPU producing tasks (node pairs) and the GPU processing them concurrently. This producer-consumer model can be expanded to utilize all available hardware and is shown

Table 16
Relative speedup ratios of the GPU implementation compared to the CPU implementations.

Example | Conventional | Proposed GP | Proposed NP
2D-1    | 450          | 50          | 46
2D-2    | 457          | 50          | 41
2D-3    | 456          | 48          | 43
3D-1    | 921          | 93          | 52
3D-2    | 975          | 95          | 50
3D-3    | 1,028        | 90          | 50
Table 17
Total serial CPU computing time for the conventional initialization phase and formulation of the stiffness matrix.

Example | Initialization (s) | Formulation (s) | Total (s)
2D-1    | 23                 | 107             | 130
2D-2    | 300                | 313             | 613
2D-3    | 836                | 502             | 1,338
3D-1    | 7                  | 2,374           | 2,381
3D-2    | 45                 | 6,328           | 6,373
3D-3    | 157                | 13,302          | 13,459
Fig. 19. Phase 2 concurrency level for the calculation of stiffness coefficients in the GPU.
Table 18
Best achieved total time for the initialization phase and formulation of the stiffness matrix.

Example | Initialization, CPU parallel (s) | Node pairs, CPU parallel (s) | Formulation, GPU (s) | Total (s) | Speedup
2D-1    | 0.5                              | 0.6                          | 0.2                  | 1.4       | 93
2D-2    | 1.0                              | 1.6                          | 0.7                  | 3.2       | 191
2D-3    | 1.4                              | 2.8                          | 1.1                  | 5.3       | 252
3D-1    | 0.9                              | 4.8                          | 2.6                  | 8.2       | 289
3D-2    | 1.7                              | 10.9                         | 6.5                  | 19.1      | 334
3D-3    | 3.3                              | 22.7                         | 12.9                 | 38.9      | 346
Fig. 20. Schematic representation of the processing of node pairs utilizing all
available hardware.
Table 19
Best achieved total time for the initialization phase and formulation of the stiffness matrix when using CPU and GPU concurrently.

Example | Initialization, CPU parallel (s) | NP + formulation, hybrid CPU/GPU (s) | Total (s) | Speedup
2D-1    | 0.5                              | 0.6                                  | 1.2       | 110
2D-2    | 1.0                              | 1.6                                  | 2.6       | 236
2D-3    | 1.4                              | 2.9                                  | 4.3       | 310
3D-1    | 0.9                              | 5.0                                  | 5.9       | 403
3D-2    | 1.7                              | 11.5                                 | 13.2      | 481
3D-3    | 3.3                              | 24.0                                 | 27.3      | 494

Fig. 23. Turbine blade example: Position of the 29,135 Gauss points.
Table 20
Turbine blade example: Total serial CPU computing time for the initialization phase and formulation of the stiffness matrix with the Gauss point-wise method.

CPU time (s)   | Conventional | Proposed
Initialization | 3            | 1.4
Formulation    | 341          | 36.9
Total          | 344          | 38.3
Table 21
Turbine blade example: Best achieved total time for the initialization phase and formulation of the stiffness matrix.

Initialization, CPU parallel (s) | Node pairs, CPU parallel (s) | Formulation, GPU (s) | Total (s) | Speedup
0.5                              | 0.7                          | 0.5                  | 1.7       | 202
Acknowledgments
This work has been supported by the European Research Council Advanced Grant MASTER "Mastering the computational challenges in numerical modeling and optimum design of CNT reinforced composites" (ERC-2011-ADG_20110209).