
Space-Filling DOEs

• Designs of experiments (DOEs) for noisy data tend to place points on the boundary of the domain.
• When the error in the surrogate is due to an unknown functional form, space-filling designs are more popular.
• These designs use values of the variables inside the range instead of at the boundaries.
• Latin hypercube sampling uses as many levels as points.
• The term "space-filling" is appropriate only for low-dimensional spaces.
• In a 10-dimensional space, 2^10 = 1024 points are needed to have one per orthant.
Monte Carlo sampling
• A regular, grid-like DOE runs the risk of a deceptively accurate fit, so randomness appeals.
• Given a region in design space, we can assign a uniform distribution to the region and sample points from it to generate the DOE.
• It is likely, though, that some regions will be poorly sampled.
• In 5-dimensional space, with 32 sample points, what is the chance that all 32 orthants will be occupied?
– (31/32)(30/32)…(1/32) = 32!/32^32 ≈ 1.8e-13.
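The orthant calculation above can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the function names and the seed are illustrative and not part of the original slides. It evaluates the exact probability and counts how many orthants one random sample actually occupies.

```python
import math
import numpy as np

def prob_all_orthants_occupied(dim):
    """Exact chance that 2**dim uniform random points occupy all 2**dim orthants."""
    k = 2 ** dim
    # Point i must land in one of the (k - i) still-empty orthants: k! / k**k.
    return math.factorial(k) / k ** k

def count_occupied_orthants(points):
    """Count distinct orthants (relative to the center 0.5) hit by points in [0, 1]^d."""
    signs = points > 0.5                      # one boolean per coordinate
    return len({tuple(row) for row in signs})

rng = np.random.default_rng(0)               # seed chosen arbitrarily
sample = rng.random((32, 5))                 # 32 uniform points in 5 dimensions
print(prob_all_orthants_occupied(5))         # about 1.8e-13
print(count_occupied_orthants(sample), "of 32 orthants occupied")
```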
Example of MC sampling
(Figure: scatter plot of the 20 sample points in the unit square, with histograms of x1 and x2.)
• With 20 points there is evidence of both clumping and holes.
• The histograms of x1 (left) and x2 (above) are not that good either.
Latin Hypercube sampling
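• Each variable's range is divided into ny equal-probability intervals, with one point in each interval.
• Example with five points and two variables (each row gives the interval indices of one point):
  1 2
  2 3
  3 4
  4 1
  5 5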
Latin Hypercube definition matrix
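• For n points with m variables: an n by m matrix, with each column a permutation of 1,…,n.
• Examples:
  $\begin{bmatrix} 1 & 3 \\ 2 & 1 \\ 3 & 2 \end{bmatrix}$ (3 points, 2 variables) and $\begin{bmatrix} 1 & 2 & 4 \\ 4 & 1 & 2 \\ 3 & 3 & 3 \\ 2 & 4 & 1 \end{bmatrix}$ (4 points, 3 variables)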
• Points are better distributed for each variable,
but can still have holes in m-dimensional space.
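The definition matrix translates directly into a sampling routine. The sketch below assumes NumPy and a unit-cube design space; the function names are hypothetical. It draws one random permutation per variable and then places each point at a random position inside its interval.

```python
import numpy as np

def lhs_definition_matrix(n_points, n_vars, rng):
    """n_points-by-n_vars matrix; each column is a random permutation of 1..n_points."""
    return np.column_stack([rng.permutation(n_points) + 1 for _ in range(n_vars)])

def lhs_from_matrix(levels, rng):
    """Place one point uniformly at random inside each selected interval of [0, 1]."""
    n_points = levels.shape[0]
    return (levels - 1 + rng.random(levels.shape)) / n_points

rng = np.random.default_rng(1)               # seed chosen arbitrarily
levels = lhs_definition_matrix(5, 2, rng)    # 5 points, 2 variables
design = lhs_from_matrix(levels, rng)
print(levels)
print(design)
```

Each column of `design` has exactly one value in each of the five intervals, which is the per-variable property the slide describes; nothing in the construction prevents holes in the joint space.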
Improved LHS
• Since some LHS designs are better than others, it is possible to try many permutations. What criterion should be used to choose among them?
• One popular criterion is the minimum distance between points (to be maximized). Another is the correlation between variables (to be minimized).
• Matlab lhsdesign uses by default 5 iterations to look for the “best” design.
• The blue circles were obtained with the minimum distance criterion. The correlation coefficient is -0.7.
• The red crosses were obtained with the correlation criterion; the coefficient is -0.055.
(Figure: the two LHS designs in the unit square, shown as blue circles and red crosses.)
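The two selection criteria can be sketched as a simple random-restart search. This is a minimal Python illustration assuming NumPy; it is not a reimplementation of Matlab's lhsdesign, and the function names, seeds, and iteration count are illustrative choices.

```python
import numpy as np

def random_lhs(n_points, n_vars, rng):
    """Random LHS in [0, 1]^n_vars: one point per equal-width interval of each variable."""
    cols = [(rng.permutation(n_points) + rng.random(n_points)) / n_points
            for _ in range(n_vars)]
    return np.column_stack(cols)

def min_distance(design):
    """Smallest Euclidean distance between any two points (to be maximized)."""
    diff = design[:, None, :] - design[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(design), k=1)].min()

def max_abs_correlation(design):
    """Largest absolute pairwise correlation between variables (to be minimized)."""
    corr = np.corrcoef(design, rowvar=False)
    return np.abs(corr[np.triu_indices(corr.shape[0], k=1)]).max()

def best_lhs(n_points, n_vars, n_iter, score, rng):
    """Keep the candidate with the highest score among n_iter random LHS designs."""
    candidates = (random_lhs(n_points, n_vars, rng) for _ in range(n_iter))
    return max(candidates, key=score)

rng = np.random.default_rng(2)               # seed chosen arbitrarily
maximin = best_lhs(10, 2, 200, min_distance, rng)
low_corr = best_lhs(10, 2, 200, lambda d: -max_abs_correlation(d), rng)
print(min_distance(maximin), max_abs_correlation(maximin))
print(min_distance(low_corr), max_abs_correlation(low_corr))
```

Printing both measures for both designs makes it easy to see whether a design chosen by one criterion does well on the other.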
More iterations
• With 5,000 iterations the two sets of designs improve.
• The blue circles, maximizing minimum distance, still have a
correlation coefficient of 0.236 compared to 0.042 for the red
crosses.
• With more iterations, maximizing the minimum distance also reduces the size of the holes better.
• Note the large holes for the crosses around (0.45,0.75) and around the two left corners.
(Figure: the two designs after 5,000 iterations, shown as blue circles and red crosses in the unit square.)
Reducing randomness further
• We can reduce randomness further by putting the point at the center of the box.
• Typical results are shown in the figure.
• With 10 points, all will be at 0.05, 0.15, 0.25, and so on.
(Figure: LHS design in the unit square with each point at the center of its box.)
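A centered design needs only the permutations, since the position inside each interval is fixed. The sketch below assumes NumPy and a unit-cube design space; `centered_lhs` is a hypothetical name.

```python
import numpy as np

def centered_lhs(n_points, n_vars, rng):
    """LHS with every point at the center of its interval: (k + 0.5) / n_points."""
    cols = [(rng.permutation(n_points) + 0.5) / n_points for _ in range(n_vars)]
    return np.column_stack(cols)

rng = np.random.default_rng(3)               # seed chosen arbitrarily
design = centered_lhs(10, 2, rng)
print(np.sort(design[:, 0]))   # 0.05, 0.15, ..., 0.95 up to floating-point rounding
```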
Empty space
• In higher dimensions, the danger of large holes is greater. The figure is taken from a paper by Goel et al. (details in notes). It compares an LHS design (right) with a D-optimal design (optimal for noisy data).
• Instead of maximizing the minimum distance, it seems that it would be better to minimize the volume of the largest void. Why don’t we do that?

Figure 2. Illustration of the largest spherical empty space inside the three-dimensional design space
(20 points): (a) D-optimal design and (b) LHS design.
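To get a feel for what measuring the largest void involves, here is a rough Monte Carlo sketch in Python, assuming NumPy and a unit-cube design space. It is not the method used by Goel et al.; the function name, the number of candidate centers, and the requirement that the sphere stay inside the cube are assumptions made for illustration. The result is only an estimate (a lower bound on the true largest radius), and it needs many candidate centers, whereas the minimum distance needs only the pairwise distances of the design points.

```python
import numpy as np

def largest_void_radius(design, n_candidates, rng):
    """Monte Carlo estimate of the largest empty sphere inside the unit cube.

    For each random candidate center, the sphere radius is limited by the
    nearest design point and by the nearest face of the cube.
    """
    dim = design.shape[1]
    centers = rng.random((n_candidates, dim))
    diff = centers[:, None, :] - design[None, :, :]
    to_points = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    to_faces = np.minimum(centers, 1.0 - centers).min(axis=1)
    radii = np.minimum(to_points, to_faces)
    best = radii.argmax()
    return radii[best], centers[best]

rng = np.random.default_rng(4)               # seed chosen arbitrarily
design = rng.random((20, 3))                 # 20 random points in 3 dimensions
radius, center = largest_void_radius(design, 20000, rng)
print(radius, center)
```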
Mixed designs
• D-optimal designs may leave much space
inside.
• LHS designs may leave out the boundary and
lead to large extrapolation errors.
• It may be desirable to combine the two.
• In low-dimensional spaces you can add the vertices to LHS designs.
• In higher-dimensional spaces you can generate a larger LHS design and choose a D-optimal subset.
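The low-dimensional option, adding the vertices to an LHS design, can be sketched as follows. This assumes NumPy and a unit-cube design space, and `lhs_with_vertices` is a hypothetical name. The number of vertices is 2^m for m variables, which is why this is practical only in low dimensions. Choosing a D-optimal subset is not shown here.

```python
import itertools
import numpy as np

def lhs_with_vertices(n_points, n_vars, rng):
    """Random LHS in [0, 1]^n_vars followed by the 2**n_vars corners of the cube."""
    cols = [(rng.permutation(n_points) + rng.random(n_points)) / n_points
            for _ in range(n_vars)]
    lhs = np.column_stack(cols)
    vertices = np.array(list(itertools.product([0.0, 1.0], repeat=n_vars)))
    return np.vstack([lhs, vertices])

rng = np.random.default_rng(5)               # seed chosen arbitrarily
design = lhs_with_vertices(10, 2, rng)
print(design.shape)   # (14, 2): 10 LHS points plus 4 vertices
```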
Problems
• Write a routine to generate LHS designs
and iterate using the two criteria and
compare how well you do against
lhsdesign for 10 points in 2 dimensions.
• Compare the maximum minimum distance
obtained with 1,000 iterations of lhsdesign
when you generate (n+1)(n+2) points in n
dimensions (typical number used to fit a
quadratic polynomial), for n=2, 4, 6.
