library(maximin)
6 Computer experiment designs
6.1 Maximin design
Let us generate a 4-run maximin design in \([0, 1]^2\) for \(p=2\).
We will use the maximin()
function of the maximin
library
For generating a maximin design, we need to specify an initial design, and the function will add points to the design to develop a design that is as close to a maximin design as possible.
Let us add the origin as a point in the initial design.
<- matrix(c(0,0),1,2) D_initial
set.seed(3)
<- maximin(n = 3, p=2, Xorig = D_initial)$Xf D
par(pty="s") # Making a square plot (maintaining aspect ratio)
plot(D, pch = 16)
As expected, the design is the four corners of the feasible design space as the minimum distance between any two design points is the largest in this case.
Note that the algorithm that maximizes the minimum distance between points is an iterative algorithm that does not guarantee to find the global optimum. The more iterations it runs, the higher is the likelihood to find the global optimum. One can increase the number of iterations of the optimization algorithm, and / or execute the code multiple times, and compare the minimum distance between points to obtain a better maximin design.
Let us attempt to obtain a 5-run maximin design in \([0, 1]^2\) for \(p=2\).
set.seed(3)
<- maximin(n = 4, p=2, Xorig = D_initial)
D par(pty="s")
plot(D$Xf, pch = 16)
Let us compute the maximin distance for this design.
# Function to obtain the minimum distance between
# a point and a design
<- function(x, D)
minimum_distance
{<- (colSums((t(D)- x)^2))
sq_distances == 0] <- Inf
sq_distances[sq_distances return(sqrt(min(sq_distances)))
}min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.4713757
Let us increase the iterations of the maximin()
function to see if that leads to a better design.
set.seed(3)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 100)
D par(pty="s")
plot(D$Xf, pch = 16)
min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.4714045
Increasing the number of iteration did not help. We can increase the iterations further. However, let us try to change the seed.
set.seed(4)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 100)
D par(pty="s")
plot(D$Xf, pch = 16)
min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.5962154
We got a better design, as the minimum distance between the points is higher. Let us increase the number of iterations to see if the design gets better.
set.seed(4)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 200)
D par(pty="s")
plot(D$Xf, pch = 16)
min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.6469957
The design gets better with increased iterations. Let us further increase the iterations.
set.seed(4)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 400)
D par(pty="s")
plot(D$Xf, pch = 16)
min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.6468853
The design did not get better. Let us change the seed again.
set.seed(5)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 400)
D par(pty="s")
plot(D$Xf, pch = 16)
min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.4714045
The design did not get better. Let us change use a for()
loop to try a lot of different seeds.
<- rep(0, 20)
min_dist for(s in 1:20)
{set.seed(s)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 100)
D <- min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
min_dist[s] }
The seed for the best design, i.e., the design with the maximum minimum distance between points is:
which.max(min_dist)
[1] 12
Let us plot the best design:
set.seed(12)
<- maximin(n = 4, p=2, Xorig = D_initial, T = 100)
D par(pty="s")
plot(D$Xf, pch = 16)
min(apply(D$Xf, 1, minimum_distance, D = D$Xf))
[1] 0.7071068
6.2 Maximin LHD
Let us generate a 4-run maximin LHD in \([0, 1]^2\) for \(p=2\).
We will use the maximinLHS()
function of the lhs
library
library(lhs)
set.seed(1)
<- maximinLHS(4, 2)
D par(pty="s")
plot(D, xlim = c(0, 1), ylim = c(0, 1), pch = 16)
min(apply(D, 1, minimum_distance, D = D))
[1] 0.4907556
Note that the design has 4 distinct projections for each of the two factors.
As with maximin designs, the function maximinLHS()
is an iterative algorithm that does not guarantee the global optimum. Thus, we need to run the algorithm with different seeds to find a good MaximinLHD.
<- rep(0, 20)
min_dist for(s in 1:20)
{set.seed(s)
<- maximinLHS(n = 4, k=2)
D <- min(apply(D, 1, minimum_distance, D = D))
min_dist[s] }
The seed for the best MaximinLHD is:
which.max(min_dist)
[1] 6
Let us plot the best MaximinLHD.
set.seed(6)
<- maximinLHS(4, 2)
D par(pty="s")
plot(D, xlim = c(0, 1), ylim = c(0, 1), pch = 16)
min(apply(D, 1, minimum_distance, D = D))
[1] 0.537882
6.2.1 MmLHD: Weakness
MaximinLHD have good projections in a single dimension, but projection properties in \(2, 3, ā¦, pā1\) dimensions may not be good.
Let us compare the projections of a 20-run 2-dimensional design with that of a 20-run 10-dimensional design on a two-dimensional subspace.
The best 20-run MmLHD for \(p=2\) is with seed:
<- rep(0, 10000)
min_dist for(s in 1:10000)
{set.seed(s)
<- maximinLHS(n = 20, k=2)
D <- min(apply(D, 1, minimum_distance, D = D))
min_dist[s]
}which.max(min_dist)
[1] 7312
The best 20-run MmLHD for \(p = 10\) is with seed:
<- rep(0, 10000)
min_dist for(s in 1:10000)
{set.seed(s)
<- maximinLHS(n = 20, k=10)
D <- min(apply(D, 1, minimum_distance, D = D))
min_dist[s]
}which.max(min_dist)
[1] 2541
Let us visualize the projections of the 20 run 2-dimensional MmLHD, and the 20-run 10-dimensional MmLHD on a 2-dimensional subspace, where the subspace is the subspace of the first two predictors.
set.seed(7312)
<- maximinLHS(20, 2)
D par(mfrow = c(1,2))
par(pty="s")
plot(D, pch = 16, main = "MmLHD(n = 20, p = 2)")
set.seed(2541)
<- maximinLHS(20, 10)
D par(pty="s")
plot(D[,1:2], pch = 16, main = "MmLHD(n = 20, p = 10)")
We observe that the projections of the 10-dimensional MmLHD are not as space-filling in the 2-dimensional subspace as the projections of the 2-dimensional MmLHD.
This issue motivates us to present a design that maximizes space-filling of the design projections in all multidimensional sub-spaces.
6.3 MaxProLHD & MaxPro
Let us generate a 20-run MaxProLHD and MaxPro designs in \([0, 1]^{10}\) for \(p=10\).
We will use the MaxProLHD()
and MaxPro()
functions of the MaxPro
library
library(MaxPro)
Let us find the best MaxProLHD design among 100 different MaxProLHDs. The seed that corresponds to the least MaxPro criterion is:
<- rep(0, 100)
maxpro_criterion for(s in 1:100)
{set.seed(s)
<- MaxProLHD(n = 20, p = 10)
Dm <- Dm$measure
maxpro_criterion[s]
}which.min(maxpro_criterion)
[1] 26
We will use the seed to visualize the projections of a 20-run MaxProLHD and MaxPro design in 10 dimensions on a 2-dimensional subspace.
set.seed(26)
<- MaxProLHD(20, 10)$Design
D1 <- MaxPro(D1)$Design
D2 par(mfrow = c(1,2))
par(pty="s")
plot(D1[,1:2], pch = 16, main = "MaxProLHD(n = 20, p = 10)")
par(pty="s")
plot(D2[,1:2], pch = 16, main = "MaxPro(n = 20, p = 10)")
Let us compare the 2-dimensional projections of the 20-run MaximinLHD with MaxPro design.
par(mfrow = c(1,2))
par(pty="s")
plot(D[,1:2], pch = 16, main = "MmLHD(n = 20, p = 10)")
par(pty="s")
plot(D2[,1:2], pch = 16, main = "MaxPro(n = 20, p = 10)")
Note that the projections of the MaxPro design in the 2-dimensional subspace are more space-filling than that of the MaximinLHD, as expected.
6.4 Low descrepancy sequences
We will use the library randtoolbox
to generate low-discrepancy sequences.
library(randtoolbox)
Loading required package: rngWELL
This is randtoolbox. For an overview, type 'help("randtoolbox")'.
6.4.1 Sobol
Let us visualize a 100-run 2-dimensional Sobol sequence.
<- sobol(100, 2)
D_sobol par(pty="s")
plot(D_sobol)
As we can see, low descrepancy sequences minimize the difference between the proportion of points in a subregion and the proportion of the volume of that subregion. However, they do not necessarily have good projections. The same is true for the Halton sequence in the next section.
6.4.2 Halton
Let us visualize a 100-run 2-dimensional Halton sequence.
<- halton(100, 2)
D_halton par(pty="s")
plot(D_halton)
6.5 Comparison of time taken by each design
Let us compare the time taken by each design to generate a 50-run design in 50 dimensions.
=50
n<- matrix(0, 10, 6)
time_measure
# Sobol
<- proc.time()[3]
time_start for(i in 1:10)
{<- sobol(n, 5)
D 1] <- proc.time()[3] - time_start
time_measure[i,
}
# Halton
<- proc.time()[3]
time_start for(i in 1:10)
{<- halton(n, 5)
D 2] <- proc.time()[3] - time_start
time_measure[i,
}
# Maximin
<- proc.time()[3]
time_start for(i in 1:10)
{<- matrix(rep(0, 5),1,5)
D_initial <- maximin(n = n-1, p=5, Xorig = D_initial)$Xf
D 3] <- proc.time()[3] - time_start
time_measure[i,
}
# MaximinLHD
<- proc.time()[3]
time_start for(i in 1:10)
{<- maximinLHS(n = n, k=5)
D 4] <- proc.time()[3] - time_start
time_measure[i,
}
# MaxProLHD
<- proc.time()[3]
time_start for(i in 1:10)
{<- MaxProLHD(n = n, p = 5)
Dm 5] <- proc.time()[3] - time_start
time_measure[i,
}
# MaxPro
<- proc.time()[3]
time_start for(i in 1:10)
{<- MaxProLHD(n = n, p = 5)$Design
D1 <- MaxPro(D1)$Design
d2 6] <- proc.time()[3] - time_start
time_measure[i,
}boxplot(time_measure, names = c("Sobol", "Halton", "Maximin", "MmLHD", "MProLHD",
"MaxPro"), ylab = "Time (in seconds)")
Maximin design takes the longest time as there are no constraints in the design criterion.
Let us remove the maximin design from the plot to visualize the comparison of other designs more clearly.
boxplot(time_measure[,-3], names = c("Sobol", "Halton", "MmLHD", "MProLHD",
"MaxPro"), ylab = "Time (in seconds)")
MaxProLHD and MaxPro take longer time than MmLHD as the MaxPro criterion is optimized.
Let us remove MaxProLHD and MaxPro from the plot to compare the time taken by MmLHD with low descrepancy sequences.
boxplot(time_measure[,-c(3,5:6)], names = c("Sobol", "Halton", "MmLHD"),
ylab = "Time (in seconds)")
As expected, MmLHD takes more time as it includes optimization of the MmLHD criterion.