# Locate the next design point for a (D)GP emulator or a bundle of (D)GP emulators using MICE

Source:`R/mice.R`

`mice.Rd`

This function searches from a candidate set to locate the next design point(s) to be added to a (D)GP emulator or a bundle of (D)GP emulators using the Mutual Information for Computer Experiments (MICE), see the reference below.

## Usage

```
mice(object, x_cand, ...)
# S3 method for gp
mice(object, x_cand, batch_size = 1, nugget_s = 1e-06, workers = 1, ...)
# S3 method for dgp
mice(
object,
x_cand,
batch_size = 1,
nugget_s = 1e-06,
workers = 1,
threading = FALSE,
aggregate = NULL,
...
)
# S3 method for bundle
mice(
object,
x_cand,
batch_size = 1,
nugget_s = 1e-06,
workers = 1,
threading = FALSE,
aggregate = NULL,
...
)
```

## Arguments

- object
can be one of the following:

the S3 class

`gp`

.the S3 class

`dgp`

.the S3 class

`bundle`

.

- x_cand
a matrix (with each row being a design point and column being an input dimension) that gives a candidate set from which the next design point(s) are determined. If

`object`

is an instance of the`bundle`

class,`x_cand`

could also be a list with the length equal to the number of emulators contained in the`object`

. Each slot in`x_cand`

is a matrix that gives a candidate set for each emulator included in the bundle. See*Note*section below for further information.- ...
any arguments (with names different from those of arguments used in

`mice()`

) that are used by`aggregate`

can be passed here.- batch_size
an integer that gives the number of design points to be chosen. Defaults to

`1`

.- nugget_s
the value of the smoothing nugget term used by MICE. Defaults to

`1e-6`

.- workers
the number of workers/cores to be used for the criterion calculation. If set to

`NULL`

, the number of workers is set to`(max physical cores available - 1)`

. Defaults to`1`

.- threading
a bool indicating whether to use the multi-threading to accelerate the criterion calculation for a DGP emulator. Turning this option on could improve the speed of criterion calculations when the DGP emulator is built with a moderately large number of training data points and the Matérn-2.5 kernel.

- aggregate
an R function that aggregates scores of the MICE across different output dimensions (if

`object`

is an instance of the`dgp`

class) or across different emulators (if`object`

is an instance of the`bundle`

class). The function should be specified in the following basic form:the first argument is a matrix representing scores. The rows of the matrix correspond to different design points. The number of columns of the matrix equals to:

the emulator output dimension if

`object`

is an instance of the`dgp`

class; orthe number of emulators contained in

`object`

if`object`

is an instance of the`bundle`

class.

the output should be a vector that gives aggregations of scores at different design points.

Set to

`NULL`

to disable the aggregation. Defaults to`NULL`

.

## Value

If

`object`

is an instance of the`gp`

class, a vector is returned with the length equal to`batch_size`

, giving the positions (i.e., row numbers) of next design points from`x_cand`

.If

`object`

is an instance of the`dgp`

class, a matrix is returned with row number equal to`batch_size`

and column number equal to one (if`aggregate`

is not`NULL`

) or the output dimension (if`aggregate`

is`NULL`

), giving positions (i.e., row numbers) of next design points from`x_cand`

to be added to the DGP emulator across different outputs. If`object`

is a DGP emulator with either`Hetero`

or`NegBin`

likelihood layer, the returned matrix has two columns with the first column giving positions of next design points from`x_cand`

that correspond to the mean parameter of the normal or negative Binomial distribution, and the second column giving positions of next design points from`x_cand`

that correspond to the variance parameter of the normal distribution or the dispersion parameter of the negative Binomial distribution.If

`object`

is an instance of the`bundle`

class, a matrix is returned with row number equal to`batch_size`

and column number equal to the number of emulators in the bundle, giving positions (i.e., row numbers) of next design points from`x_cand`

to be added to individual emulators.

## Details

See further examples and tutorials at https://mingdeyu.github.io/dgpsi-R/.

## Note

The column order of the first argument of

`aggregate`

must be consistent with the order of emulator output dimensions (if`object`

is an instance of the`dgp`

class), or the order of emulators placed in`object`

if`object`

is an instance of the`bundle`

class;If

`x_cand`

is supplied as a list when`object`

is an instance of`bundle`

class and a`aggregate`

function is provided, the matrices in`x_cand`

must have common rows (i.e., the candidate sets of emulators in the bundle have common input locations) so the`aggregate`

function can be applied.Any R vector detected in

`x_cand`

will be treated as a column vector and automatically converted into a single-column R matrix.

## References

Beck, J., & Guillas, S. (2016). Sequential design with mutual information for computer experiments (MICE): emulation of a tsunami model.
*SIAM/ASA Journal on Uncertainty Quantification*, **4(1)**, 739-766.

## Examples

```
if (FALSE) {
# load packages and the Python env
library(lhs)
library(dgpsi)
# construct a 1D non-stationary function
f <- function(x) {
sin(30*((2*x-1)/2-0.4)^5)*cos(20*((2*x-1)/2-0.4))
}
# generate the initial design
X <- maximinLHS(10,1)
Y <- f(X)
# training a 2-layered DGP emulator with the global connection off
m <- dgp(X, Y, connect = F)
# generate a candidate set
x_cand <- maximinLHS(200,1)
# locate the next design point using MICE
next_point <- mice(m, x_cand = x_cand)
X_new <- x_cand[next_point,,drop = F]
# obtain the corresponding output at the located design point
Y_new <- f(X_new)
# combine the new input-output pair to the existing data
X <- rbind(X, X_new)
Y <- rbind(Y, Y_new)
# update the DGP emulator with the new input and output data and refit with 500 training iterations
m <- update(m, X, Y, refit = TRUE, N = 500)
# plot the LOO validation
plot(m)
}
```