
Locate the next design point for a (D)GP emulator or a bundle of (D)GP emulators using MICE
Source: R/mice.R
This function searches a candidate set to locate the next design point(s) to be added to a (D)GP emulator or a bundle of (D)GP emulators using the Mutual Information for Computer Experiments (MICE) criterion; see the reference below.
Usage
mice(object, ...)
# S3 method for class 'gp'
mice(
  object,
  x_cand = NULL,
  n_cand = 200,
  batch_size = 1,
  M = 50,
  nugget_s = 1e-06,
  workers = 1,
  limits = NULL,
  int = FALSE,
  ...
)
# S3 method for class 'dgp'
mice(
  object,
  x_cand = NULL,
  n_cand = 200,
  batch_size = 1,
  M = 50,
  nugget_s = 1e-06,
  workers = 1,
  limits = NULL,
  int = FALSE,
  aggregate = NULL,
  ...
)
# S3 method for class 'bundle'
mice(
  object,
  x_cand = NULL,
  n_cand = 200,
  batch_size = 1,
  M = 50,
  nugget_s = 1e-06,
  workers = 1,
  limits = NULL,
  int = FALSE,
  aggregate = NULL,
  ...
)
Arguments
- object
- can be one of the following:
  - the S3 class gp.
  - the S3 class dgp.
  - the S3 class bundle.
- ...
- any arguments used by aggregate, with names different from those of the arguments of mice(), can be passed here.
- x_cand
- a matrix (with each row being a design point and each column being an input dimension) that gives a candidate set from which the next design point(s) are determined. If object is an instance of the bundle class and aggregate is not supplied, x_cand can also be a list. The list must have a length equal to the number of emulators in object, with each element being a matrix representing the candidate set for the corresponding emulator in the bundle. Defaults to NULL.
- n_cand
- an integer specifying the size of the candidate set to be generated for selecting the next design point(s). This argument is used only when x_cand is NULL. Defaults to 200.
- batch_size
- an integer that gives the number of design points to be chosen. Defaults to 1.
- M
- the size of the conditioning set for the Vecchia approximation in the criterion calculation. This argument is only used if the emulator object was constructed under the Vecchia approximation. Defaults to 50.
- nugget_s
- the value of the smoothing nugget term used by MICE. Defaults to 1e-6.
- workers
- the number of processes to be used for the criterion calculation. If set to NULL, the number of processes is set to max physical cores available %/% 2, i.e., half the number of available physical cores under integer division. Defaults to 1.
- limits
- a two-column matrix that gives the ranges of each input dimension, or a vector of length two if there is only one input dimension. If a vector is provided, it will be converted to a one-row, two-column matrix. The rows of the matrix correspond to input dimensions, and its first and second columns correspond to the minimum and maximum values of the input dimensions. This argument is only used when x_cand = NULL. Defaults to NULL.
- int
- a bool or a vector of bools that indicates whether an input dimension is of integer type. If a single bool is given, it will be applied to all input dimensions. If a vector is provided, it should have a length equal to the number of input dimensions and will be applied to individual input dimensions. This argument is only used when x_cand = NULL. Defaults to FALSE.
- aggregate
- an R function that aggregates scores of the MICE across different output dimensions (if object is an instance of the dgp class) or across different emulators (if object is an instance of the bundle class). The function should be specified in the following basic form:
  - the first argument is a matrix representing scores. The rows of the matrix correspond to different design points. The number of columns of the matrix equals:
    - the emulator output dimension if object is an instance of the dgp class; or
    - the number of emulators contained in object if object is an instance of the bundle class.
  - the output should be a vector that gives aggregate scores at different design points.
  Set to NULL to disable aggregation. Defaults to NULL.
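As a minimal sketch of the basic form described for aggregate, the following base-R function takes a score matrix and returns one aggregate score per design point. The function name agg_mean and the score values are made up for illustration and are not part of dgpsi.

```r
# Hypothetical aggregation function: average the MICE scores across columns
# (output dimensions for a dgp object, emulators for a bundle object).
agg_mean <- function(scores, ...) {
  # scores: matrix with one row per candidate design point
  rowMeans(scores)
}

# Made-up scores for 4 candidate points and 2 columns (illustration only)
scores <- matrix(c(0.1, 0.4, 0.3, 0.2,
                   0.5, 0.2, 0.1, 0.6), ncol = 2)

# One aggregate score per candidate point
agg <- agg_mean(scores)
```

Under these assumptions, such a function would be supplied as mice(object, x_cand = x_cand, aggregate = agg_mean).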
Value
- If x_cand is not NULL:
  - When object is an instance of the gp class, a vector of length batch_size is returned, containing the positions (row numbers) of the next design points from x_cand.
  - When object is an instance of the dgp class, a vector of length batch_size * D is returned, containing the positions (row numbers) of the next design points from x_cand to be added to the DGP emulator.
    - D is the number of output dimensions of the DGP emulator if no likelihood layer is included.
    - For a DGP emulator with a Hetero or NegBin likelihood layer, D = 2.
    - For a DGP emulator with a Categorical likelihood layer, D = 1 for binary output or D = K for multi-class output with K classes.
  - When object is an instance of the bundle class, a matrix is returned with batch_size rows and a column for each emulator in the bundle, containing the positions (row numbers) of the next design points from x_cand for individual emulators.
- If x_cand is NULL:
  - When object is an instance of the gp class, a matrix with batch_size rows is returned, giving the next design points to be evaluated.
  - When object is an instance of the dgp class, a matrix with batch_size * D rows is returned, where:
    - D is the number of output dimensions of the DGP emulator if no likelihood layer is included.
    - For a DGP emulator with a Hetero or NegBin likelihood layer, D = 2.
    - For a DGP emulator with a Categorical likelihood layer, D = 1 for binary output or D = K for multi-class output with K classes.
  - When object is an instance of the bundle class, a list is returned with a length equal to the number of emulators in the bundle. Each element of the list is a matrix with batch_size rows, where each row represents a design point to be added to the corresponding emulator.
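When x_cand is supplied, the returned positions index rows of x_cand. The sketch below shows one way to turn a bundle-style position matrix into per-emulator design points. The matrix pos is a made-up stand-in for a returned value (3 emulators, batch_size = 2), not actual output from mice().

```r
# Made-up candidate set: 6 candidate points in 2 input dimensions
x_cand <- matrix(seq(0.05, 0.6, length.out = 12), ncol = 2)

# Made-up stand-in for the matrix returned for a bundle of 3 emulators with
# batch_size = 2: one column per emulator, entries are row numbers of x_cand
pos <- matrix(c(1, 4,
                2, 6,
                3, 5), nrow = 2)

# Extract the design points for each emulator; drop = FALSE keeps the
# result a matrix even when batch_size = 1
new_points <- lapply(seq_len(ncol(pos)), function(k) {
  x_cand[pos[, k], , drop = FALSE]
})
```

Each element of new_points then holds the batch_size design points selected for the corresponding emulator.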
 
Details
See further examples and tutorials at https://mingdeyu.github.io/dgpsi-R/.
Note
The first column of the matrix supplied to the first argument of aggregate must correspond to the first output dimension of the DGP emulator
if object is an instance of the dgp class, and so on for subsequent columns and dimensions. If object is an instance of the bundle class,
the first column must correspond to the first emulator in the bundle, and so on for subsequent columns and emulators.
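Because the column order is fixed, an aggregation function can safely weight individual outputs or emulators by position. The following sketch assumes a hypothetical extra argument named weights, which would be passed through ... in mice(). Both agg_weighted and weights are illustrative names, and the scores are made up.

```r
# Hypothetical weighted aggregation: weights[j] applies to column j of scores,
# i.e., to the j-th output dimension (dgp) or the j-th emulator (bundle).
agg_weighted <- function(scores, weights) {
  stopifnot(length(weights) == ncol(scores))
  # Normalise the weights and take the weighted row sums
  as.vector(scores %*% (weights / sum(weights)))
}

# Made-up scores for 2 candidate points and 2 columns (illustration only)
scores <- matrix(c(0.1, 0.4,
                   0.5, 0.2), ncol = 2)

# Weight the first column three times as heavily as the second
agg <- agg_weighted(scores, weights = c(3, 1))
```

Under these assumptions, the call would take the form mice(object, x_cand = x_cand, aggregate = agg_weighted, weights = c(3, 1)).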
References
Beck, J., & Guillas, S. (2016). Sequential design with mutual information for computer experiments (MICE): emulation of a tsunami model. SIAM/ASA Journal on Uncertainty Quantification, 4(1), 739-766.
Examples
if (FALSE) { # \dontrun{
# load packages and the Python env
library(lhs)
library(dgpsi)
# construct a 1D non-stationary function
f <- function(x) {
 sin(30*((2*x-1)/2-0.4)^5)*cos(20*((2*x-1)/2-0.4))
}
# generate the initial design
X <- maximinLHS(10,1)
Y <- f(X)
# train a 2-layered DGP emulator with the global connection off
m <- dgp(X, Y, connect = FALSE)
# generate a candidate set
x_cand <- maximinLHS(200,1)
# locate the next design point using MICE
next_point <- mice(m, x_cand = x_cand)
X_new <- x_cand[next_point, , drop = FALSE]
# obtain the corresponding output at the located design point
Y_new <- f(X_new)
# combine the new input-output pair to the existing data
X <- rbind(X, X_new)
Y <- rbind(Y, Y_new)
# update the DGP emulator with the new input and output data and refit
m <- update(m, X, Y, refit = TRUE)
# plot the LOO validation
plot(m)
} # }
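When x_cand is left as NULL, the candidate set is generated from limits and int. As a rough sketch of how these two arguments could be laid out for a hypothetical three-dimensional input (the ranges below are made up for illustration):

```r
# Ranges for three input dimensions: one row per dimension,
# first column is the minimum and second column is the maximum
limits <- rbind(c(0, 1), c(-5, 5), c(1, 10))

# Treat only the third input dimension as integer-valued
int <- c(FALSE, FALSE, TRUE)

# With a single input dimension, a length-two vector c(0, 1) is described as
# being converted to this one-row, two-column matrix
limits_1d <- matrix(c(0, 1), nrow = 1)
```

These would then be supplied as mice(object, n_cand = 500, limits = limits, int = int), where the value of n_cand is an arbitrary choice.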