# Locate the next design point for a (D)GP emulator or a bundle of (D)GP emulators using PEI

Source:`R/pei.R`

`pei.Rd`

This function searches from a candidate set to locate the next design point(s) to be added to a (D)GP emulator or a bundle of (D)GP emulators using the Pseudo Expected Improvement (PEI), see the reference below.

## Usage

```
pei(object, x_cand, ...)
# S3 method for gp
pei(object, x_cand, pseudo_points = NULL, batch_size = 1, ...)
# S3 method for dgp
pei(
object,
x_cand,
pseudo_points = NULL,
batch_size = 1,
workers = 1,
threading = FALSE,
aggregate = NULL,
...
)
# S3 method for bundle
pei(
object,
x_cand,
pseudo_points = NULL,
batch_size = 1,
workers = 1,
threading = FALSE,
aggregate = NULL,
...
)
```

## Arguments

- object
can be one of the following:

the S3 class

`gp`

.the S3 class

`dgp`

.the S3 class

`bundle`

.

- x_cand
a matrix (with each row being a design point and column being an input dimension) that gives a candidate set from which the next design point(s) are determined. If

`object`

is an instance of the`bundle`

class,`x_cand`

could also be a list with the length equal to the number of emulators contained in the`object`

. Each slot in`x_cand`

is a matrix that gives a candidate set for each emulator included in the bundle. See*Note*section below for further information.- ...
any arguments (with names different from those of arguments used in

`pei()`

) that are used by`aggregate`

or`gp()`

(for emulating the ES-LOO errors) can be passed here.- pseudo_points
an optional matrix (with columns being input dimensions) that gives the pseudo input points for PEI calculations. See the reference below for further details about the pseudo points. When

`object`

is an instance of the`bundle`

class,`pseudo_points`

can also be a list with the length equal to the number of emulators in the bundle. Each element in the list is a matrix that gives the the pseudo input points for the corresponding emulator in the bundle. Defaults to`NULL`

. When`pei()`

is used in`design()`

,`pseudo_points`

will be automatically generated by`design()`

.- batch_size
an integer that gives the number of design points to be chosen. Defaults to

`1`

.- workers
the number of workers/cores to be used for the criterion calculation. If set to

`NULL`

, the number of workers is set to`(max physical cores available - 1)`

. Defaults to`1`

.- threading
a bool indicating whether to use the multi-threading to accelerate the criterion calculation for a DGP emulator. Turning this option on could improve the speed of criterion calculations when the DGP emulator is built with a moderately large number of training data points and the Matérn-2.5 kernel.

- aggregate
an R function that aggregates scores of the PEI across different output dimensions (if

`object`

is an instance of the`dgp`

class) or across different emulators (if`object`

is an instance of the`bundle`

class). The function should be specified in the following basic form:the first argument is a matrix representing scores. The rows of the matrix correspond to different design points. The number of columns of the matrix equals to:

the emulator output dimension if

`object`

is an instance of the`dgp`

class; orthe number of emulators contained in

`object`

if`object`

is an instance of the`bundle`

class.

the output should be a vector that gives aggregations of scores at different design points.

Set to

`NULL`

to disable the aggregation. Defaults to`NULL`

.

## Value

If

`object`

is an instance of the`gp`

class, a vector is returned with the length equal to`batch_size`

, giving the positions (i.e., row numbers) of next design points from`x_cand`

.If

`object`

is an instance of the`dgp`

class, a matrix is returned with row number equal to`batch_size`

and column number equal to one (if`aggregate`

is not`NULL`

) or the output dimension (if`aggregate`

is`NULL`

), giving positions (i.e., row numbers) of next design points from`x_cand`

to be added to the DGP emulator across different outputs.If

`object`

is an instance of the`bundle`

class, a matrix is returned with row number equal to`batch_size`

and column number equal to the number of emulators in the bundle, giving positions (i.e., row numbers) of next design points from`x_cand`

to be added to individual emulators.

## Details

See further examples and tutorials at https://mingdeyu.github.io/dgpsi-R/.

## Note

The column order of the first argument of

`aggregate`

must be consistent with the order of emulator output dimensions (if`object`

is an instance of the`dgp`

class), or the order of emulators placed in`object`

if`object`

is an instance of the`bundle`

class;If

`x_cand`

is supplied as a list when`object`

is an instance of`bundle`

class and a`aggregate`

function is provided, the matrices in`x_cand`

must have common rows (i.e., the candidate sets of emulators in the bundle have common input locations) so the`aggregate`

function can be applied.The function is only applicable to DGP emulators without likelihood layers.

Any R vector detected in

`x_cand`

and`pseudo_points`

will be treated as a column vector and automatically converted into a single-column R matrix.

## References

Mohammadi, H., Challenor, P., Williamson, D., & Goodfellow, M. (2022). Cross-validation-based adaptive sampling for Gaussian process models. *SIAM/ASA Journal on Uncertainty Quantification*, **10(1)**, 294-316.

## Examples

```
if (FALSE) {
# load packages and the Python env
library(lhs)
library(dgpsi)
# construct a 1D non-stationary function
f <- function(x) {
sin(30*((2*x-1)/2-0.4)^5)*cos(20*((2*x-1)/2-0.4))
}
# generate the initial design
X <- maximinLHS(10,1)
Y <- f(X)
# training a 2-layered DGP emulator with the global connection off
m <- dgp(X, Y, connect = F)
# generate a candidate set
x_cand <- maximinLHS(200,1)
# locate the next design point using PEI
next_point <- pei(m, x_cand = x_cand)
X_new <- x_cand[next_point,,drop = F]
# obtain the corresponding output at the located design point
Y_new <- f(X_new)
# combine the new input-output pair to the existing data
X <- rbind(X, X_new)
Y <- rbind(Y, Y_new)
# update the DGP emulator with the new input and output data and refit with 500 training iterations
m <- update(m, X, Y, refit = TRUE, N = 500)
# plot the LOO validation
plot(m)
}
```