# Locate the next design point for a (D)GP emulator or a bundle of (D)GP emulators using ALM

Source: `R/alm.R`

This function searches a candidate set to locate the next design point(s) to be added to a (D)GP emulator or a bundle of (D)GP emulators using Active Learning MacKay (ALM); see the reference below.

## Usage

```
alm(object, x_cand, ...)
# S3 method for gp
alm(object, x_cand, batch_size = 1, M = 50, workers = 1, ...)
# S3 method for dgp
alm(object, x_cand, batch_size = 1, M = 50, workers = 1, aggregate = NULL, ...)
# S3 method for bundle
alm(object, x_cand, batch_size = 1, M = 50, workers = 1, aggregate = NULL, ...)
```

## Arguments

- object
  can be one of the following:
  - the S3 class `gp`.
  - the S3 class `dgp`.
  - the S3 class `bundle`.

- x_cand
  a matrix (with each row being a design point and each column being an input dimension) that gives a candidate set from which the next design point(s) are determined. If `object` is an instance of the `bundle` class, `x_cand` can also be a list with length equal to the number of emulators contained in `object`. Each slot in `x_cand` is a matrix that gives a candidate set for the corresponding emulator in the bundle. See the *Note* section below for further information.

- ...
  any arguments (with names different from those of arguments used in `alm()`) that are used by `aggregate` can be passed here.

- batch_size
  an integer that gives the number of design points to be chosen. Defaults to `1`.

- M
  the size of the conditioning set for the Vecchia approximation in the criterion calculation. This argument is only used if the emulator `object` was constructed under the Vecchia approximation. Defaults to `50`.

- workers
  the number of processes to be used for the criterion calculation. If set to `NULL`, the number of processes is set to `max physical cores available %/% 2`. Defaults to `1`.

- aggregate
  an R function that aggregates scores of the ALM across different output dimensions (if `object` is an instance of the `dgp` class) or across different emulators (if `object` is an instance of the `bundle` class). The function should be specified in the following basic form:
  - the first argument is a matrix representing scores. The rows of the matrix correspond to different design points. The number of columns of the matrix equals:
    - the emulator output dimension if `object` is an instance of the `dgp` class; or
    - the number of emulators contained in `object` if `object` is an instance of the `bundle` class.
  - the output should be a vector that gives aggregations of scores at different design points.

  Set to `NULL` to disable the aggregation. Defaults to `NULL`.
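
As an illustration of the form described for `aggregate`, the following is a minimal sketch of a user-written aggregator. The function name `weighted_sum` and its `weights` argument are hypothetical and not part of the package; the score matrix is a toy stand-in for the ALM scores that `alm()` would pass in.

```r
# Hypothetical aggregator: weighted sum of ALM scores across columns.
# `scores` follows the documented form: rows are candidate design points,
# columns are output dimensions (dgp) or emulators (bundle).
# `weights` is an illustrative extra argument, not a package argument.
weighted_sum <- function(scores, weights = NULL) {
  scores <- as.matrix(scores)
  if (is.null(weights)) weights <- rep(1, ncol(scores))
  stopifnot(length(weights) == ncol(scores))
  # return one aggregated score per candidate point (row)
  as.vector(scores %*% weights)
}

# toy score matrix: 4 candidate points, 2 outputs (filled column by column)
scores <- matrix(c(0.1, 0.4, 0.3, 0.2,
                   0.5, 0.1, 0.3, 0.6), ncol = 2)
agg <- weighted_sum(scores, weights = c(1, 2))
best <- which.max(agg)  # position of the candidate with the largest aggregate
```

Assuming the pass-through of extra arguments via `...` works as described above, a call might then look like `alm(m, x_cand, aggregate = weighted_sum, weights = c(1, 2))`, where `weights` is forwarded because its name does not clash with any argument of `alm()`.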

## Value

- If `object` is an instance of the `gp` class, a vector is returned with the length equal to `batch_size`, giving the positions (i.e., row numbers) of next design points from `x_cand`.

- If `object` is an instance of the `dgp` class, a matrix is returned with row number equal to `batch_size` and column number equal to one (if `aggregate` is not `NULL`) or the output dimension (if `aggregate` is `NULL`), giving positions (i.e., row numbers) of next design points from `x_cand` to be added to the DGP emulator across different outputs. If `object` is a DGP emulator with either `Hetero` or `NegBin` likelihood layer, the returned matrix has two columns with the first column giving positions of next design points from `x_cand` that correspond to the mean parameter of the normal or negative Binomial distribution, and the second column giving positions of next design points from `x_cand` that correspond to the variance parameter of the normal distribution or the dispersion parameter of the negative Binomial distribution.

- If `object` is an instance of the `bundle` class, a matrix is returned with row number equal to `batch_size` and column number equal to the number of emulators in the bundle, giving positions (i.e., row numbers) of next design points from `x_cand` to be added to individual emulators.
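
Because the returned values are row positions rather than input locations, they need to be mapped back onto `x_cand`. The sketch below shows one possible way to do this in base R. The position vector and matrix are hand-written stand-ins for what `alm()` would return, not actual output, and taking the union of positions across outputs is an assumption about how one might want to use them.

```r
# toy candidate set: 5 candidate points in 2 input dimensions
x_cand <- matrix(seq(0.1, 1, length.out = 10), ncol = 2)

# stand-in for the vector returned for a `gp` object with batch_size = 2
pos_gp <- c(4, 1)
# drop = FALSE keeps the result a matrix even when one row is selected
X_new_gp <- x_cand[pos_gp, , drop = FALSE]

# stand-in for the matrix returned for a `dgp` object with two outputs,
# batch_size = 2 and aggregate = NULL: one column of positions per output
pos_dgp <- matrix(c(4, 1,
                    4, 3), nrow = 2)
# remove repeated positions before extracting the rows
pos_unique <- unique(as.vector(pos_dgp))
X_new_dgp <- x_cand[pos_unique, , drop = FALSE]
```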

## Details

See further examples and tutorials at https://mingdeyu.github.io/dgpsi-R/.
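
For intuition, ALM (MacKay, 1992) selects the input at which the model's predictive variance is largest. The sketch below illustrates that idea on a toy variance profile. It does not reproduce how `alm()` computes scores for (D)GP emulators, and `pred_var` is a made-up stand-in for an emulator's predictive variance.

```r
# toy 1D candidate set
x_cand <- matrix(seq(0, 1, length.out = 11), ncol = 1)

# hypothetical predictive variance: zero at the "training" inputs 0 and 1,
# largest in the middle of the input range
pred_var <- as.vector(x_cand * (1 - x_cand))

# ALM idea: choose the candidate with the largest predictive variance
next_pos <- which.max(pred_var)
x_next <- x_cand[next_pos, , drop = FALSE]
```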

## Note

- The column order of the first argument of `aggregate` must be consistent with the order of emulator output dimensions (if `object` is an instance of the `dgp` class), or the order of emulators placed in `object` (if `object` is an instance of the `bundle` class).

- If `x_cand` is supplied as a list when `object` is an instance of the `bundle` class and an `aggregate` function is provided, the matrices in `x_cand` must have common rows (i.e., the candidate sets of emulators in the bundle have common input locations) so that the `aggregate` function can be applied.

- Any R vector detected in `x_cand` will be treated as a column vector and automatically converted into a single-column R matrix.
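
The last two points can be checked before calling `alm()`. The following sketch uses only base R. The helpers `row_keys` and `n_common_rows` are hypothetical names introduced here, and comparing rows through their printed form is a simplification that suits an illustration rather than a robust numerical comparison.

```r
# a plain vector of 1D candidate points, written explicitly as the
# single-column matrix it would be converted to
v <- c(0.2, 0.5, 0.8)
x1 <- matrix(v, ncol = 1)

# a second candidate set that shares two input locations with the first
x2 <- matrix(c(0.5, 0.8, 0.9), ncol = 1)

# hypothetical helper: one text key per row of a matrix
row_keys <- function(x) apply(x, 1, paste, collapse = ",")

# hypothetical helper: number of rows present in every matrix of a list
n_common_rows <- function(x_list) {
  length(Reduce(intersect, lapply(x_list, row_keys)))
}

n_shared <- n_common_rows(list(x1, x2))
```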

## References

MacKay, D. J. (1992). Information-based objective functions for active data selection. *Neural Computation*, **4(4)**, 590-604.

## Examples

```
if (FALSE) {
  # load packages and the Python env
  library(lhs)
  library(dgpsi)

  # construct a 1D non-stationary function
  f <- function(x) {
    sin(30 * ((2 * x - 1) / 2 - 0.4)^5) * cos(20 * ((2 * x - 1) / 2 - 0.4))
  }

  # generate the initial design
  X <- maximinLHS(10, 1)
  Y <- f(X)

  # train a 2-layered DGP emulator with the global connection off
  m <- dgp(X, Y, connect = FALSE)

  # generate a candidate set
  x_cand <- maximinLHS(200, 1)

  # locate the next design point using ALM
  next_point <- alm(m, x_cand = x_cand)
  X_new <- x_cand[next_point, , drop = FALSE]

  # obtain the corresponding output at the located design point
  Y_new <- f(X_new)

  # combine the new input-output pair with the existing data
  X <- rbind(X, X_new)
  Y <- rbind(Y, Y_new)

  # update the DGP emulator with the new input and output data and refit
  m <- update(m, X, Y, refit = TRUE)

  # plot the LOO validation
  plot(m)
}
```