Linked (D)GP emulator construction

This function constructs a linked (D)GP emulator for a model chain or network.

Usage

lgp(struc, emulators = NULL, B = 10, activate = TRUE, verb = TRUE, id = NULL)

Arguments

struc

the structure of the linked emulator, which can take one of two forms:

a list containing L sub-lists, where L is the number of layers in a system of computer models. Each sub-list represents a layer and contains the (D)GP emulators (instances of S3 class gp or dgp) of the computer models in that layer. The sub-lists are placed in the list in the same order as the hierarchy of the computer model system. This option is deprecated and will be removed in the next release.
a data frame that defines the connection structure between emulators in the linked system, with the following columns:
- From_Emulator: the ID of the emulator providing the output. This ID must either match the id slot of the corresponding emulator object (produced by gp() or dgp()) in the emulators argument of lgp(), or be the special value "Global", which indicates the global inputs to the model chain or network. The id slot is either generated automatically by gp() or dgp(), specified manually via the id argument of these functions, or set with the set_id() function.
- To_Emulator: the ID of the emulator receiving the input, also matching the id slot in the corresponding emulator object.
- From_Output: a single integer specifying the output dimension of the From_Emulator that is being connected to the input dimension of the To_Emulator specified by To_Input. If From_Emulator is "Global", then From_Output indicates the dimension of the global input passed to the To_Emulator.
- To_Input: a single integer specifying the input dimension of the To_Emulator that is receiving the From_Output dimension from the From_Emulator.
Each row represents a single one-to-one connection between a specified output dimension of From_Emulator and a corresponding input dimension of To_Emulator. If multiple connections are required between two emulators, each connection should be specified in a separate row.

Note: When using the data frame option for struc, the emulators argument must be provided.
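
As an illustration of the data frame form, the sketch below lays out a small network in which one emulator receives several connections, one per row. The emulator IDs ("emulator1", "emulator2", "emulator3") and the network shape are hypothetical; only the column names come from the description above. The block uses base R only and does not call lgp().

```r
# Hypothetical network (IDs and shape are assumptions for illustration):
#   global inputs 1 and 2 -> emulator1 (2 inputs, 2 outputs)
#   global input 3        -> emulator2 (1 input, 1 output)
#   emulator1 outputs 1, 2 and emulator2 output 1 -> emulator3 (3 inputs)
struc <- data.frame(
  From_Emulator = c("Global", "Global", "Global",
                    "emulator1", "emulator1", "emulator2"),
  To_Emulator   = c("emulator1", "emulator1", "emulator2",
                    "emulator3", "emulator3", "emulator3"),
  From_Output   = c(1, 2, 3, 1, 2, 1),
  To_Input      = c(1, 2, 1, 1, 2, 3)
)

# Each row is one one-to-one connection; the two links from
# emulator1 to emulator3 therefore occupy two separate rows.
links_into_3 <- struc[struc$To_Emulator == "emulator3", ]
print(links_into_3)
```

A data frame like this would be passed to lgp() together with an emulators list whose id slots are "emulator1", "emulator2", and "emulator3".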

emulators

a list of emulator objects, each containing an id slot that uniquely identifies it within the linked system. The id slot in each emulator object must match the From_Emulator/To_Emulator columns in struc.

If the same emulator is used multiple times within the linked system, the list must contain distinct copies of that emulator, each with a unique ID stored in its id slot. Use the set_id() function to produce copies with different IDs so that each instance can be uniquely referenced.
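
The ID matching described above can be checked before calling lgp(). The sketch below is a hypothetical helper, check_struc_ids(), which is not part of dgpsi. It assumes each emulator object exposes its ID as a list element reachable via $id, and it uses plain lists as stand-ins for trained emulators so that it runs without the Python environment. The ID "emulator1_copy" stands for a second copy of the same emulator, as would be produced with set_id().

```r
# Hypothetical pre-flight check (not a dgpsi function): compare the IDs
# referenced in struc with the IDs carried by the emulator objects.
check_struc_ids <- function(struc, emulators) {
  # assumption: every emulator object stores its ID in `$id`
  ids <- vapply(emulators, function(e) as.character(e$id), character(1))
  used <- unique(c(as.character(struc$From_Emulator),
                   as.character(struc$To_Emulator)))
  used <- setdiff(used, "Global")  # "Global" is not an emulator ID
  list(
    duplicated = unique(ids[duplicated(ids)]),  # IDs shared by several objects
    missing    = setdiff(used, ids),            # in struc, not in emulators
    unused     = setdiff(ids, used)             # in emulators, not in struc
  )
}

# Stand-ins for trained emulators; real objects would come from gp()/dgp().
mock_emulators <- list(list(id = "emulator1"),
                       list(id = "emulator1_copy"),
                       list(id = "emulator2"))

struc <- data.frame(
  From_Emulator = c("Global", "Global", "emulator1", "emulator1_copy"),
  To_Emulator   = c("emulator1", "emulator1_copy", "emulator2", "emulator2"),
  From_Output   = c(1, 2, 1, 1),
  To_Input      = c(1, 1, 1, 2)
)

res <- check_struc_ids(struc, mock_emulators)

# Dropping the copy from the list leaves its ID unresolved in struc.
bad <- check_struc_ids(struc, mock_emulators[c(1, 3)])
```

Here all three components of res are empty, while bad$missing reports the ID that struc references but no emulator carries.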

B

the number of imputations used for prediction. Increase the value to refine the representation of imputation uncertainty. If the system consists only of GP emulators, B is set to 1 automatically. Defaults to 10.

activate

a bool indicating whether the initialized linked emulator should be activated:

If activate = FALSE, lgp() returns an inactive linked emulator, allowing inspection of its structure using summary().
If activate = TRUE, lgp() returns an active linked emulator, ready for prediction and validation using predict() and validate(), respectively.

Defaults to TRUE. This argument is only applicable when struc is specified as a data frame.

verb

a bool indicating whether trace information on the construction of the linked (D)GP emulator should be printed during the function call. Defaults to TRUE. This argument is only applicable when struc is specified as a data frame.

id

an ID to be assigned to the linked (D)GP emulator. If an ID is not provided (i.e., id = NULL), a UUID (Universally Unique Identifier) will be automatically generated and assigned to the emulator. Defaults to NULL.

Value

An S3 class named lgp that contains four slots:

id: a number or character string assigned through the id argument.
constructor_obj: a list of 'python' objects that stores the information of the constructed linked emulator.
emulator_obj: a 'python' object that stores the information for predictions from the linked emulator.
specs: a list that contains
1. seed: the random seed generated to produce the imputations. It is stored so that a linked (D)GP emulator saved by write() with light = TRUE can be reproduced when it is loaded back into R by read().
2. B: the number of imputations used to generate the linked (D)GP emulator.
If struc is a data frame, specs also includes:
1. metadata: a data frame providing configuration details for each emulator in the linked system, with the following columns:
  - Emulator: the ID of an emulator.
  - Layer: the layer in the linked system where the emulator is positioned. A lower Layer number indicates a position closer to the input, with layer numbering increasing as you move away from the input.
  - Pos_in_Layer: the position of the emulator within its layer. A lower Pos_in_Layer number indicates a position higher up in that layer.
  - Total_Input_Dims: the total number of input dimensions of the emulator.
  - Total_Output_Dims: the total number of output dimensions of the emulator.
2. struc: the linked system structure, as supplied by struc.

The returned lgp object can be used by

predict() for linked (D)GP predictions.
validate() for out-of-sample (OOS) validation.
plot() for validation plots.
summary() to summarize the constructed linked (D)GP emulator.
write() to save the linked (D)GP emulator to a .pkl file.

Details

See further examples and tutorials at https://mingdeyu.github.io/dgpsi-R/dev/.

Examples

if (FALSE) { # \dontrun{

# load the package and the Python env
library(dgpsi)

# model 1
f1 <- function(x) {
  (sin(7.5*x)+1)/2
}
# model 2
f2 <- function(x) {
  2/3*sin(2*(2*x - 1))+4/3*exp(-30*(2*(2*x-1))^2)-1/3
}
# linked model
f12 <- function(x) {
  f2(f1(x))
}

# training data for model 1
X1 <- seq(0, 1, length.out = 9)
Y1 <- sapply(X1, f1)
# training data for model 2
X2 <- seq(0, 1, length.out = 13)
Y2 <- sapply(X2, f2)

# emulation of model 1
m1 <- gp(X1, Y1, name = "matern2.5", id = "emulator1")
# emulation of model 2
m2 <- dgp(X2, Y2, depth = 2, name = "matern2.5", id = "emulator2")

struc <- data.frame(From_Emulator = c("Global", "emulator1"),
                    To_Emulator = c("emulator1", "emulator2"),
                    From_Output = c(1, 1),
                    To_Input = c(1, 1))
emulators <- list(m1, m2)

# construct the linked emulator for visual inspection
m_link <- lgp(struc, emulators, activate = FALSE)

# visual inspection
summary(m_link)

# build the linked emulator for prediction
m_link <- lgp(struc, emulators, activate = TRUE)
test_x <- seq(0, 1, length.out = 300)
m_link <- predict(m_link, x = test_x)

# OOS validation
validate_x <- sample(test_x, 20)
validate_y <- sapply(validate_x, f12)
plot(m_link, validate_x, validate_y, style = 2)

# write and read the constructed linked emulator
write(m_link, 'linked_emulator')
m_link <- read('linked_emulator')
} # }