This function implements single-core or multi-core predictions (with or without multi-threading) from GP, DGP, or linked (D)GP emulators.
Usage
# S3 method for class 'dgp'
predict(
object,
x,
method = NULL,
mode = "label",
full_layer = FALSE,
sample_size = 50,
M = 50,
cores = 1,
chunks = NULL,
...
)
# S3 method for class 'lgp'
predict(
object,
x,
method = NULL,
full_layer = FALSE,
sample_size = 50,
M = 50,
cores = 1,
chunks = NULL,
...
)
# S3 method for class 'gp'
predict(
object,
x,
method = NULL,
sample_size = 50,
M = 50,
cores = 1,
chunks = NULL,
...
)
Arguments
- object
an instance of the gp, dgp, or lgp class.
- x
the testing input data:
if object is an instance of the gp or dgp class, x is a matrix where each row is a testing input data point and each column is an input dimension.
if object is an instance of the lgp class created by lgp() without specifying the argument struc in data frame form, x can be either a matrix or a list:
if x is a matrix, it is the global testing input data that feed into the emulators in the first layer of a system. The rows of x represent different input data points and the columns represent input dimensions across all emulators in the first layer of the system. In this case, it is assumed that the only global input to the system is the input to the emulators in the first layer and there is no global input to emulators in other layers.
if x is a list, it should have L (the number of layers in an emulator system) elements. The first element is a matrix that represents the global testing input data that feed into the emulators in the first layer of the system. The remaining L-1 elements are L-1 sub-lists, each of which contains as many matrices as there are emulators in the corresponding layer (rows being testing input data points and columns being input dimensions). These matrices represent the global testing input data to the emulators in that layer. The matrices must be placed in the sub-lists in the same order as their corresponding emulators are placed in the struc argument of lgp(). If there is no global input data to a certain emulator, set NULL in the corresponding sub-list of x.
This option for linked (D)GP emulators is deprecated and will be removed in the next release.
If object is an instance of the lgp class created by lgp() with the argument struc in data frame form, x must be a matrix representing the global input, where each row corresponds to a test data point and each column represents a global input dimension. The column indices in x must align with the indices specified in the From_Output column of the struc data frame (used in lgp()), corresponding to rows where the From_Emulator column is "Global".
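The alignment between the columns of x and the struc data frame can be checked before calling predict(). The following is a minimal base R sketch, not part of the package: only the From_Emulator and From_Output columns are named in the text above, while the To_Emulator and To_Input columns and the emulator IDs "emu1" and "emu2" are assumptions made for illustration.

```r
# Hypothetical struc data frame for a two-emulator system.
# From_Emulator and From_Output are named in the documentation above;
# To_Emulator, To_Input, and the IDs "emu1"/"emu2" are illustrative assumptions.
struc <- data.frame(
  From_Emulator = c("Global", "Global", "emu1"),
  To_Emulator   = c("emu1", "emu2", "emu2"),
  From_Output   = c(1, 2, 1),
  To_Input      = c(1, 1, 2)
)

# Column indices of x that the structure refers to as global inputs.
global_cols <- sort(unique(struc$From_Output[struc$From_Emulator == "Global"]))

# Global testing input: 5 test points (rows) by 2 global dimensions (columns).
x <- cbind(seq(0, 1, length.out = 5), seq(1, 2, length.out = 5))

# Every global column referenced in struc must exist in x.
stopifnot(all(global_cols %in% seq_len(ncol(x))))
```

If the check fails, either x is missing a global input dimension or the From_Output indices in the "Global" rows of struc need revisiting.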
- method
the prediction approach to use: either the mean-variance approach ("mean_var") or the sampling approach ("sampling"). For DGP emulators with a categorical likelihood (likelihood = "Categorical" in dgp()), only the sampling approach is supported. By default, the method is set to "sampling" for DGP emulators with Poisson, Negative Binomial, and Categorical likelihoods and "mean_var" otherwise.
- mode
whether to predict the classes ("label") or probabilities ("proba") of different classes when object is a DGP emulator with a categorical likelihood. Defaults to "label".
- full_layer
a bool indicating whether to output the predictions of all layers. Defaults to FALSE. Only used when object is a DGP or linked (D)GP emulator.
- sample_size
the number of samples to draw for each given imputation if method = "sampling". Defaults to 50.
- M
the size of the conditioning set for the Vecchia approximation in the emulator prediction. Defaults to 50. This argument is only used if the emulator object was constructed under the Vecchia approximation.
- cores
the number of processes to be used for predictions. If set to NULL, the number of processes is set to max physical cores available %/% 2. Defaults to 1.
- chunks
the number of chunks that the testing input matrix x will be divided into for multiple cores to work on. Only used when cores is not 1. If not specified (i.e., chunks = NULL), the number of chunks is set to the value of cores. Defaults to NULL.
- ...
N/A.
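To make the relationship between cores and chunks concrete, the sketch below splits the rows of a testing matrix into contiguous, nearly equal blocks in base R. It only illustrates the idea of chunking; split_rows is a hypothetical helper, and the way the package actually partitions x internally is not documented here and may differ.

```r
# Split the rows of a testing matrix into `chunks` nearly equal contiguous blocks.
# Illustrative helper only; this is not a dgpsi function.
split_rows <- function(x, chunks) {
  n <- nrow(x)
  chunks <- min(chunks, n)                 # never more chunks than rows
  id <- ceiling(seq_len(n) * chunks / n)   # chunk id in 1..chunks for each row
  lapply(split(seq_len(n), id), function(rows) x[rows, , drop = FALSE])
}

x <- matrix(runif(20), nrow = 10, ncol = 2)
cores <- 4
chunks <- NULL
if (is.null(chunks)) chunks <- cores       # documented default when chunks = NULL

parts <- split_rows(x, chunks)
sapply(parts, nrow)                        # number of testing points per chunk
```

Setting chunks larger than cores gives each process several smaller blocks to work through, which may help when the full testing matrix is large.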
Value
If object is an instance of the gp class:
if method = "mean_var": an updated object is returned with an additional slot called results that contains two matrices named mean for the predictive means and var for the predictive variances. Each matrix has only one column with its rows corresponding to testing positions (i.e., rows of x).
if method = "sampling": an updated object is returned with an additional slot called results that contains a matrix whose rows correspond to testing positions and whose columns correspond to the sample_size samples drawn from the predictive distribution of the GP.
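A matrix of samples laid out as described for method = "sampling" can be summarised row by row. The sketch below uses a simulated matrix as a stand-in for the content of the results slot, so the numbers carry no meaning; it only shows the shapes involved. The empirical summaries are estimates from the samples and are not claimed to equal the output of method = "mean_var".

```r
set.seed(1)
n_test <- 4
sample_size <- 50

# Stand-in for the sample matrix in the results slot when method = "sampling":
# rows are testing positions, columns are samples.
samples <- matrix(rnorm(n_test * sample_size), nrow = n_test, ncol = sample_size)

# One-column empirical summaries, one row per testing position.
mean_hat <- matrix(rowMeans(samples), ncol = 1)
var_hat  <- matrix(apply(samples, 1, var), ncol = 1)

# Empirical 95% interval at each testing position (columns: 2.5% and 97.5%).
ci <- t(apply(samples, 1, quantile, probs = c(0.025, 0.975)))
```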
If object is an instance of the dgp class:
if method = "mean_var" and full_layer = FALSE: an updated object is returned with an additional slot called results that contains two matrices named mean for the predictive means and var for the predictive variances respectively. Each matrix has its rows corresponding to testing positions and columns corresponding to DGP global output dimensions (i.e., the number of GP/likelihood nodes in the final layer).
if method = "mean_var" and full_layer = TRUE: an updated object is returned with an additional slot called results that contains two sub-lists named mean for the predictive means and var for the predictive variances respectively. Each sub-list contains L (i.e., the number of layers) matrices named layer1, layer2,..., layerL. Each matrix has its rows corresponding to testing positions and columns corresponding to output dimensions (i.e., the number of GP/likelihood nodes from the associated layer).
if method = "sampling" and full_layer = FALSE: an updated object is returned with an additional slot called results that contains D (i.e., the number of GP/likelihood nodes in the final layer) matrices named output1, output2,..., outputD. Each matrix has its rows corresponding to testing positions and its columns corresponding to samples, with B * sample_size columns in total, where B is the number of imputations specified in dgp().
if method = "sampling" and full_layer = TRUE: an updated object is returned with an additional slot called results that contains L (i.e., the number of layers) sub-lists named layer1, layer2,..., layerL. Each sub-list represents samples drawn from the GP/likelihood nodes in the corresponding layer, and contains D (i.e., the number of GP/likelihood nodes in the corresponding layer) matrices named output1, output2,..., outputD. Each matrix gives samples of the output from one of D GP/likelihood nodes, and has its rows corresponding to testing positions and its columns corresponding to samples, with B * sample_size columns in total, where B is the number of imputations specified in dgp().
If object is an instance of the dgp class with a categorical likelihood:
if full_layer = FALSE and mode = "label": an updated object is returned with an additional slot called results that contains one matrix named label. The matrix has its rows corresponding to testing positions and its columns corresponding to label samples, with B * sample_size columns in total, where B is the number of imputations specified in dgp().
if full_layer = FALSE and mode = "proba": an updated object is returned with an additional slot called results. This slot contains D matrices (where D is the number of classes in the training output), where each matrix gives probability samples for the corresponding class with its rows corresponding to testing positions and columns containing probabilities. The number of columns of each matrix is B * sample_size, where B is the number of imputations specified in dgp().
if full_layer = TRUE and mode = "label": an updated object is returned with an additional slot called results that contains L (i.e., the number of layers) sub-lists named layer1, layer2,..., layerL. Each of the first L-1 sub-lists represents samples drawn from the GP nodes in the corresponding layer, and contains D (i.e., the number of GP nodes in the corresponding layer) matrices named output1, output2,..., outputD. Each matrix gives samples of the output from one of D GP nodes, and has its rows corresponding to testing positions and its columns corresponding to samples, with B * sample_size columns in total. The sub-list layerL contains one matrix named label. The matrix has its rows corresponding to testing positions and its columns corresponding to label samples, with B * sample_size columns in total. B is the number of imputations specified in dgp().
if full_layer = TRUE and mode = "proba": an updated object is returned with an additional slot called results that contains L (i.e., the number of layers) sub-lists named layer1, layer2,..., layerL. Each of the first L-1 sub-lists represents samples drawn from the GP nodes in the corresponding layer, and contains D (i.e., the number of GP nodes in the corresponding layer) matrices named output1, output2,..., outputD. The sub-list layerL contains D matrices (where D is the number of classes in the training output), where each matrix gives probability samples for the corresponding class with its rows corresponding to testing positions and columns containing probabilities. The number of columns of each matrix is B * sample_size. B is the number of imputations specified in dgp().
If object is an instance of the lgp class:
if method = "mean_var" and full_layer = FALSE: an updated object is returned with an additional slot called results that contains two sub-lists named mean for the predictive means and var for the predictive variances respectively. Each sub-list contains K matrices (K being the number of emulators in the final layer of the system) named by the IDs of the corresponding emulators in the final layer. Each matrix has its rows corresponding to global testing positions and columns corresponding to output dimensions of the associated emulator in the final layer.
if method = "mean_var" and full_layer = TRUE: an updated object is returned with an additional slot called results that contains two sub-lists named mean for the predictive means and var for the predictive variances respectively. Each sub-list contains L (i.e., the number of layers in the emulated system) components named layer1, layer2,..., layerL. Each component represents a layer and contains K matrices (K being the number of emulators in the corresponding layer of the system) named by the IDs of the corresponding emulators in that layer. Each matrix has its rows corresponding to global testing positions and columns corresponding to output dimensions of the associated GP/DGP emulator in the corresponding layer.
if method = "sampling" and full_layer = FALSE: an updated object is returned with an additional slot called results that contains K sub-lists (K being the number of emulators in the final layer of the system) named by the IDs of the corresponding emulators in the final layer. Each sub-list contains D matrices, named output1, output2,..., outputD, that correspond to the output dimensions of the GP/DGP emulator. Each matrix has its rows corresponding to testing positions and its columns corresponding to samples, with B * sample_size columns in total, where B is the number of imputations specified in lgp().
if method = "sampling" and full_layer = TRUE: an updated object is returned with an additional slot called results that contains L (i.e., the number of layers of the emulated system) sub-lists named layer1, layer2,..., layerL. Each sub-list represents a layer and contains K components (K being the number of emulators in the corresponding layer of the system) named by the IDs of the corresponding emulators in that layer. Each component contains D matrices, named output1, output2,..., outputD, that correspond to the output dimensions of the GP/DGP emulator. Each matrix has its rows corresponding to testing positions and its columns corresponding to samples, with B * sample_size columns in total, where B is the number of imputations specified in lgp().
If object is an instance of the lgp class created by lgp() without specifying the struc argument in data frame form, the IDs that are used as names of sub-lists or matrices within results will be replaced by emulator1, emulator2, and so on.
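For a linked emulator with method = "mean_var" and full_layer = FALSE, the per-emulator matrices can be combined into a single table. The sketch below is a mock of that layout with two final-layer emulators using the fallback names emulator1 and emulator2; the values and output dimensions are made up, and no dgpsi function is called.

```r
n_test <- 3

# Mock of the results layout: mean and var sub-lists, each holding one matrix
# per final-layer emulator (emulator1 has 1 output, emulator2 has 2).
results <- list(
  mean = list(
    emulator1 = matrix(0.5, nrow = n_test, ncol = 1),
    emulator2 = matrix(1.5, nrow = n_test, ncol = 2)
  ),
  var = list(
    emulator1 = matrix(0.04, nrow = n_test, ncol = 1),
    emulator2 = matrix(0.09, nrow = n_test, ncol = 2)
  )
)

# Bind all final-layer outputs side by side: rows are global testing positions.
all_means <- do.call(cbind, results$mean)

# Predictive standard deviations, kept per emulator.
all_sds <- lapply(results$var, sqrt)
```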
The results slot will also include:
Details
See further examples and tutorials at https://mingdeyu.github.io/dgpsi-R/.