
Two-stage DTR learning/evaluation pipeline with cross-fitting
Drmatch.Rdsimrun() runs one replicate (indexed by jj) of a two-stage dynamic
treatment regime (DTR) learning and evaluation pipeline using outer cross-fitting
(default 5 folds). Within each fold, the function imputes censoring-related
composite outcomes on the training partition, constructs double scores
(propensity/prognostic scores) when requested, performs matching-based
pseudo-outcome construction for stage 2 and stage 1, fits random-forest
policies via rfdtr, applies the learned rules to the held-out
fold, and returns fold-level value summaries and benchmark comparisons.
Usage
Drmatch(
data = data,
id.var = "PatientID",
eta2.var = "eta2",
Y1.var = "OS_time.1L",
Y2.var = "OS_time.2L",
delta.var = "deathInd.raw",
OY.var = "OS_time",
A1.var = "txgroup1L.sd",
A2.var = "txgroup2L0.sd",
exact1.vars = c("txgroup1L.sd", "gender.sd", "ECOG1st0.sd", "ECOG1st1.sd"),
exact2.vars = c("txgroup1L.sd", "gender.sd", "ECOG2nd0.sd", "ECOG2nd1.sd",
"txgroup2L0.sd"),
names.var1 = c("ageAt1L", "gender.sd", "ECOG1st0.sd", "ECOG1st1.sd", "Albumin1st",
"Lymphocyte1st"),
names.var2 = c("ageAt1L", "gender.sd", "ECOG2nd0.sd", "Lymphocyte2nd", "OS_time.1L",
"ECOG2nd1.sd", "Albumin2nd", "Lymphocyte2nd"),
usecov = FALSE,
cores = 5,
tau = NULL,
sl.seed = 1234,
A.SL.library = c("SL.glm", "SL.glmnet", "SL.ranger"),
Y.SL.library = c("LIB_COXlasso", "LIB_COXen", "LIB_AFTggamma", "LIB_RSF"),
A.method = "method.NNloglik",
Y.method = "ibll",
plotps = FALSE,
ngrid = 1000,
param.tune = NULL,
param.weights.fix = NULL,
param.weights.init = NULL,
optim.method = "Nelder-Mead",
maxit = 10000,
usepenalty = FALSE,
runseed = 2025,
useds = TRUE,
Taus = NULL,
adjustdelta1 = FALSE,
modeltype = "ranger",
usecv = TRUE,
doublepg = TRUE,
model.pg = "cox",
standardize = FALSE,
superLearn = TRUE,
pslink = "logit",
pglink = "lognormal",
distance = "mahalanobis",
method = "nearest",
K = 3,
replacement = TRUE,
cap_months = 36
)Arguments
- data
A
data.framecontaining all required columns used byGet_data(), matching, scoring, and policy learning.- id.var
Name of unique subject identifier column.
- eta2.var
Name of stage-2 entry indicator (1 = enters stage 2).
- Y1.var
Name of stage-1 observed time/outcome component.
- Y2.var
Name of stage-2 observed time/outcome component.
- delta.var
Name of event indicator (1 = event, 0 = censored).
- OY.var
Name of overall observed time (used as a boundary in donor filtering).
- A1.var
Name of stage-1 observed treatment/action.
- A2.var
Name of stage-2 observed treatment/action.
- exact1.vars
Character vector of variables used for exact matching at stage 1. Use
NULLorcharacter(0)to disable exact matching.- exact2.vars
Character vector of variables used for exact matching at stage 2. Use
NULLorcharacter(0)to disable exact matching.- names.var1
Character vector of covariate names used at stage 1 (scores/matching/model).
- names.var2
Character vector of covariate names used at stage 2 (scores/matching/model).
- usecov
Logical; if
TRUE, matching formulas are built fromnames.var1/names.var2. IfFALSE, matching may rely on score variables (e.g., pg/ps) produced by the pipeline.- cores
Integer number of workers used for internal parallel steps.
- tau
Optional horizon(s) passed to score construction; in the shown implementation,
rtauis set tocap_monthsfor both stages.- sl.seed
Seed passed to SuperLearner / score construction components.
- A.SL.library
Candidate learners for treatment (propensity) score models.
- Y.SL.library
Candidate learners for outcome/censoring models used in
ComputeScores().- A.method
Optional optimization method for propensity fitting in score construction.
- Y.method
Optional optimization method for outcome fitting in score construction.
- plotps
Logical; if
TRUE, generate propensity score overlap plots where implemented.- ngrid
Integer; grid size for internal optimization routines used by score construction.
- param.tune
Optional tuning structure passed into score construction.
- param.weights.fix, param.weights.init
Optional fixed/initial weights passed to score construction.
- optim.method
Optimization method used in score construction (e.g.,
"Nelder-Mead").- maxit
Maximum iterations for score construction optimizers.
- usepenalty
Logical; if
TRUE, defineSL.glmnet.tune1/SL.glmnet.tune2wrappers withpenalty.factorfor stage-specific propensity estimation.- runseed
Integer vector of seeds; replicate
jjusesrunseed[jj].- useds
Logical; if
TRUE, compute/use double scores viaget_doublescores.- Taus
Optional matrix/data.frame of candidate horizons (legacy/compatibility).
- adjustdelta1
Logical; if
TRUE, modifies stage-1 censoring indicator as inimpute_censored_outcomes()andget_doublescores().- modeltype
Character; model type for policy learning in
rfdtr(e.g.,"ranger").- usecv
Logical; if
TRUE, tune policy learner with internal CV inrfdtr.- doublepg
Logical; if
TRUE, use arm-specific prognostic scores (pg0/pg1) where applicable.- model.pg
Character; prognostic model family (e.g.,
"cox"or"aft") for score construction.- standardize
Logical; if
TRUE, standardize covariates forglmnetcomponents.- superLearn
Logical; if
TRUE, use SuperLearner; ifFALSE, use glmnet-only path where implemented byComputeScores().- pslink
Character; link for propensity estimation in score construction (e.g.,
"logit").- pglink
Character; distribution/link for prognostic score modeling (project-specific; e.g.,
"lognormal").- distance
Matching distance used by
matchpotential_DTR(e.g.,"mahalanobis").- method
Matching method used in
matchpotential_DTR(e.g.,"nearest"or"full").- K
Integer donor ratio for matching (number of donors per anchor).
- replacement
Logical; whether to allow donor reuse in nearest-neighbor matching.
- cap_months
Numeric truncation cap (months). Value summaries are computed after capping at this horizon.
Value
A data.frame with one row per outer fold (default 5). Columns include:
Fold-level value summaries (means and medians) under the learned DTR and under several alternative constructions produced inside the pipeline (e.g.,
Ttot,T2,T2.rf,TtotOSml,Ttot1LOSmed, etc.).Agreement metrics:
CCR1(stage-1 agreement) andCCR2(stage-2 agreement amongeta2==1).Observed and uniform-regime benchmarks (means/medians) at stage 2 and overall.
Diagnostics stored as list-columns, including RF best hyperparameters and predicted optimal actions.
The returned object also includes iter (replicate index) and fold.
If a fatal error occurs, NULL is returned.
Details
The outer loop is a 5-fold split created by caret::createFolds() on the
stage-1 observed treatment A1.var. For each fold:
Split into training and test partitions.
Standardize selected time variables in training (
OY.sd,Y1.sd,Y2.sd).Impute censoring-related composite outcomes on the training fold via
impute_censored_outcomes(internally calling stage-2 then stage-1 imputation).Optionally compute double scores on the imputed training fold via
get_doublescores.Stage 2 (among
eta2==1): build matched opposite-arm pseudo-outcomes viamatchpotential_DTR, tune/fit a random-forest policy viarfdtr, and compute an estimated stage-2 optimal value by substituting matched outcomes for units assigned to a non-optimal observed treatment.Stage 1: construct a pseudo-outcome
Rtilde(including the estimated stage-2 optimal value for stage-2 entrants), perform matching viamatchpotential_DTR, and tune/fit the stage-1 policy viarfdtr.Apply learned policies to the held-out fold and compute fold-level summaries: value estimates (means and medians, capped at
cap_months), agreement measures (CCR1,CCR2), and observed/uniform benchmark regimes.
See also
Get_data Get_data,
impute_censored_outcomes,
get_doublescores,
matchpotential_DTR,
rfdtr.