Complete Workflow: From Data to Decision

Overview

This vignette demonstrates the complete causaldef workflow, from data specification through to policy decision-making. We show how Le Cam deficiency theory translates abstract statistical concepts into actionable clinical insights.

The workflow consists of four stages:

  1. Specify → Define the causal problem
  2. Estimate → Compute deficiency for different adjustment strategies
  3. Diagnose → Validate assumptions using negative controls and sensitivity analysis
  4. Decide → Compute policy regret bounds and make informed decisions

Part 1: Gene Perturbation Study (Continuous Outcome)

We begin with the gene_perturbation dataset, which simulates a CRISPR knockout experiment. This illustrates the core workflow for continuous outcomes.

1.1 Data Description

library(causaldef)
library(ggplot2)

data(gene_perturbation)
str(gene_perturbation)
#> 'data.frame':    500 obs. of  6 variables:
#>  $ sample_id        : Factor w/ 500 levels "S1","S10","S100",..: 1 112 223 334 445 457 468 479 490 2 ...
#>  $ batch            : Factor w/ 4 levels "1","2","3","4": 3 4 4 2 2 2 4 3 4 4 ...
#>  $ library_size     : num  948877 984900 868332 1052105 880271 ...
#>  $ knockout_status  : Factor w/ 2 levels "Control","Knockout": 2 1 1 1 1 2 2 1 2 1 ...
#>  $ target_expression: num  8.64 11.54 12.32 12.94 8.29 ...
#>  $ housekeeping_gene: num  6.59 9.42 8.57 9.22 6.38 ...

Variables:

  • knockout_status: Treatment (Control vs. Knockout)
  • target_expression: Primary outcome (gene expression level)
  • housekeeping_gene: Negative control outcome (shouldn’t be affected by knockout)
  • batch, library_size: Technical confounders

Causal Structure:

      [Batch, Library Size]
             |
             v
   [Knockout] -----> [Target Expression]
             \
              \---X--> [Housekeeping Gene]  (no causal effect)

The housekeeping gene is affected by the same technical variations but NOT by the knockout, making it an ideal negative control.

1.2 Step 1: Specification

spec_gene <- causal_spec(
    data = gene_perturbation,
    treatment = "knockout_status",
    outcome = "target_expression",
    covariates = c("batch", "library_size"),
    negative_control = "housekeeping_gene",
    estimand = "ATE",
    outcome_type = "continuous"
)
#> ✔ Created causal specification: n=500, 2 covariate(s)

print(spec_gene)
#> 
#> -- Causal Specification --------------------------------------------------
#> 
#> * Treatment: knockout_status ( binary )
#> * Outcome: target_expression ( continuous )
#> * Covariates: batch, library_size 
#> * Sample size: 500 
#> * Estimand: ATE 
#> * Negative control: housekeeping_gene

1.3 Step 2: Deficiency Estimation

We compare three adjustment strategies:

  1. Unadjusted: Ignores technical confounders
  2. IPTW: Reweights samples to balance batch and library size
  3. AIPW: Augmented IPTW (doubly robust)
deficiency_gene <- estimate_deficiency(
    spec_gene,
    methods = c("unadjusted", "iptw", "aipw"),
    n_boot = 100 # Use more for production (e.g., 1000)
)
#> ℹ Inferred treatment value: Knockout
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw
#> ℹ Estimating deficiency: aipw

print(deficiency_gene)
#> 
#> -- Deficiency Proxy Estimates (PS-TV) ------
#> 
#>      Method  Delta     SE               CI            Quality
#>  unadjusted 0.1424 0.0357 [0.1256, 0.2577] Insufficient (Red)
#>        iptw 0.0368 0.0139  [0.026, 0.0765]  Excellent (Green)
#>        aipw 0.0368 0.0120 [0.0273, 0.0708]  Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#> 
#> Best method: iptw (delta = 0.0368 )

Interpretation:

  • δ ≈ 0: The adjustment creates a “synthetic RCT” (observational data approximates what we’d see if knockout were randomized)
  • δ > 0.1: Substantial distributional mismatch remains; causal conclusions are uncertain
plot(deficiency_gene, type = "bar")

1.4 Step 3: Diagnose with Negative Control

The negative control diagnostic tests whether our adjustment removes ALL confounding, not just the measured confounders.

nc_test <- nc_diagnostic(
    spec_gene,
    method = "iptw",
    alpha = 0.05,
    n_boot = 100
)
#> ℹ Using kappa = 1 (conservative). Consider domain-specific estimation or sensitivity analysis via kappa_range.
#> ✔ No evidence against causal assumptions (p = 0.10891 )

print(nc_test)
#> 
#> -- Negative Control Diagnostic ----------------------------------------
#> 
#> * screening statistic (weighted corr): 0.0821 
#> * delta_NC (association proxy): 0.0821 
#> * delta bound (under kappa alignment): 0.0821 (kappa = 1 )
#> * screening p-value: 0.10891 
#> * screening method: weighted_permutation_correlation 
#> 
#> RESULT: NOT REJECTED. This is a screening result, not proof that confounding is absent.
#> NOTE: Your effect estimate must exceed the Noise Floor (delta_bound) to be meaningful.

Decision Logic:

Result Interpretation Action
falsified = FALSE No evidence of residual confounding Proceed with analysis
falsified = TRUE Residual confounding detected Add covariates or acknowledge limitations

1.5 Step 4: Policy Decision

Suppose we’re deciding whether to pursue this gene target for drug development. The utility is measured on a scale where: - 0 = no promise (no effect on expression) - 10 = maximum promise (strong effect)

bounds_gene <- policy_regret_bound(
    deficiency_gene,
    utility_range = c(0, 10)
)
#> Warning: Multiple fitted methods are available but `method` was not specified.
#> ℹ Using the smallest available delta across methods is optimistic after model
#>   selection.
#> ℹ For a pre-specified decision bound, call `policy_regret_bound()` with `method
#>   = '<chosen method>'`.
#> ℹ Transfer penalty: 0.3676 (delta = 0.0368)

print(bounds_gene)
#> 
#> -- Policy Regret Bounds -------------------------------------------------
#> 
#> * Deficiency delta: 0.0368 
#> * Delta mode: point 
#> * Delta method: iptw 
#> * Delta selection: minimum across fitted methods 
#> * Utility range: [0, 10]
#> * Transfer penalty: 0.3676 (additive regret upper bound)
#> * Minimax floor: 0.1838 (worst-case lower bound)
#> 
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#> Note: minimum-across-methods selection is optimistic after model selection.
#> 
#> Interpretation: Transfer penalty is 3.7 % of utility range given delta

Regret Bounds:

policy_regret_bound() reports:

  • Transfer penalty \(M\cdot\delta\): additive worst-case regret inflation term, and
  • Minimax safety floor \((M/2)\cdot\delta\): irreducible worst-case regret when \(\delta>0\).

Decision Rule: - If transfer_penalty < acceptable risk threshold → Proceed with confidence - If transfer_penalty > acceptable risk threshold → Seek more evidence

1.6 Effect Estimation

Finally, we estimate the causal effect using the best-performing method:

effect_gene <- estimate_effect(
    deficiency_gene,
    target_method = "iptw"
)

print(effect_gene)
#> 
#> -- Causal Effect Estimate ----------------------
#> Method:    iptw 
#> Type:      ATE 
#> Contrast:  Knockout vs Control 
#> Estimate:  -1.7085

Complete Report:

cat(sprintf(
    "
## Gene Perturbation Analysis Report

**Treatment Effect (IPTW-adjusted):** %.2f log2 expression units
**Deficiency (δ):** %.3f
**Negative Control Test:** %s (p = %.3f)
**Transfer Penalty:** %.3f on [0, 10] scale
**Minimax Safety Floor:** %.3f on [0, 10] scale

**Conclusion:** %s
",
    effect_gene$estimate,
    deficiency_gene$estimates["iptw"],
    ifelse(nc_test$falsified, "FAILED", "PASSED"),
    nc_test$p_value,
    bounds_gene$transfer_penalty,
    bounds_gene$minimax_floor,
    ifelse(nc_test$falsified,
        "Residual confounding detected. Interpret with caution.",
        ifelse(bounds_gene$transfer_penalty < 0.5,
            "Strong evidence supporting treatment effect.",
            "Moderate evidence; consider additional validation."
        )
    )
))

Gene Perturbation Analysis Report

Treatment Effect (IPTW-adjusted): -1.71 log2 expression units Deficiency (δ): 0.037 Negative Control Test: PASSED (p = 0.109) Transfer Penalty: 0.368 on [0, 10] scale Minimax Safety Floor: 0.184 on [0, 10] scale

Conclusion: Strong evidence supporting treatment effect.


Part 2: Hematopoietic Cell Transplantation (Survival Outcome)

Next, we analyze the hct_outcomes dataset, which mimics a retrospective registry study comparing conditioning regimens in HCT.

2.1 Data Description

data(hct_outcomes)
str(hct_outcomes)
#> 'data.frame':    800 obs. of  8 variables:
#>  $ id                    : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ age                   : num  57 50 59 65 54 48 55 49 46 58 ...
#>  $ disease_status        : Factor w/ 3 levels "Advanced","Early",..: 2 2 3 2 1 2 2 1 2 2 ...
#>  $ kps                   : num  89 85 97 84 96 64 88 73 82 100 ...
#>  $ donor_type            : Factor w/ 3 levels "HLA-Matched",..: 1 2 2 3 3 2 2 1 1 2 ...
#>  $ conditioning_intensity: Factor w/ 2 levels "Myeloablative",..: 1 2 1 1 2 1 1 1 1 1 ...
#>  $ time_to_event         : num  14.31 11.05 20.23 2.9 2.22 ...
#>  $ event_status          : Factor w/ 3 levels "Censored","Death",..: 3 3 3 3 3 2 3 3 3 3 ...

# Summarize key variables
summary(hct_outcomes[, c("age", "kps", "time_to_event")])
#>       age             kps         time_to_event   
#>  Min.   :15.00   Min.   : 60.00   Min.   : 0.000  
#>  1st Qu.:42.00   1st Qu.: 77.00   1st Qu.: 3.305  
#>  Median :50.00   Median : 85.00   Median : 8.220  
#>  Mean   :49.96   Mean   : 84.26   Mean   :12.605  
#>  3rd Qu.:58.00   3rd Qu.: 92.00   3rd Qu.:17.905  
#>  Max.   :89.00   Max.   :100.00   Max.   :86.080
table(hct_outcomes$conditioning_intensity, hct_outcomes$event_status)
#>                
#>                 Censored Death Relapse
#>   Myeloablative      117   119     486
#>   Reduced              6    12      60

Clinical Context:

  • Myeloablative conditioning: High-intensity chemotherapy (younger, healthier patients)
  • Reduced-intensity conditioning: Lower dose (older, sicker patients)
  • Outcome: Time to death or relapse

The key challenge is confounding by indication: doctors assign treatment based on patient status, making naive comparisons biased.

2.2 Step 1: Survival Specification

Executable survival-effect code in this section requires a compatible survival runtime. In the current support matrix, that means R >= 4.0.

# Create binary event indicator
hct_outcomes$event <- ifelse(hct_outcomes$event_status != "Censored", 1, 0)

spec_hct <- causal_spec_survival(
    data = hct_outcomes,
    treatment = "conditioning_intensity",
    time = "time_to_event",
    event = "event",
    covariates = c("age", "disease_status", "kps", "donor_type"),
    estimand = "RMST",
    horizon = 24 # 24-month restricted mean survival time
)
#> ✔ Created survival causal specification: n=800, 677 events

print(spec_hct)
#> 
#> -- Survival Causal Specification ----------------------------------------
#> 
#> * Treatment: conditioning_intensity ( binary )
#> * Time: time_to_event (max = 86.08 )
#> * Event: event ( 677 events)
#> * Covariates: age, disease_status, kps, donor_type 
#> * Sample size: 800 
#> * Estimand: RMST 
#> * Horizon: 24

2.3 Step 2: Deficiency Estimation

deficiency_hct <- estimate_deficiency(
    spec_hct,
    methods = c("unadjusted", "iptw"),
    n_boot = 50 # Use more for production
)
#> ℹ Inferred treatment value: Reduced
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw

print(deficiency_hct)
#> 
#> -- Deficiency Proxy Estimates (PS-TV) ------
#> 
#>      Method  Delta     SE               CI            Quality
#>  unadjusted 0.3030 0.0474 [0.2422, 0.4295] Insufficient (Red)
#>        iptw 0.0644 0.0252   [0.045, 0.135]   Caution (Yellow)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#> 
#> Best method: iptw (delta = 0.0644 )

Clinical Interpretation:

The deficiency tells us how much our observational evidence differs from what an RCT would provide:

  • Unadjusted δ: High, reflecting strong selection bias (young patients → myeloablative)
  • IPTW δ: Lower, IPTW reweights to balance baseline covariates

2.4 Step 3: Confounding Frontier

Beyond point estimates, we can map a sensitivity analysis showing how deficiency varies with hypothetical unmeasured confounding:

frontier <- confounding_frontier(
    spec_hct,
    alpha_range = c(-2, 2), # Confounding path: U → Treatment
    gamma_range = c(-2, 2), # Confounding path: U → Outcome
    grid_size = 30
)
#> ℹ Computing benchmarks for observed covariates...
#> ✔ Computed confounding frontier: 30x30 grid

print(frontier)
#> 
#> -- Confounding Frontier (Confounding Lower Bound) --------------------------
#> 
#> * Grid size: 30 x 30 
#> * alpha range: [-2, 2]
#> * gamma range: [-2, 2]
#> * Model: gaussian 
#> 
#> -- Deficiency Summary --
#> 
#> * Min delta: 0 
#> * Max delta: 0.0525 
#> * Mean delta: 0.013 
#> 
#> 48 % of grid has near-zero deficiency (delta < 0.01)
plot(frontier)
#> Warning: The following aesthetics were dropped during statistical transformation: fill.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?

Reading the Frontier Map:

  • Center (α = 0 or γ = 0): No unmeasured confounding → δ = 0
  • Corners: Strong confounding on both paths → high δ
  • Observed covariates (dots): Benchmark strengths of measured confounders

If an unmeasured confounder would need extreme strength (beyond observed benchmarks) to substantially increase δ, conclusions are robust.

2.5 Step 4: Policy Regret and RMST Effect

# Utility = months of survival (horizon = 24)
bounds_hct <- policy_regret_bound(
    deficiency_hct,
    utility_range = c(0, 24)
)
#> Warning: Multiple fitted methods are available but `method` was not specified.
#> ℹ Using the smallest available delta across methods is optimistic after model
#>   selection.
#> ℹ For a pre-specified decision bound, call `policy_regret_bound()` with `method
#>   = '<chosen method>'`.
#> ℹ Transfer penalty: 1.5457 (delta = 0.0644)

print(bounds_hct)
#> 
#> -- Policy Regret Bounds -------------------------------------------------
#> 
#> * Deficiency delta: 0.0644 
#> * Delta mode: point 
#> * Delta method: iptw 
#> * Delta selection: minimum across fitted methods 
#> * Utility range: [0, 24]
#> * Transfer penalty: 1.5457 (additive regret upper bound)
#> * Minimax floor: 0.7728 (worst-case lower bound)
#> 
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#> Note: minimum-across-methods selection is optimistic after model selection.
#> 
#> Interpretation: Transfer penalty is 6.4 % of utility range given delta

Clinical Regret Bounds:

If δ = 0.05 with M = 24 months: - Transfer penalty = 24 × 0.05 = 1.2 months - Minimax safety floor = 0.5 × 24 × 0.05 = 0.6 months

This means even with perfect decision-making, the observational evidence can inflate regret by up to about 1.2 months on the 0–24 month utility scale, and there is an unavoidable worst-case floor of about 0.6 months when \(\delta>0\).

# Estimate RMST difference
effect_hct <- estimate_effect(
    deficiency_hct,
    target_method = "iptw",
    contrast = c("Myeloablative", "Reduced")
)

print(effect_hct)
#> 
#> -- Causal Effect Estimate ----------------------
#> Method:    iptw 
#> Type:      RMST Difference 
#> Contrast:  Myeloablative vs Reduced 
#> Estimate:  2.949 
#> Horizon:   24

2.6 Complete Decision Framework

delta_iptw <- deficiency_hct$estimates["iptw"]
transfer_penalty <- bounds_hct$transfer_penalty
minimax_floor <- bounds_hct$minimax_floor
rmst_diff <- effect_hct$estimate

# Decision logic
if (delta_iptw < 0.05) {
    evidence_quality <- "HIGH (approximates RCT)"
} else if (delta_iptw < 0.15) {
    evidence_quality <- "MODERATE (acceptable with caveats)"
} else {
    evidence_quality <- "LOW (substantial bias risk)"
}

# Benefit-to-risk ratio
if (!is.na(rmst_diff) && !is.na(transfer_penalty) && transfer_penalty > 0) {
    benefit_to_risk <- abs(rmst_diff) / transfer_penalty
    recommendation <- ifelse(benefit_to_risk > 2,
        "Recommend treatment with stronger effect",
        "Evidence inconclusive; consider shared decision-making"
    )
} else {
    benefit_to_risk <- NA
    recommendation <- "Unable to calculate benefit-to-risk ratio"
}

cat(sprintf(
    "
## HCT Treatment Decision Report

**RMST Difference (IPTW):** %.2f months (%s favored)
**Deficiency:** %.3f
**Evidence Quality:** %s
**Transfer Penalty:** %.2f months
**Minimax Safety Floor:** %.2f months

**Benefit-to-Risk Ratio:** %.1f:1
**Recommendation:** %s

### Clinical Translation

The observational evidence suggests %s conditioning provides approximately
%.1f additional months of relapse-free survival within the first 2 years.

However, the transfer penalty is %.1f months and the minimax safety floor is %.1f months
on the 0--24 month utility scale. Clinicians should weigh this against individual patient factors.
",
    abs(rmst_diff),
    ifelse(rmst_diff > 0, "Myeloablative", "Reduced"),
    delta_iptw,
    evidence_quality,
    transfer_penalty,
    minimax_floor,
    benefit_to_risk,
    recommendation,
    ifelse(rmst_diff > 0, "myeloablative", "reduced-intensity"),
    abs(rmst_diff),
    transfer_penalty,
    minimax_floor
))

HCT Treatment Decision Report

RMST Difference (IPTW): 2.95 months (Myeloablative favored) Deficiency: 0.064 Evidence Quality: MODERATE (acceptable with caveats) Transfer Penalty: 1.55 months Minimax Safety Floor: 0.77 months

Benefit-to-Risk Ratio: 1.9:1 Recommendation: Evidence inconclusive; consider shared decision-making

Clinical Translation

The observational evidence suggests myeloablative conditioning provides approximately 2.9 additional months of relapse-free survival within the first 2 years.

However, the transfer penalty is 1.5 months and the minimax safety floor is 0.8 months on the 0–24 month utility scale. Clinicians should weigh this against individual patient factors.


Part 3: Comparative Analysis Across Studies

3.1 When Is Observational Evidence Sufficient?

Study δ (IPTW) Transfer Penalty Evidence Quality
Gene Perturbation Low (~0.01) Very Low Excellent
HCT Outcomes Moderate (~0.05-0.15) 1-4 months Acceptable with caveats

3.2 General Workflow Summary

┌─────────────────────────────────────────────────────────────────┐
│  SPECIFY: causal_spec() / causal_spec_survival()                │
│           ↓ Define treatment, outcome, covariates, NC           │
├─────────────────────────────────────────────────────────────────┤
│  ESTIMATE: estimate_deficiency()                                │
│            ↓ Compare unadjusted, IPTW, AIPW, TMLE, etc.        │
│            ↓ Select method with lowest δ                        │
├─────────────────────────────────────────────────────────────────┤
│  DIAGNOSE: nc_diagnostic() + confounding_frontier()             │
│            ↓ Test whether assumptions are falsified             │
│            ↓ Map sensitivity to unmeasured confounding          │
├─────────────────────────────────────────────────────────────────┤
│  DECIDE: policy_regret_bound() + estimate_effect()              │
│          ↓ Compute transfer penalty / minimax floor             │
│          ↓ Report effect with uncertainty qualification         │
└─────────────────────────────────────────────────────────────────┘

References

  1. Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison: A Framework for Identification, Transportability, and Policy Learning. DOI: 10.5281/zenodo.18367347

  2. Le Cam, L., & Yang, G. L. (2000). Asymptotics in Statistics: Some Basic Concepts. Springer.

  3. VanderWeele, T. J., & Ding, P. (2017). Sensitivity Analysis in Observational Research: Introducing the E-value. Annals of Internal Medicine.