Implement comprehensive indirect effects discovery (x.boot-inspired) for automatic SEM pathway identification
This PR implements a major enhancement to lavaanExtra's automatic indirect effects capabilities, inspired by Christian Dorri's x.boot extension concept. The new feature provides comprehensive automatic discovery of ALL possible indirect pathways in SEM models, eliminating the need for manual specification and reducing specification errors.
Key Features
1. Enhanced write_lavaan() Function
The write_lavaan() function now supports comprehensive automatic indirect effects discovery:
# NEW: Comprehensive automatic discovery
model <- write_lavaan(
  mediation = mediation,
  indirect = TRUE,                 # Automatically discover ALL indirect effects
  auto_indirect_max_length = 5,    # Control complexity
  auto_indirect_limit = 1000       # Performance safeguard
)
2. New discover_all_indirect_effects() Function
A standalone function for discovering all possible indirect pathways:
# Discover all indirect effects independently
all_effects <- discover_all_indirect_effects(
  model = lavaan_syntax,
  max_chain_length = 4,
  computational_limit = 10  # Properly enforced limit
)
3. Three Complementary Approaches
The implementation provides three ways to handle indirect effects:
- Comprehensive Discovery (NEW): indirect = TRUE - discovers all pathways automatically using graph traversal
- Structured IV/M/DV (EXISTING): Traditional lavaanExtra approach - unchanged for backward compatibility
- Manual Specification (EXISTING): User-defined pathways - unchanged for backward compatibility
Technical Implementation
The enhancement uses a sophisticated graph-based algorithm that:
- Parses SEM models into directed graph structures
- Discovers all pathways using depth-first search algorithms
- Identifies indirect chains of configurable length
- Generates lavaan syntax for all discovered indirect effects
- Properly enforces performance safeguards for complex models
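The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code from this PR: find_indirect_paths() and path_to_lavaan() are hypothetical helpers, max_length is assumed to count the variables in a chain, and the predictor_outcome label scheme is an assumption about how the paths are labelled when label = TRUE.

```r
# Hypothetical sketch (not the PR's implementation).
# `regressions` mirrors the `mediation` argument: list(outcome = c(predictors)).
find_indirect_paths <- function(regressions, max_length = 5, limit = 1000) {
  # Build an adjacency list: predictor -> outcomes it points to.
  edges <- list()
  for (outcome in names(regressions)) {
    for (predictor in regressions[[outcome]]) {
      edges[[predictor]] <- c(edges[[predictor]], outcome)
    }
  }

  paths <- list()

  # Depth-first search that extends `path` one variable at a time.
  visit <- function(path) {
    node <- path[length(path)]
    for (nxt in edges[[node]]) {
      if (nxt %in% path) next                      # skip cycles
      new_path <- c(path, nxt)
      if (length(new_path) >= 3) {                 # at least one mediator
        if (length(paths) >= limit) return(invisible(NULL))
        paths[[length(paths) + 1]] <<- new_path
      }
      # Assumption: max_length is the maximum number of variables in a chain.
      if (length(new_path) < max_length) visit(new_path)
    }
    invisible(NULL)
  }

  for (start in names(edges)) visit(start)
  paths
}

# Turn one chain into a lavaan ":=" definition, assuming each regression
# path carries a label of the form predictor_outcome.
path_to_lavaan <- function(path) {
  labels <- paste(path[-length(path)], path[-1], sep = "_")
  paste0(paste(path, collapse = "_"), " := ", paste(labels, collapse = " * "))
}

regs <- list(
  M1 = c("X1", "X2"),
  M2 = c("X1", "M1"),
  Y  = c("X1", "M1", "M2")
)
paths <- find_indirect_paths(regs)
syntax <- vapply(paths, path_to_lavaan, character(1))
cat(syntax, sep = "\n")
```

The limit check sits at the point where a chain is recorded, so in this sketch the result can never grow past limit, whatever the size of the model.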
Benefits
- Complete Coverage: Automatically discovers all indirect effects up to the configured chain length and limit, without manual enumeration
- Error Reduction: Reduces the risk of missing important mediation pathways
- Time Savings: No need to manually specify complex indirect effect chains
- Professional Capabilities: Provides functionality similar to commercial SEM software like Amos
- Backward Compatible: Zero breaking changes to existing lavaanExtra workflows
- Reliable Performance: Computational limits are properly enforced to prevent excessive resource usage
Example Usage
library(lavaanExtra)
# Complex mediation model
mediation <- list(
  M1 = c("X1", "X2"),
  M2 = c("X1", "M1"),
  Y = c("X1", "M1", "M2")
)
# Automatically discover ALL indirect effects
model <- write_lavaan(
  mediation = mediation,
  indirect = TRUE,  # Enable comprehensive discovery
  label = TRUE
)
# The model now includes all possible indirect pathways:
# X1 -> M1 -> Y, X1 -> M1 -> M2 -> Y, X2 -> M1 -> Y, etc.
Quality Assurance
✅ All tests pass: 15/15 tests in the x_boot_indirect test suite
✅ Computational limits enforced: Function properly respects the computational_limit parameter
✅ Complete documentation: All functions properly documented and exported
✅ No breaking changes: Existing lavaanExtra code continues to work unchanged
Documentation
- Added comprehensive vignette (vignettes/comprehensive_indirect.Rmd) with usage examples and comparisons
- Enhanced function documentation with new parameters and examples
- Created investigation and implementation summary documents
- Updated NEWS.md and version to reflect the major enhancement
This implementation addresses the investigation request for Christian Dorri's x.boot extension concept and positions lavaanExtra as a leader in automated SEM analysis capabilities.
Fixes #1.
@copilot great, thanks. I'm seeing the following test failures:
───────────────────────────────────────────────────────────
Failure (test-x_boot_indirect.R:112:3): discover_all_indirect_effects respects computational limits
length(indirect_effects) <= 10 is not TRUE
`actual`: FALSE
`expected`: TRUE
───────────────────────────────────────────────────────────
══ Results ════════════════════════════════════════════════
Duration: 8.3 s
── Failed tests ───────────────────────────────────────────
Failure (test-x_boot_indirect.R:112:3): discover_all_indirect_effects respects computational limits
length(indirect_effects) <= 10 is not TRUE
`actual`: FALSE
`expected`: TRUE
And warnings:
W checking for missing documentation entries (446ms)
Undocumented code objects:
'discover_all_indirect_effects'
All user-level objects in a package should have documentation entries.
See chapter 'Writing R documentation files' in the 'Writing R
Extensions' manual.
W checking for code/documentation mismatches (379ms)
Codoc mismatches from Rd file 'write_lavaan.Rd':
write_lavaan
Code: function(mediation = NULL, regression = NULL, covariance =
NULL, indirect = NULL, latent = NULL, intercept =
NULL, threshold = NULL, constraint.equal = NULL,
constraint.smaller = NULL, constraint.larger = NULL,
custom = NULL, label = FALSE, use.letters = FALSE,
auto_indirect_max_length = 5, auto_indirect_limit =
1000)
Docs: function(mediation = NULL, regression = NULL, covariance =
NULL, indirect = NULL, latent = NULL, intercept =
NULL, threshold = NULL, constraint.equal = NULL,
constraint.smaller = NULL, constraint.larger = NULL,
custom = NULL, label = FALSE, use.letters = FALSE)
Argument names in code not in docs:
auto_indirect_max_length auto_indirect_limit
Fixed both issues in commit 8cdc2b5:
- Test failure: Fixed computational limit enforcement in discover_all_indirect_effects. The function now properly respects the computational_limit parameter by truncating results when they exceed the specified limit.
- Documentation warnings: Resolved both missing documentation and code/documentation mismatch warnings by updating roxygen2 documentation and regenerating all .Rd files.
All tests in the x_boot_indirect test suite now pass (15/15), and the R CMD check no longer shows any documentation-related warnings.
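For reference, the truncation behaviour described above could look roughly like the following. This is a hedged sketch only: truncate_effects() is a hypothetical helper, not the function changed in commit 8cdc2b5, and the warning text is invented for illustration.

```r
# Hypothetical helper: cap a list of discovered effects at `computational_limit`.
truncate_effects <- function(effects, computational_limit = 1000) {
  if (length(effects) > computational_limit) {
    warning(sprintf(
      "Found %d indirect effects; keeping the first %d.",
      length(effects), computational_limit
    ))
    effects <- effects[seq_len(computational_limit)]
  }
  effects
}

# Stand-in data: 25 placeholder effects, capped at 10.
effects <- as.list(paste0("effect_", 1:25))
limited <- suppressWarnings(truncate_effects(effects, computational_limit = 10))
length(limited) <= 10
```

With a cap like this applied before returning, an assertion such as length(indirect_effects) <= 10 holds regardless of how many chains the search finds.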