Logistic regression using purposeful selection
Source: R/purposeful_selection_algorithm.R
purposeful_selection_algorithm.Rd
Uses the purposeful selection algorithm to identify a logistic regression model. The algorithm has three steps:
1. Unadjusted logistic regression to identify candidate variables whose p-value falls under a threshold (entry_criteria).
2. Multivariable regression of the candidate variables. This step is iterative: starting from the variable with the highest p-value, each variable is retained if it is either significant (p-value < retention_criteria) or a confounder (i.e., the effect size changes by more than +/- confounding_criteria).
3. All variables that failed step 1 are introduced to the model, and each is retained if its p-value < retention_criteria.
Reference: https://scfbm.biomedcentral.com/articles/10.1186/1751-0473-3-17
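The three steps can be sketched in base R as follows. This is an illustrative approximation, not the package's implementation: the simulated data, the helper names (fit_logit, p_values), and the exact form of the confounding check (relative change in the remaining coefficients when a variable is dropped) are all assumptions made for the example.

```r
# Illustrative sketch only; helper names and the confounding rule are assumptions.
set.seed(1)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$x1 + 0.6 * dat$x2))

outcome <- "y"
variables <- c("x1", "x2", "x3")
entry_criteria <- 0.2
retention_criteria <- 0.1
confounding_criteria <- 0.2

# Fit a logistic regression of the outcome on the given variables.
fit_logit <- function(vars) {
  glm(reformulate(vars, response = outcome), data = dat, family = binomial())
}

# Named vector of Wald p-values, intercept excluded.
p_values <- function(fit) {
  tab <- coef(summary(fit))
  setNames(tab[-1, "Pr(>|z|)"], rownames(tab)[-1])
}

# Step 1: unadjusted (one variable at a time) screening.
univ_p <- sapply(variables, function(v) p_values(fit_logit(v))[[v]])
candidates <- variables[univ_p < entry_criteria]

# Step 2: iterative review of the multivariable model.
kept <- candidates
protected <- character(0)  # non-significant variables retained as confounders
repeat {
  testable <- setdiff(kept, protected)
  # Stop when no reduced model is left to compare against.
  if (length(kept) < 2 || length(testable) == 0) break
  full <- fit_logit(kept)
  p <- p_values(full)[testable]
  worst <- names(which.max(p))
  if (p[[worst]] < retention_criteria) break  # all remaining variables are significant
  rest <- setdiff(kept, worst)
  reduced <- fit_logit(rest)
  # Relative change in the other coefficients when `worst` is dropped.
  change <- abs(coef(reduced)[rest] - coef(full)[rest]) / abs(coef(full)[rest])
  if (any(change > confounding_criteria)) {
    protected <- c(protected, worst)  # treat as a confounder and keep it
  } else {
    kept <- rest                      # drop it
  }
}

# Step 3: give variables that failed step 1 another chance.
for (v in setdiff(variables, candidates)) {
  p <- p_values(fit_logit(c(kept, v)))
  if (p[[v]] < retention_criteria) kept <- c(kept, v)
}

print(kept)
```

The sketch matches coefficients to variables by name, which only works when every variable is a single numeric column; that is consistent with the requirement that variables be numeric or one-hot encoded.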
Usage
purposeful_selection_algorithm(
outcome,
variables,
dataset,
entry_criteria = 0.2,
retention_criteria = 0.1,
confounding_criteria = 0.2
)
Arguments
- outcome
Outcome of interest
- variables
Exposure variables of interest. Must be numeric or one-hot encoded
- dataset
Data frame that contains the outcome and exposure variables
- entry_criteria
P-value threshold for entry into the model. Default = 0.2
- retention_criteria
P-value threshold for retention in the model. Default = 0.1
- confounding_criteria
Percent change of effect size used to flag confounding. Set the value very high (e.g., 1000) if testing for confounding is not desired. Default = 0.2
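Because variables must be numeric or one-hot encoded, categorical columns need converting before the call. A minimal sketch using base R's model.matrix is shown here; the toy data frame and its column names are invented for illustration, and the commented call at the end assumes that outcome and variables are passed as column-name strings, which this page does not confirm.

```r
# Toy data with one categorical exposure (hypothetical example data).
raw <- data.frame(
  y      = c(0, 1, 0, 1, 1, 0),
  age    = c(34, 51, 29, 62, 47, 38),
  smoker = factor(c("never", "current", "former", "never", "current", "former"))
)

# Expand the factor into 0/1 indicator columns. Dropping the intercept column
# leaves one indicator per non-reference level ("current" is the reference),
# which avoids perfect collinearity with the model intercept.
dummies <- model.matrix(~ smoker, data = raw)[, -1, drop = FALSE]

# Combine the numeric columns with the indicators.
dataset <- cbind(raw[c("y", "age")], dummies)
variables <- setdiff(names(dataset), "y")

str(dataset)

# Assumed call form (argument types are not documented on this page):
# purposeful_selection_algorithm(
#   outcome = "y",
#   variables = variables,
#   dataset = dataset
# )
```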