Performs univariable and, if requested, multivariable regression of a trait on a set of exposure variables.

Usage

phyloaware_regression(
  trait,
  variables,
  df,
  first_present = NULL,
  patient_id = NULL,
  culture_date = NULL,
  multivariable = NULL,
  stepwise_direction = NULL,
  entry_criteria = NULL,
  retention_criteria = NULL,
  confounding_criteria = NULL
)

Arguments

trait

Trait of interest

variables

Exposure variables of interest. Must be numeric or one-hot encoded.

df

Dataframe that contains the trait, exposure variables, and other requested variables.

first_present

Boolean (TRUE or FALSE) indicating whether to select each participant's first isolate with the trait.

patient_id

Patient identifier variable stored in dataframe df. Required if first_present == TRUE.

culture_date

Culture date used to select the participant's first isolate. Must be stored as a variable in the dataframe df. Required if first_present == TRUE.

multivariable

Defines the multivariable selection strategy. Options include: 'purposeful', 'AIC', 'pvalue', and 'multivariable'.

stepwise_direction

Direction of stepwise selection. Options include: 'both', 'backward', or 'forward'. For more information, see stats::step.

entry_criteria

P-value threshold for selecting candidate variables for multivariable regression. Used in 'pvalue' and 'purposeful' selection. Suggestion: 0.2.

retention_criteria

P-value threshold for retaining candidate variables in the model. Used in 'pvalue' and 'purposeful' selection. Suggestion: 0.1.

confounding_criteria

Percent change of effect size used as the threshold when testing for confounding. Set the value very high (e.g., 1000) if testing for confounding is not desired. Default = 0.2.
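To illustrate what first_present, patient_id, and culture_date ask for, here is a minimal base R sketch of first-isolate selection. It is not the package's implementation. The data frame isolates, its column names, and the helper select_first_isolate are hypothetical. The handling of patients who never carry the trait (their earliest isolate is kept) is an assumption.

```r
# Hypothetical example data: one row per isolate.
isolates <- data.frame(
  patient_id   = c("P1", "P1", "P2", "P2", "P3"),
  culture_date = as.Date(c("2020-03-01", "2020-01-15", "2020-02-10",
                           "2020-02-01", "2020-04-05")),
  trait        = c(1, 1, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): keep one isolate per patient.
# Assumption: trait is coded 0/1, and patients without the trait keep their
# earliest isolate overall.
select_first_isolate <- function(df, patient_id, culture_date, trait) {
  # Sort by patient, then trait-positive isolates first, then earliest date.
  ord <- order(df[[patient_id]], -df[[trait]], df[[culture_date]])
  df <- df[ord, , drop = FALSE]
  # After sorting, the first row seen for each patient is the one to keep.
  df[!duplicated(df[[patient_id]]), , drop = FALSE]
}

first_isolates <- select_first_isolate(
  isolates, "patient_id", "culture_date", "trait"
)
print(first_isolates)
```

Sorting once and dropping duplicated patient identifiers keeps the sketch dependency-free; a grouped operation in a data-manipulation package would work equally well.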

Value

List containing the datasets, the univariable results, and the multivariable results (if requested).

Details

Alongside univariable regression, this implementation includes the following multivariable options:

1. Multivariable: Standard multivariable regression of all variables.
2. pvalue: A p-value informed logistic regression.
3. AIC: Stepwise AIC.
4. Purposeful selection: Iterative model selection accounting for p-value and confounding (Hosmer & Lemeshow 2000).
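The building blocks behind these options can be sketched with base R's stats functions. This is a rough illustration under stated assumptions, not the package's code. The simulated data, the variable names x1, x2, and x3, and the use of Wald p-values are all assumptions, as is the reading of a confounding threshold of 0.2 as a 20% relative change in the coefficient.

```r
# Simulated, hypothetical data: a binary trait and three exposures.
set.seed(1)
n <- 300
dat <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rbinom(n, 1, 0.5)
)
dat$trait <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$x1 + 0.8 * dat$x3))
exposures <- c("x1", "x2", "x3")

# Univariable logistic regression: one model per exposure, keep the Wald p-value.
univariable_p <- sapply(exposures, function(v) {
  fit <- glm(reformulate(v, response = "trait"), data = dat, family = binomial())
  coef(summary(fit))[v, "Pr(>|z|)"]
})

# Entry screen: exposures whose univariable p-value is below the entry threshold
# become candidates for the multivariable model.
entry_criteria <- 0.2
candidates <- names(univariable_p)[univariable_p < entry_criteria]

# Standard multivariable regression of all exposures.
full_fit <- glm(reformulate(exposures, response = "trait"),
                data = dat, family = binomial())

# Stepwise AIC selection starting from the full model (see stats::step).
aic_fit <- step(full_fit, direction = "both", trace = 0)

# Change-in-estimate check: does dropping x3 shift the x1 coefficient by more
# than the confounding threshold (assumed here to be a proportion)?
confounding_criteria <- 0.2
beta_with    <- coef(glm(trait ~ x1 + x3, data = dat, family = binomial()))[["x1"]]
beta_without <- coef(glm(trait ~ x1, data = dat, family = binomial()))[["x1"]]
relative_change <- abs(beta_without - beta_with) / abs(beta_with)
is_confounder <- relative_change > confounding_criteria

print(round(univariable_p, 4))
print(candidates)
print(formula(aic_fit))
print(relative_change)
```

Purposeful selection combines these pieces iteratively: screen on the entry p-value, drop variables that fail the retention p-value, and keep any dropped variable whose removal changes the remaining estimates by more than the confounding threshold.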