Simulation and Extrapolation for Measurement Error Correction

simexreg corrects a fitted regression object for a variable measured with error, using the simulation-extrapolation (SIMEX) method of Cook and Stefanski (1994) and the misclassification SIMEX of Küchenhoff et al. (2006).

simexreg(
  reg = NULL,
  formula = NULL,
  data = NULL,
  weights = NULL,
  MEvariable = NULL,
  MEvartype = NULL,
  MEerror = NULL,
  variance = FALSE,
  lambda = c(0.5, 1, 1.5, 2),
  B = 200
)

# S3 method for class 'simexreg'
coef(object, ...)

# S3 method for class 'simexreg'
vcov(object, ...)

# S3 method for class 'simexreg'
sigma(object, ...)

# S3 method for class 'simexreg'
formula(x, ...)

# S3 method for class 'simexreg'
family(object, ...)

# S3 method for class 'simexreg'
predict(object, ...)

# S3 method for class 'simexreg'
model.frame(formula, ...)

# S3 method for class 'simexreg'
print(x, ...)

# S3 method for class 'simexreg'
summary(object, ...)

# S3 method for class 'summary.simexreg'
print(x, digits = 4, ...)

# S3 method for class 'simexreg'
update(object, ..., evaluate = TRUE)

Arguments

reg: naive regression object. See Details.
formula: regression formula
data: new dataset for reg
weights: new weights for reg
MEvariable: variable measured with error
MEvartype: type of the variable measured with error, either continuous or categorical (the first three letters are enough, e.g. "con" or "cat").
MEerror: the standard deviation of the measurement error (when MEvartype is continuous) or the misclassification matrix (when MEvartype is categorical).
variance: a logical value. If TRUE, the variance-covariance matrix of the coefficients is estimated by the jackknife (Stefanski and Cook, 1995). Default is FALSE.
lambda: a vector of lambdas for SIMEX. Default is c(0.5, 1, 1.5, 2).
B: number of simulations for SIMEX. Default is 200.
object: an object of class simexreg
...: additional arguments
x: an object of class simexreg
digits: minimal number of significant digits. See print.default.
evaluate: a logical value. If TRUE, the updated call is evaluated. Default is TRUE.
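
For a categorical MEvariable, MEerror is a misclassification matrix. The sketch below builds one and checks it, assuming the convention used in the glm example on this page: column j holds the probabilities of each observed category given true category j, so each column sums to 1. The names x_true and x_obs are hypothetical and only serve the illustration.

```r
# Hypothetical 3-level misclassification matrix.
# Assumed convention: MEerror[i, j] = P(observed = i | true = j),
# so every column must sum to 1.
MEerror <- matrix(c(0.80, 0.10, 0.10,
                    0.20, 0.70, 0.10,
                    0.05, 0.25, 0.70), nrow = 3)

# Basic validity checks before passing the matrix on
stopifnot(all(MEerror >= 0),
          isTRUE(all.equal(colSums(MEerror), rep(1, 3))))

# Simulate observed categories from true categories,
# drawing each observation from the column of its true category
set.seed(1)
x_true <- sample(1:3, size = 1000, prob = c(0.2, 0.3, 0.5), replace = TRUE)
x_obs <- vapply(x_true,
                function(j) sample(1:3, size = 1, prob = MEerror[, j]),
                integer(1))

# Cross-tabulate to inspect the realized misclassification
table(observed = x_obs, true = x_true)
```

Checking the column sums first catches a matrix that was accidentally filled by row rather than by column.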

Value

If MEvariable is not in the regression formula, reg is returned. If MEvariable is in the regression formula, an object of class simexreg is returned:

call: the function call,
NAIVEreg: the naive regression object,
ME: a list of MEvariable, MEvartype, MEerror, variance, lambda and B,
RCcoef: coefficient estimates corrected by SIMEX,
RCsigma: the residual standard deviation of a linear regression object corrected by SIMEX,
RCvcov: the var-cov matrix of coefficients corrected by SIMEX,

...

Details

reg may be a regression object fitted by lm, glm (with family gaussian, binomial or poisson), multinom, polr, coxph or survreg.
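
To illustrate the roles of lambda and B, here is a minimal base-R sketch of the general SIMEX idea for one continuous error-prone covariate in a linear model. It is not the implementation used by simexreg: the function simex_sketch, its arguments, and the choice of a quadratic extrapolant are assumptions made for this illustration.

```r
# Minimal SIMEX sketch (hypothetical helper, not part of CMAverse).
# y: outcome, w: covariate observed with error, sd_me: known error SD.
simex_sketch <- function(y, w, sd_me, lambda = c(0.5, 1, 1.5, 2), B = 200) {
  # Simulation step: for each lambda, add extra noise with variance
  # lambda * sd_me^2, refit, and average the coefficients over B replicates
  sim_coef <- sapply(lambda, function(l) {
    rowMeans(replicate(B, {
      w_b <- w + sqrt(l) * rnorm(length(w), mean = 0, sd = sd_me)
      coef(lm(y ~ w_b))
    }))
  })
  # Include the naive fit as the lambda = 0 point
  all_lambda <- c(0, lambda)
  all_coef <- cbind(coef(lm(y ~ w)), sim_coef)
  # Extrapolation step: fit a quadratic in lambda to each coefficient
  # and evaluate it at lambda = -1, i.e. no measurement error
  apply(all_coef, 1, function(est) {
    fit <- lm(est ~ all_lambda + I(all_lambda^2))
    unname(predict(fit, newdata = data.frame(all_lambda = -1)))
  })
}

# Usage on simulated data where the true slope is 4
set.seed(1)
n <- 2000
x <- rnorm(n, mean = 2, sd = 1)
w <- x + rnorm(n, mean = 0, sd = 0.5)
y <- 1 + 4 * x + rnorm(n, mean = 0, sd = 2)
naive <- coef(lm(y ~ w))
corrected <- simex_sketch(y, w, sd_me = 0.5, B = 50)
rbind(naive = naive, corrected = corrected)
```

The naive slope is attenuated toward zero, and the extrapolated slope should land closer to the true value, although a quadratic extrapolant typically removes only part of the bias.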

Methods (by generic)

coef(simexreg): Extract coefficients corrected by simexreg
vcov(simexreg): Extract the var-cov matrix of coefficients corrected by simexreg
sigma(simexreg): Extract the residual standard deviation of a linear regression object corrected by simexreg
formula(simexreg): Extract the regression formula
family(simexreg): Extract the family of a regression of class lm or glm
predict(simexreg): Predict with new data
model.frame(simexreg): Extract the model frame
print(simexreg): Print results of simexreg nicely
summary(simexreg): Summarize results of simexreg nicely
update(simexreg): Update simexreg

Functions

print(summary.simexreg): Print summary of simexreg nicely

References

Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu C (2006). Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition. London: Chapman & Hall.

Cook JR, Stefanski LA (1994). Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association. 89(428): 1314-1328.

Küchenhoff H, Mwalili SM, Lesaffre E (2006). A general method for dealing with misclassification in regression: the misclassification SIMEX. Biometrics. 62(1): 85-96.

Stefanski LA, Cook JR (1995). Simulation-extrapolation: the measurement error jackknife. Journal of the American Statistical Association. 90(432): 1247-1256.

Examples


if (FALSE) { # \dontrun{
rm(list=ls())
library(CMAverse)

# lm
n <- 1000
x1 <- rnorm(n, mean = 5, sd = 3)
x2_true <- rnorm(n, mean = 2, sd = 1)
error1 <- rnorm(n, mean = 0, sd = 0.5)
x2_error <- x2_true + error1
x3 <- rbinom(n, size = 1, prob = 0.4)
y <- 1 + 2 * x1 + 4 * x2_true + 2 * x3  + rnorm(n, mean = 0, sd = 2)
data <- data.frame(x1 = x1, x2_true = x2_true, x2_error = x2_error,
                   x3 = x3, y = y)
reg_naive <- lm(y ~ x1 + x2_error + x3, data = data)
reg_true <- lm(y ~ x1 + x2_true + x3, data = data)
reg_simex <- simexreg(reg = reg_naive, data = data,
                      MEvariable = "x2_error", MEvartype = "con",
                      MEerror = 0.5, variance = TRUE)
coef(reg_simex)
vcov(reg_simex)
sigma(reg_simex)
formula(reg_simex)
family(reg_simex)
predict(reg_simex, newdata = data[1, ])
reg_simex_model <- model.frame(reg_simex)
reg_simex_update <- update(reg_simex, data = data, weights = rep(1, n))
reg_simex_summ <- summary(reg_simex)
                
# glm
n <- 1000
x1 <- rnorm(n, mean = 5, sd = 3)
x2_true <- sample(x = c(1:3), size = n, prob = c(0.2,0.3,0.5), replace = TRUE)
MEerror <- matrix(c(0.8,0.1,0.1,0.2,0.7,0.1,0.05,0.25,0.7), nrow = 3)
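# Misclassify each true category j using column j of MEerror.
# Index on x2_true so values already reassigned are not resampled.
x2_error <- x2_true
for (j in 1:3) {
  idx <- which(x2_true == j)
  x2_error[idx] <- sample(x = c(1:3), size = length(idx),
                          prob = MEerror[, j], replace = TRUE)
}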
x2_true <- as.factor(x2_true)
x2_error <- as.factor(x2_error)
x3 <- rnorm(n, mean = 2, sd = 1)
linearpred <- 1 + 0.3 * x1 - 1.5*(x2_true == 2) - 2.5*(x2_true == 3) - 0.2 * x3
py <- exp(linearpred) / (1 + exp(linearpred))
y <- rbinom(n, size = 1, prob = py)
data <- data.frame(x1 = x1, x2_true = x2_true, x2_error = x2_error,
                   x3 = x3, y = y)
reg_naive <- glm(y ~ x1 + x2_error + x3, data = data, family = binomial("logit"))
reg_true <- glm(y ~ x1 + x2_true + x3, data = data, family = binomial("logit"))
reg_simex <- simexreg(reg = reg_naive, data = data,
                      MEvariable = "x2_error", MEvartype = "cat",
                      MEerror = MEerror, variance = TRUE)
} # }