R Regression imputation on missing data

Question:

Hi! I’m trying to apply regression imputation to the missing values of the dataset ‘chmiss’ from the package ‘faraway’, but my code fails when it fits the regression on the data frame while dropping columns at the same time. Could anyone help me correct the code?

X <- chmiss
for(j in c(1:4,6)){
     new_Y <- X[,j]
     new_X <- X[,c(-j,-5)]
     new_XY <- cbind(new_X,new_Y)
     temp_lm <- lm(new_Y~.,data=new_XY)
     X[is.na(new_Y),j] <- predict(temp_lm,new_X[is.na(new_Y),c(-j,-5)])
}

Answers:

Answer 1:

Try this:

library(faraway)
data(chmiss)
X <- chmiss
for(j in c(1:4,6)){
  new_Y <- X[,j]
  new_X <- X[,c(-j,-5)]
  new_XY <- cbind(new_X,new_Y)
  temp_lm <- lm(new_Y~.,data=new_XY)
  X[is.na(new_Y),j] <- predict(temp_lm,new_X[is.na(new_Y),]) ## difference here
}

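You already remove columns j and 5 with c(-j,-5) when you create new_X. Applying c(-j,-5) again inside the predict call indexes into the reduced data frame, so it drops predictors the model still needs instead of the columns you meant to exclude. Subsetting only the rows, new_X[is.na(new_Y),], keeps every predictor available.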
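One way to avoid this kind of positional mix-up is to select predictors by name instead of by negative index. The following is only a sketch under stated assumptions, not part of the original answer: the helper impute_by_regression and the toy data frame are hypothetical, and the column names are assumed to be plain syntactic names so that reformulate() can build the formula. Like the loop above, it imputes columns one at a time, so later fits use values imputed earlier.

```r
# Hypothetical helper (not from the original answer): regression imputation
# that picks predictors by column name, so nothing is dropped twice.
impute_by_regression <- function(df, targets, exclude = character(0)) {
  out <- df
  for (target in targets) {
    # Every column except the target and the excluded ones is a predictor.
    predictors <- setdiff(names(df), c(target, exclude))
    fit <- lm(reformulate(predictors, response = target), data = out)
    missing_rows <- is.na(out[[target]])
    if (any(missing_rows)) {
      # Subset rows only; the predictor columns are selected by name.
      out[missing_rows, target] <-
        predict(fit, newdata = out[missing_rows, predictors, drop = FALSE])
    }
  }
  out
}

# Toy data standing in for chmiss (values are made up for illustration).
toy <- data.frame(
  x    = c(1, 2, 3, 4, 5, 6),
  y    = c(2.1, NA, 6.2, 7.9, 10.1, 12.0),
  z    = c(5, 3, NA, 8, 1, 4),
  skip = c(0, 1, 0, 1, 0, 1)
)
filled <- impute_by_regression(toy, targets = c("y", "z"), exclude = "skip")
```

If a row with a missing target is also missing one of its predictors, predict() returns NA for that row, so some gaps can remain after one pass.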

Original source:

https://stackoverflow.com/questions/47756124/r-regression-imputation-on-missing-data
